Synchronizing Files On Multiple Computers Under Linux

I am a Knowledge Geek.  I like collecting knowledge, searching knowledge, and organizing knowledge.  If you read my post on my recent cellphone research, you know what I mean.  I take notes at most meetings and conferences I go to.  That often turns out to be very advantageous, especially on longer-term projects where it can be important to find out when a particular decision was made, by whom, and why.  I’ve always named the files starting with the group/project/company name, followed by a YYYYMMDD timestamp, and optionally a topic after that, so finding things isn’t too hard.  The larger problem I started facing recently, though, is that I have been taking notes on multiple computers.  I needed a way of making sure I had access to at least some of these notes when and where I need them.

Up until fairly recently, I had my server and my laptop.  All notes were taken on the laptop (and backed up to an external USB drive).  Then I got an iPhone, and found note-taking on that quite practical (using QuickOffice).  Then I got a netbook (Dell Mini 10), and started using that for meetings (after I got the netbook, I didn’t use the iPhone for note-taking very much).  The end result was that these meeting notes were not where I needed them.  I needed a way of synchronizing these notes between computers.

All three computers (the server, the laptop, and the netbook) run Linux (of course).  The usual tool to transfer files between *NIX boxes in a way that can be automated is scp, which is like running cp over ssh (hence the name).  It works well, but it has even fewer options than cp.
Next up is rsync.  Rsync can use several different transport mechanisms (including ssh), and is a lot more flexible on the file selection and transfer side too.  I tried rsync, but its main limitation is that it’s one-way: it can copy new files and update changed files, but only from a source to a destination.  To sync in both directions, one would have to run rsync twice, once each way, and that can be problematic due to time differences between machines.
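
To make the two-pass idea concrete, here is a minimal sketch.  The function name two_way_rsync is made up for illustration, and it is not a replacement for a real synchronizer.  It relies on rsync's -u option, which skips any file that is newer on the receiving side, and that is exactly where clock differences between machines cause trouble.

```shell
#!/bin/sh
# two_way_rsync: hypothetical helper that runs rsync once in each direction.
# Arguments: a local directory and a second directory, which may be a
# remote location in rsync's host:path form.
two_way_rsync() {
    first=$1
    second=$2
    # -a preserves modification times and permissions.
    # -u skips files that are newer on the receiving side, so the outcome
    #    depends on both machines agreeing about the time.
    rsync -au "$first/" "$second/" &&
    rsync -au "$second/" "$first/"
}
```

A call might look like two_way_rsync "$HOME/Documents/notes" david@janus:Documents/notes (example host and path).  Note that deletions are never propagated this way: a file deleted on one side is simply copied back from the other on the next run.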

Further research turned up unison.  Unison is like rsync, but bidirectional.  It has many other strengths:

  • It lends itself to greater automation because it is designed to synchronize any number of directories in one call, as specified by a configuration file.  One can have multiple configuration files, and specify which one to use on the command line.
  • It can be run interactively to verify its proposed actions, or completely automated.
  • It can make backup copies of files.  It can even keep a set number of backups.
  • In the case that a file has changed in more than one place, it can copy one over the other, or attempt to merge the files.  In interactive mode, it can display the differences between the files before merging or overwriting.
  • It can be used to synchronize multiple machines by configuring all machines to synchronize with the same machine (though it must be executed multiple times for this to work).
  • Most command-line options can also be specified in the configuration file.
  • It is actively maintained (a very important attribute of an open source project).
  • It comes with excellent documentation.
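
As a rough illustration of the automation and backup points above, these are the kind of lines one might add to a Unison configuration file.  This is a sketch rather than my own setup: the values are arbitrary examples, and the preference names should be checked against the manual for the installed version.

```text
# Run without asking questions; conflicting changes are skipped
batch = true
# Keep backup copies of every file that gets overwritten
backup = Name *
# Keep up to three old versions of each file
maxbackups = 3
```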

The one restriction that may not work in all environments is that unison expects to mirror the same directory structure on both machines (under different roots, of course).  This is evident in the structure of the configuration file.  Here’s mine:

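# Unison preferences file to sync notes with Janus
# The two roots: the local home directory and the same directory on janus, reached over ssh
root = /home/david/
root = ssh://david@janus//home/david/
# Paths to synchronize, relative to both roots
path = Documents/notes
path = Documents/queue
path = Documents/research
path = .jedit/macros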

This configuration file (which is identical on lexa, my laptop, and minime, my netbook) syncs the important documents (and my jEdit macros) under the specified directories with an identical directory structure on janus, one of the servers I run.  Since both lexa and minime sync with janus, lexa and minime end up synced with each other.  Having to run unison multiple times to accomplish this is not a problem for me, since I’ve established the habit of syncing whenever I am about to take a laptop out, and again when I bring it back from taking notes.  Missing the sync before I leave is not a problem either, since I can sync with janus from anywhere with an internet connection.
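
For completeness, here is a sketch of how such a sync might be started from a shell.  It assumes the configuration is saved as a profile called notes (that is, notes.prf in ~/.unison); the profile name, the sync_notes function, and the UNISON_BIN override are all made up for this example.

```shell
#!/bin/sh
# sync_notes: hypothetical wrapper that runs Unison with a named profile.
# The first argument is the profile name (default: notes); any further
# arguments, such as -batch, are passed on to Unison unchanged.
sync_notes() {
    profile=${1:-notes}
    [ $# -gt 0 ] && shift
    # UNISON_BIN is an assumed override so a different binary can be used.
    "${UNISON_BIN:-unison}" "$profile" "$@"
}
```

Running sync_notes on its own starts an interactive sync, while sync_notes notes -batch runs the same profile without asking questions.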
