Blogtastic

Mike Roberts on Life and Technology


Wikis and Source Control (with Fitnesse, Subversion and CruiseControl.NET)

Created 30 December 2005

[ Tech / Tools ]

Wikis are a great tool for keeping loosely structured documentation that a whole team can update. They offer a very low barrier to entry, both in terms of learning how to use them and in terms of infrastructure (no software to install, no shared document strategy to invent, that kind of thing).

One problem with wikis, though, is that they typically sit outside a team's source control environment. This is bad because teams like the versioning functionality source control offers, and also because it means you have another repository of knowledge that needs to be backed up. Typically I've just made do with these problems, but on my current project we are using a wiki to host our automated acceptance test scripts. We're using Fitnesse, but that doesn't really matter for the sake of this article. What does matter is that we definitely want our automated acceptance tests under source control, since they are closely tied to the state of our source code, so we need to figure out how to get our wiki working with source control.

Our approach to solving this is as follows:


  • We check our entire wiki (application and data) into a single folder in source control. Since our acceptance tests are tied to the source code, we check the wiki into the same branch as the source code.

  • Developers and QAs who need to edit the acceptance tests do so on a locally checked-out version of the wiki. They run the wiki app locally and edit it through the web-app as normal, just on their local version.

  • The application build script refers to the acceptance tests defined in the wiki. Once any acceptance test changes are complete, a local build is run and then the wiki updates are checked in to source control.

  • At that point a build is kicked off on the Continuous Integration server. It updates both its local copy of source code and wiki files, validating the current state of source and tests.

OK, so far so good: we define our Acceptance Tests in a wiki, and it fits into source control and our automated build in the normal way. As a nice side effect people can read and edit the Wiki offline. But we've lost something. We said at the top that there's a low infrastructural barrier to entry in using a Wiki, and by introducing a source control environment that's no longer the case. Is there a way we can have the ease of use of a 'shared' wiki but still keep it in source control? Yes, and here's how we do it:


  • Check out the wiki from source control to a machine, and run the wiki on this machine as a shared wiki instance.

  • Install a Continuous Integration (CI) tool on the same machine that is hosting your shared Wiki instance.

  • Set up a CI project that monitors for changes in both the local file system where your wiki files are stored and the folder in source control where these changes are checked in.

  • If changes occur in either of these locations, synchronise the wiki data by both updating from the source control server and committing any local changes (which were themselves made by people using the shared wiki instance).

There are a few gotchas though to be aware of:


  • You need to use a stateless wiki. If your wiki caches edits and/or data read from the filesystem you'll need to figure out a way of stopping it doing this, or a way of flushing its caches. Fitnesse thankfully is a stateless wiki so we didn't have this problem.

  • Some wikis can be quite 'chatty' about what they save to the file system. The version of Fitnesse that we are using, out of the box, edits one file as soon as anyone even looks at a page, which isn't great for use with source control. Many wikis also have their own in-built source control system of sorts, which we no longer need. In short, take time figuring out how to turn off the features of your wiki that aren't required in a source controlled environment.

  • Conflicts between users are going to be an issue, and our team is still figuring out a complete set of practices. The biggest tool for this is to make sure people using locally checked-out copies of the wiki update and commit very often, and that the time between updates of the shared wiki is short (we have ours set to 15 seconds). We also have a rule that any wiki refactorings (e.g. page renames) are only done on locally checked-out versions, not on the shared instance.

  • The 'Recent Changes' file is always becoming conflicted but since it is just a generated page we tend to let 'local version win' in this case. We'll probably write some small batch scripts that make things easier for people.

  • Changes to Acceptance Test Fit Tables are only made on local wikis since they are part of the project's buildable source files and so people should be running these changes locally before checking in. Similarly, Fit tests can't actually be executed on the shared wiki - we decided it was too unclear what version of the code should actually be executed when tests were run in such a way.

Right, less talk, more code! :) How did we actually implement all of this? In terms of environments we are using Subversion for source control, Fitnesse as our Wiki and CruiseControl.NET as our Wiki Update tool. For Fitnesse we updated the launch script as follows:


"%JAVA_HOME%\bin\java" -cp fitnesse.jar fitnesse.FitNesse -o -e 0 -p 8888

The -o and -e 0 options suppress unnecessary file updates and in-built source control. We also deleted all the .zip files which already existed (these are just for source control).

For the CruiseControl.NET project we have the following configuration (this is for CCNet 1.0):


<project name="Wiki Sync">
  <workingDirectory>c:\sourcecontrol\wiki-trunk\wiki</workingDirectory>
  <triggers>
    <intervalTrigger seconds="15" />
  </triggers>
  <sourcecontrol type="multi">
    <sourceControls>
      <svn>
        <trunkUrl>svn://oursvnserver/ourproject/trunk/wiki</trunkUrl>
        <workingDirectory>c:\sourcecontrol\wiki-trunk</workingDirectory>
        <autoGetSource>true</autoGetSource>
      </svn>
      <filtered>
        <sourceControlProvider type="filesystem">
          <repositoryRoot>c:\sourcecontrol\wiki-trunk\wiki</repositoryRoot>
        </sourceControlProvider>
        <exclusionFilters>
          <pathFilter><pattern>**\.svn\*</pattern></pathFilter>
        </exclusionFilters>
      </filtered>
    </sourceControls>
  </sourcecontrol>
  <tasks>
    <exec executable="sync.cmd" />
  </tasks>
</project>

This is a little complicated, but basically it means that 'sync.cmd' is called if changes occur either in our Subversion copy of the wiki or in the local version, ignoring any of Subversion's own local files. Fitnesse is actually run as a service from c:\sourcecontrol\wiki-trunk\wiki.

sync.cmd is as follows:


@echo off

rem mark as removed any files that have been deleted
for /f "usebackq tokens=2" %%i in (`"svn st | findstr !"`) do svn rm %%i

rem add new files
svn add --force *.*

rem commit - this won't do anything if nothing is to be committed
svn commit -m "This is an automated Wiki commit"

The 'svn update' is done before all of this by the source control provider in CCNet. We may well update this to automatically handle conflicts (at the moment we have to do it manually by logging into the wiki server.)
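As a rough idea of what that automatic conflict handling might look like, here is a minimal shell sketch (so it would need Cygwin or a UNIX wiki host rather than plain sync.cmd). It is not what we run. It assumes a Subversion client new enough to support 'svn resolve --accept' (1.5 or later), and it assumes the generated page lives under a path containing 'RecentChanges'. Both function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: print the paths that `svn st` reports as conflicted.
# Reads status output on stdin, so it can be tried without a working copy.
conflicted_paths() {
    # A "C" in the first column marks a text conflict. Strip the status
    # columns and keep the rest of the line so paths with spaces survive.
    awk '/^C[ \t]/ { sub(/^C[ \t]+/, ""); print }'
}

# Hypothetical policy: the local version wins, but only for the generated
# RecentChanges page. Anything else is reported for a human to merge.
resolve_generated_pages() {
    conflicted_paths | while IFS= read -r path; do
        case "$path" in
            # Assumption: the generated page's path contains "RecentChanges".
            *RecentChanges*) svn resolve --accept mine-full "$path" ;;
            *) echo "needs manual merge: $path" >&2 ;;
        esac
    done
}

# Usage, from inside the wiki working copy:
#   svn st | resolve_generated_pages
```

Keeping the parsing separate from the 'svn resolve' call means the filter can be checked against pasted status output before it is let loose on the shared wiki.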

OK, let's wrap this up. Wikis are great, writing acceptance tests in a wiki is also great, but acceptance tests should be in source control so we've put our wiki into source control. Through a bit of CI hackery we still have a 'shared' wiki that anyone can edit without having to use source control. It really is a hack though - it would be far cleaner if the shared wiki were actually persisting and reading directly to and from source control.

Kudos to Jeffrey Palermo - he's already done most of this, blogged it and so provided the start to the work we did.


Automated SSH with passwords

Created 23 August 2005

[ Tech / Tools ]

A few months ago, I talked about setting up automated Subversion access using SSH. This is especially important if you are using an automated build server. (Couldn't resist the shameless plug :) )

One requirement of that discussion was that you needed to be using key-based authentication for your SSH access. So what if you're not using keys? This is exactly the situation that has arisen this week with adding a new project to CCNetLive. We want to build a new project, from a new Subversion server, using SSH and password-based authentication, without messing up the SSH configuration for the existing projects on the machine (so no machine-global settings that are specific to one project are allowed). So how to do this?

I went through various attempts at doing this before realising how easy Subversion makes it!

First of all, make sure your new SSH connection is working correctly. In Windows, this means using Putty to connect using your SSH user name and password, and saving the server's key. This is a vital step; otherwise your connection will hang later, because in the background it will be asking you to confirm the identity of the server.

Next, find your user's Subversion config file. On Windows, this is normally in something like C:\Documents and Settings\Your UserName\Application Data\Subversion\. Find the [tunnels] section, and add a line something like:


myprojectssh = c:\\tools\\putty\\plink.exe -l YourSSHUser -pw YourSSHPassword


The myprojectssh is the name of your Subversion scheme and you can use this scheme instead of the normal ssh scheme, so you would use a command something like svn checkout svn+myprojectssh://mysvnhost.com/my/project/root . Notice you don't need to re-specify your user name. Obviously, you should change myprojectssh, YourSSHUser and YourSSHPassword for your setup, as well as the location of plink. The double back-slashes are important - check the note that should be in your Subversion config file for more details.

This Subversion scheme works because the whole -l abc -pw xyz part gets passed through to Plink, and plink understands what -l and -pw mean. If your command line SSH client uses different parameters for users and passwords you should substitute them as necessary.

There are a couple of things to note with all this. Firstly, your SSH credentials are being stored unencrypted in a text file on your machine, so you should make sure your Subversion config file is secured somehow. It may be enough to make sure it's only visible to the individual user, but you might also want to consider using an encrypted disk. Secondly, this solution should only be used where you can't use SSH keys for some reason. Key-based SSH authentication is a far better option, security-wise, than password-based authentication.


Automated SSH authentication on Windows

Created 03 March 2005

[ Tech / Tools ]

I use a few remote UNIX servers. Some host web content, some are Source Control repositories. All of them I access using SSH either for an interactive shell, or as a tunnel for applications like Subversion, CVS or rsync.

A few months ago when I started committing to projects on Codehaus I had to set up an SSH key pair, since they don't allow plain password authentication for their SSH server. This was actually good since I'd been meaning to switch to key-based authentication for a while but hadn't quite got around to it. The main reason for using key pairs is the extra security - you have to 'bring something' (your private key) as well as just 'know something' (either a password, or the passphrase of your private key). However there is an added benefit in using key pairs: once you have set them up in any one 'session' you don't have to keep re-entering a password. (A session here is usually a Windows, or X-Windows, login session.)

It took me a while to get all of that set up though. I wanted to use Putty since it was before I'd started using the command line in anger. Putty actually makes this kind of thing pretty easy through its Plink (command-line SSH client), PuTTYgen (key generator) and Pageant (key authentication agent) programs, but the problem was around key formats. Putty by default saves you a public key that won't work on an OpenSSH remote server. After some head banging and half an hour with a friendly CodeHaus Despot or two on an IRC channel, we managed to get it working. The key (haha!) was to do the following in PuTTYgen:


  • Use SSH2 DSA keys

  • Don't use the public key file that is saved, but instead use the contents of the box at the top of the window. Yes, the one that says 'Public key for pasting into OpenSSH authorized_keys file' that I should have used straight away :)


Then it was just a matter of setting up a saved Putty session (including my user name) and adding my key to Pageant. You do have to remember to try to use Putty for an interactive login the first time you connect to a server so that it can save a copy of the server's key locally.

Using Plink works fine for command line Subversion (see my earlier post for my [tunnels] setup), but today I hit a problem using it with rsync. Cygwin's rsync seems to want to use Cygwin's ssh, and Plink just doesn't seem to play ball. 'No problem', I thought, 'UNIX must have an equivalent of Pageant'. Indeed it does - it's called ssh-agent. Using this helpful page I found the required incantation, but hit a problem in that it wouldn't accept the passphrase on my private key. After a couple of minutes I realised that it was another formatting problem which PuTTYgen could solve for me. All up then, being able to use my Putty-generated private key on Cygwin required the following steps:


  • Load private key in PuTTYgen

  • From the Conversions menu, select Export OpenSSH key

  • Save it as a file called id_dsa in the ~/.ssh directory. On my machine that is equivalent to c:\Documents and Settings\mroberts\.ssh\id_dsa

  • Add the following to my ~/.profile file : alias startssh="eval \`ssh-agent\` ; ssh-add"

  • Add the following to my ~/.profile file : alias stopssh="ssh-agent -k"


Now I just run startssh and stopssh around any times I want to do some rsync work. It's not perfect since right now I need to start up ssh-agent for every Cygwin prompt, and I also need to stop it before I exit the prompt, otherwise the window will hang. There's probably some hackery that can be done using a Windows Service, but I'll save that investigation for another day.
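One way to take some of the manual work out of this, short of a Windows Service, is to let the shell manage the agent itself. The following is a minimal sketch for ~/.profile, not something I've battle-tested. The function names are made up, and whether the exit trap avoids the hanging window on Cygwin is an assumption that would need checking.

```shell
# Sketch for ~/.profile: start ssh-agent at most once per shell and
# stop it automatically when the shell exits.

agent_running() {
    # True when SSH_AGENT_PID names a process we are able to signal.
    [ -n "${SSH_AGENT_PID:-}" ] && kill -0 "$SSH_AGENT_PID" 2>/dev/null
}

start_agent_once() {
    if ! agent_running; then
        # ssh-agent -s prints Bourne-shell commands that export
        # SSH_AUTH_SOCK and SSH_AGENT_PID into this shell.
        eval "$(ssh-agent -s)" >/dev/null
        # Prompts for the passphrase of the default key (e.g. ~/.ssh/id_dsa).
        ssh-add
        # Replaces the manual stopssh: kill the agent when this shell exits.
        trap 'ssh-agent -k >/dev/null' EXIT
    fi
}

# Usage: call start_agent_once before the first rsync of the session.
```

This still runs one agent per Cygwin prompt, so it only removes the need to remember startssh and stopssh. It doesn't share an agent between windows.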


On Subversion

Created 01 March 2005

[ Tech / Tools ]

Updated 04 March 2005

I've recently started using the Subversion Source Control system a lot more. It really has got pretty stable now and I feel happy about telling people that they should consider migrating to Subversion from CVS (or in fact pretty much any other Source Control tool) as soon as they can. Atomic checkins, offline features, scalability, useful and easy tags and branching, and lots of other good stuff all make it a compelling tool.

One thing that has helped Subversion recently is the improved documentation available. An example of this is Mike Mason's new book Pragmatic Version Control Using Subversion. I have to say right now that I'm an old friend and colleague of Mike's and also reviewed the book, so I am completely biased, but I think it's a brilliant tutorial for getting started with Subversion, and a handy reference to have around.

Mike's also recently helped me out with a couple of things that weren't in the book, so I wanted to share them here.

Firstly, if you've been following my blog recently you'll know I've been using Cygwin a lot more. One of the things that Mike does very well is show how powerful and easy the svn command line client is. I was using the standard Windows download of Subversion with Cygwin, but apparently this is a really bad idea. Apart from not using the right kind of paths, you can apparently also get file corruption.

Anyway, this is easy to change - open up the Cygwin installer and choose to use its version of Subversion - it's in the devel folder. Cygwin will automatically update your path so that your Cygwin version of svn is used rather than the previous one you installed. You'll also need to repeat any changes you made to your Subversion config file - the Cygwin version will appear in the ~/.subversion/ directory. If you've already set up Putty, Plink and Pageant to manage your SSH identity this will still work - my config file has one change to support this:


[tunnels]
ssh = $SVN_SSH /c/tools/putty/PLINK.EXE


The second thing is to do with deletes. During the course of a development episode, I might well delete some old files. When I run svn status I will get a bunch of files showing as ! meaning that they have been removed by something other than a Subversion request. In fact I'll get output a bit like this:


$ svn st
A src/Core/Generators/NewGenerator.cs
! src/Core/Generators/StandardDotNetUpperCaseGuidGenerator.cs
! src/Core/Generators/IGuidGenerator.cs


Typically I then manually run an svn rm command for each of these, but this can get a little tedious after you've done it once. So since I'm using Cygwin, I can use some script to get the shell to do it automatically for me:


$ svn st | awk '$1 == "!"' | cut -c 2- | xargs svn rm
D src/Core/Generators/StandardDotNetUpperCaseGuidGenerator.cs
D src/Core/Generators/IGuidGenerator.cs


Now I don't want to have to remember to do this every time, so I add this to the .profile file in my Cygwin home directory:


alias svnrmdeleted='svn st | awk "\$1 == \"!\"" | cut -c 2- | xargs svn rm'


Now to remove from Subversion all deleted files I just type svnrmdeleted .

Update - Matt Ryall came back with the update to the above using awk rather than grep. (Otherwise you might end up deleting files with actual "!"'s in the name.) He also grappled with the required escaping to get the alias to work. (Thanks Matt!)
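One case the pipeline above still doesn't cover is a path containing spaces, since xargs splits its input on whitespace. Here is a minimal sketch of a variant that keeps each path whole. The function names are my own invention, and it assumes each status line is a '!' followed by whitespace and then the path.

```shell
#!/bin/sh
# Sketch only: list the paths that `svn st` marks as missing ("!").
# Reads status output on stdin, so it can be tried on pasted output.
missing_paths() {
    # Match "!" in the first column only, then strip the status columns
    # and keep the rest of the line intact, spaces included.
    awk '/^![ \t]/ { sub(/^![ \t]+/, ""); print }'
}

# Mark every missing file as removed, one path per `svn rm` call.
svn_rm_missing() {
    svn st | missing_paths | while IFS= read -r path; do
        svn rm "$path"
    done
}

# Usage, from inside a working copy:
#   svn_rm_missing
```

Reading one line at a time costs an svn invocation per file, which is slower than xargs but fine for the handful of deletes in a typical change.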

Update - Dan Bodart has since told me how to do this in plain Windows shell (Thanks Dan!):


  • for /f "usebackq tokens=2" %i in (`"svn st | findstr !"`) do svn rm %i


So, in summary - go use Subversion! Even if you are mostly a Windows user then try using its command line and ask some friendly UNIX- (or Windows Script-) savvy people how to automate some things you do often.


Confluence - A Wiki on steroids

Created 21 March 2004

[ Tech / Tools ]

Those Atlassian boys have done it again. First I became a fan of Jira, and now it's looking good for Confluence too.

Confluence is at its most basic another wiki clone, but it's a nice-looking, easily edited one at that. After a few minutes using it though you realise why it's going to blow all other wikis out of the water, with features like:

- arbitrary page hierarchies
- exports (of whatever selection of pages you want) to pdf and html
- separated 'spaces' for topic consolidation and security
- rss feeds for various events

But then after a few minutes more you suddenly get the 'wow factor' when you start using its dynamic macros - inlining things like

- search output, child page lists, etc.
- arbitrary RSS feeds
- Jira issue reports

... etc. They've even done their own implementation of Fit called FatCow which you can just use like any other Macro. You can write your own macros too, just to add insane power to the application.

It's not perfect yet, but for a 1.0 release it has a very ambitious feature set so I'll let them off. Things that could be better:

- RSS and blogging features need some work - these are powerful things, but really they could be better (e.g. blogging in a context smaller than a space, better (or maybe more obvious to use) RSS outputs)
- Usability is good, but it still takes a while to get going as a new user. Jira's got better on this front, so I expect later versions of Confluence will be easier for newbies too.

We're using it already for CruiseControl.NET - check out our space here.

