Sharp Blue: Source tools




Typographically, source code is pretty ugly stuff. Almost every time I’ve seen program listings in books, they’ve been plain ASCII text, because that’s how programs are usually stored. People just switch to a monospaced font and paste in their code, and it looks horrible even if they’ve stuck to nice coding conventions. Until I read Steve McConnell’s Code Complete I hadn’t given any thought to an alternative (even though I use GoldEd, an editor with some quite sophisticated syntax parsing features). Then, while I was reading that book, one of the illustrations changed my thinking: a source listing with some proper typographic thought put into it. The comments weren’t demarcated by language features but were set in little grey boxes. Font sizes and weights were used throughout to make syntactic features clear. There were even carefully positioned horizontal lines to enhance readability. None of these things can have been hard to implement, so why is it so hard to find tools to, say, format PHP source code nicely using LaTeX? Am I really going to have to write my own LaTeX output module for Beautifier?

This lack of decent typographic formats for source code is only one aspect of a larger void in the landscape of programming tools. It seems to me that much of programming is an attempt to communicate an understanding of code from one programmer to others. On large projects, most of the programming time isn’t taken up by the initial coding but by debugging and maintenance, and much of that later coding will be done by people other than the initial developers. A consequence of this is that there are a lot of people out there trying to understand code that they didn’t write, and there’s very little in the way of support tools to help them. Their best hope is that the people who first wrote the code went out of their way to make it easier to understand by choosing sensible variable and function names, documenting design decisions, commenting on intent and so forth. Even then, understanding small parts of large programs is not a straightforward task. There are plenty of debugging and profiling tools out there, but where are the code visualisation tools? (I know that the structure of the average program is a hard thing to display, but I’m sure we can do better than linear source listings or UML.) And why isn’t there a common XML source format which can be transformed to XHTML/CSS or whatever for display? (Maybe there are such things but I haven’t stumbled across them yet!)

A related idea is that it might be nice to have “overlays” in source displays that convey meta-information by changing typographic characteristics such as brightness. For example, it would be nice to be able to scroll through a source listing with a ChangeRate overlay and have one’s attention drawn to those parts of the code that have changed most often (and which are therefore likely to be bad). Being able to see the age of code in a similar way would be pleasant too. Both of these things, of course, would require a much tighter integration of editors and source management tools than is usual.
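A minimal sketch of what a ChangeRate overlay might look like, assuming the source management tool can already supply a per-line count of how often each line has changed. The function name `change_rate_overlay` and the `change_counts` input are hypothetical, and the output is plain HTML in which frequently changed lines are drawn brighter against a dark background:

```python
from html import escape


def change_rate_overlay(lines, change_counts):
    """Render source lines as HTML, with brighter text for lines that changed more often.

    change_counts[i] is a hypothetical count of how many revisions touched lines[i];
    a real tool would have to obtain it from the source management system.
    """
    if len(lines) != len(change_counts):
        raise ValueError("need exactly one change count per line")
    peak = max(change_counts, default=0) or 1  # avoid dividing by zero
    rendered = []
    for text, count in zip(lines, change_counts):
        # Map 0..peak onto grey levels 96..255 (dim to bright).
        level = 96 + round(159 * count / peak)
        colour = f"#{level:02x}{level:02x}{level:02x}"
        rendered.append(f'<span style="color:{colour}">{escape(text)}</span>')
    return '<pre style="background:#000">' + "\n".join(rendered) + "</pre>"


print(change_rate_overlay(["x = load()", "y = fix(x)", "save(y)"], [1, 7, 0]))
```

The same mapping would work for an age overlay by passing in days since the last change instead of change counts.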

I'm not an XML nut, but I do think that XML is the right way to go about this, because:

- It's the metastandard that everybody's using, so there are lots of tools we can use.

- Both XML and programming languages are based upon context-free grammars, so a large chunk of the work has already been done.

As far as I know it would be relatively straightforward to take a formal definition of a programming language's syntax (in Backus-Naur Form), and convert it into an XML definition. Then create a style sheet for viewing the XML file in a web browser, and we've created ourselves a source code viewer!
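To give a rough feel for the source-as-XML half of this, here is a sketch that takes a shortcut: rather than deriving an XML definition from a BNF grammar, it leans on the parser in Python's standard `ast` module and dumps the resulting syntax tree as XML. The element and attribute names simply mirror Python's own AST node and field names, and the helper names `node_to_xml` and `source_to_xml` are made up for this example:

```python
import ast
import xml.etree.ElementTree as ET


def node_to_xml(node):
    """Recursively convert a Python AST node into an XML element."""
    element = ET.Element(type(node).__name__)
    for field, value in ast.iter_fields(node):
        if isinstance(value, ast.AST):
            # A single child node, e.g. the right-hand side of an assignment.
            ET.SubElement(element, field).append(node_to_xml(value))
        elif isinstance(value, list):
            # A sequence of children, e.g. the statements in a function body.
            wrapper = ET.SubElement(element, field)
            for item in value:
                if isinstance(item, ast.AST):
                    wrapper.append(node_to_xml(item))
                elif item is not None:
                    ET.SubElement(wrapper, "item").text = str(item)
        elif value is not None:
            # Leaf data such as identifiers and literal values become attributes.
            element.set(field, str(value))
    return element


def source_to_xml(source):
    """Parse Python source text and return its syntax tree as an XML string."""
    return ET.tostring(node_to_xml(ast.parse(source)), encoding="unicode")


print(source_to_xml("total = price * 2"))
```

A style sheet could then target elements such as `Assign` or `Name` directly. Note that this tree drops the original layout and comments, so a real viewer would need a richer format than this.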

Once that's done, we can have all kinds of fun experimenting with different schemes. One easy win would be to come up with more useful ways of attaching comments to source code than just splicing them in. (Associate comments with clearly delimited blocks of code, and have two kinds of comments: ones that describe what the code does, and ones that describe how the code does it.)
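One way the two kinds of comment could be attached to a delimited block, sketched with Python's standard `xml.etree.ElementTree`. The element names (`block`, `comment`, `code`), the `kind` attribute and both helper functions are invented for illustration and do not come from any existing format:

```python
import xml.etree.ElementTree as ET


def annotated_block(code, what=None, how=None):
    """Wrap a piece of code in a <block> together with its 'what' and 'how' comments."""
    block = ET.Element("block")
    if what is not None:
        ET.SubElement(block, "comment", kind="what").text = what
    if how is not None:
        ET.SubElement(block, "comment", kind="how").text = how
    ET.SubElement(block, "code").text = code
    return block


def comments_of(block, kind):
    """Return the text of every comment of the given kind attached to a block."""
    return [c.text for c in block.findall("comment") if c.get("kind") == kind]


block = annotated_block(
    "total = sum(readings)",
    what="Add up the sensor readings.",
    how="Relies on the built-in sum rather than an explicit loop.",
)
print(ET.tostring(block, encoding="unicode"))
```

Because the comments are separate elements rather than text spliced into the code, a viewer could show only the "what" comments for a quick overview and reveal the "how" comments on demand.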

It strikes me that it would be a relatively straightforward task for somebody familiar with XML tools. Any takers?

The comments thing would be very easy because there's already a means of associating metadata with XML documents and expressing relationships between such documents (or subsets thereof): the Resource Description Framework, part of the coming Semantic Web. I imagine that it would even be possible to extend the Semantic Web to cover the full edit history of every program and to use RDF to express changes in programs and other relationships between older and newer versions.

Have you tried following your GoldEd link lately? There have been one or two changes to the site...
