Spawn of the Devil?
With the move to creating even desktop applications using a markup language – XAML – perhaps it’s time to reconsider our love for markup. Mike James has some interesting thoughts on markup gone bad.
By Mike James
Published: 30 November 2006
Markup languages within almost any application architecture are the spawn of the Devil. In the right hands and with extreme care they can be tamed, but in many instances they are disasters simply waiting to be let out of confinement to do their worst to a design.
When the web was first invented the use of HTML as a markup language was inspired. It was essentially a derivative of SGML (Standard Generalized Markup Language), itself the precursor of XML, the current favourite of the markup world. When it was first used, HTML was suited to the task to which it was put. It was used to format a document and provide some simple dynamic elements – specifically hyperlinks. In this role as a formatting language there is little to dislike in any markup language, and indeed there is much to be admired. However, as a document becomes increasingly dynamic things begin to go wrong, and the relationship between form and function becomes increasingly messy.
The first problems started with the introduction of the HTML form and the use of scripting languages to provide ever-higher levels of user interaction. The markup language describes the user interface in terms of buttons, textboxes, images and so on, and the script defines the behaviour. In other words, the markup is declarative and the script is procedural. In theory this sounds like an ideal architectural separation of concerns – and it would be if the two were truly separate.
The problem is well known to any web programmer, and has been much discussed. The script becomes distributed within and between the markup, making it difficult to see how everything works and making the whole architectural approach brittle. HTML tags assign names to the objects manipulated by the script, and they refer to names defined in the script as event handlers. The script code can be broken up into fragments and even defined within tags. With this free mixing comes a very real difficulty in determining the flow of control, and managing namespaces is next to impossible. The result is that reading through a scripted web page can require a huge effort, and there is no way to be sure that its behaviour is understood.
This is very similar to the situation back in the early days of programming, when the “Goto” or jump command used in a freestyle way produced opaque, brittle code, usually disparaged as “spaghetti code”. Today scripted web pages are similarly opaque and brittle because markup languages impose no structure on code, and code imposes none on markup. As always it is possible to craft something admirable, but the effort required is excessive, and the discipline needed has to be self-imposed. In effect the programmer has to invent a programming methodology which constrains the system to more reasonable behaviour. If the programmer doesn’t exhibit such restraint on the system’s behalf the result is an unmanageable mess.
XML the hero of markup
One of the reasons it is difficult to criticise markup languages per se is the amazing success of XML. This appears to be a logical and clean approach to the description of all manner of data structures. The fact that XML really only encapsulates tree-structured data is something that is often overlooked, as is the fact that it doesn’t do the job particularly efficiently. All in all it’s becoming an increasingly over-elaborated mess in its own right, with the emphasis on the extensible … but it’s still flavour of the month. It seems that XML can do no wrong and as such it finds itself being used for just about everything imaginable. XML is taking over HTML, it’s taking over database storage, it’s taking over all manner of transaction-based protocols – and now it’s taking over the design of desktop user interfaces. Microsoft’s latest .NET Framework 3.0 introduces XAML – eXtensible Application Markup Language – as part of the Windows Presentation Foundation (WPF). Overall WPF is an improvement over the existing Windows API, and offers the prospect of a coherent and hardware-accelerated graphics system, but XAML is a high-profile part of it.
Alternatives to markup?
If you have been over-exposed to web design you might believe that there is no sensible alternative to using a markup language to define a user interface declaratively, but other approaches based on the standard procedural paradigm have proved more than adequate. For example, VB6 provided one of the first drag-and-drop designers that made creating user interfaces easy, and it didn’t use a markup language. Well, this isn’t quite accurate. It recorded the details of the interface using its own internal language, which programmers could see if they listed the saved code, but the important point is that this was never intended to be a language for hand-coding or tinkering with user interfaces. It was simply a way of persisting the details of the interface as created by the designer and the property editor. There was never an intention nor a need to expose the details of the encoding used to persist a layout.
In many ways a more logical approach to the same problem can be seen in Java, and more recently in C# and VB.NET. In this case the user interface is constructed by instantiating appropriate classes and by setting properties in code. That is, the user interface is constructed in a procedural way no different from any other aspect of application coding. However, hand-coding user interfaces via procedural code is hard work, and the clever part is the introduction of drag-and-drop designers that can “round trip” the code, much like UML modelling packages. This means that given procedural code the designer can generate the interface and allow the programmer to modify it, using drag-and-drop or by direct editing of the code. The designer keeps the code and the interface in sync and does its best to generate, and work with, human-written code.
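As a rough illustration of what “instantiating appropriate classes and setting properties” looks like in practice, here is a minimal sketch using Java’s standard Swing toolkit. The class name GreetingForm and its field names are invented for this example and do not come from any particular designer; a real designer would generate something longer, but of the same shape.

```java
import java.awt.FlowLayout;
import javax.swing.JButton;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JTextField;

// Hypothetical example class: a small form built entirely in procedural code.
class GreetingForm {
    final JTextField nameField = new JTextField(20);
    final JButton greetButton = new JButton("Greet");
    final JLabel output = new JLabel(" ");
    final JPanel panel = new JPanel(new FlowLayout());

    GreetingForm() {
        // Properties are set with ordinary method calls.
        greetButton.setToolTipText("Show a greeting");

        // Behaviour is attached in the same language, and in the same
        // place, as the layout: no names cross a markup/script boundary.
        greetButton.addActionListener(
                e -> output.setText("Hello, " + nameField.getText()));

        // The containment hierarchy is built by adding children to a parent.
        panel.add(new JLabel("Name:"));
        panel.add(nameField);
        panel.add(greetButton);
        panel.add(output);
    }
}
```

To display the form, the panel would be added to a top-level window such as a JFrame. The point of the sketch is only that the layout, the property values and the event handler are all visible to the compiler as ordinary code.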
This is an excellent way to work, and its only downside is that in practice the typically large amount of unfriendly-looking generated code tends to confuse and frighten the beginner. For this reason many code-based UI editors use one method or another to hide most of the generated code – .NET partial classes are an example of the need to do this. It is debatable whether code hiding is the best approach. Making the generated code easy to read and understand would be better. Similarly, making the editor truly “round trippable” would increase ease of use by allowing the programmer to modify generated code and see the result in the editor. Unfortunately the UI editor in Visual Studio, for example, has never been tolerant of changes to its generated code and makes no attempt to generate human-readable code. It is possibly this missed opportunity that has allowed markup languages to move into the same territory. It can be done, however, as proved by a number of Java-based UI editors that create clean and re-editable code.
Procedural code bad?
So what is so wrong with procedural code that it has to be removed from the construction of user interfaces? This is a very difficult question, but there seems to be a widely held and growing opinion that a UI should be created using declarative statements. This seems to stem from the example of the HTML-based web page, even though we know that this approach only results in a mess. Do modern programmers really look at the HTML page and think that this is the way to do things?
Fortunately not everyone is of this opinion. Indeed you can see the Java and the .NET 2.0 approach to UI construction, using interactive round-trip editors and procedural code, as a way of avoiding the markup mess. There are even approaches to building web pages, such as Spartan Ajax, that actually attempt to move in the procedural direction by taking as their motto “no more HTML”! In this case the situation is turned on its head: rather than removing JavaScript from the web page, it’s HTML that is removed, and the entire page is generated procedurally using Ajax techniques. All of these developments have happened just at the time that Microsoft introduces XAML into .NET 3.0.
Code behind, code in front, code all over the place
Of course Microsoft already has a proposed solution to the markup mess in ASP.NET, in the form of the code-behind architecture. This attempts to separate the markup and the procedural code into separate documents that have clearly defined ways of interacting. XAML can be used in this code-behind mode, but it can also be used with code embedded. You can define events and triggers within the markup, and even introduce functions and blocks of code within it. This is a retrogressive step to say the least, and it’s definitely possible to create XAML pages that are every bit as opaque and brittle as a web page plus script. When you add to this the fact that the whole system introduces multiple ways of doing the same task you can see that it’s not a good idea. Attributes within tags, content between tags, property tags, meta tags defining names that are visible in code, the problems of having to force everything into a tree structure, and code that isn’t within the markup structure at all: together these quickly become a nightmare.
Of course you can use XAML and code mixed in a disciplined way, but then it was possible to use the Goto without making a mess. The point is that we developed structured languages to force the less than perfect programmer to be more perfect. If you can misuse a language facility you can guarantee that it will be misused and we will have to pay the price in bugs and high maintenance costs. Freedom of mode of expression is for poetry not programming.
The strange twist to this story is that most programmers will create XAML-based UIs via a drag-and-drop editor and probably never give the generated markup a second look. When the application is run the XAML is converted by the system to procedural code of the sort that would have been created by any .NET 2.0 language instantiating classes and setting properties. This really does raise the question of why an intermediate markup language is needed at all. You might say that it’s so that a subset of the XAML can be loaded into a web browser, but such a scheme requires a new browser plug-in that could just as easily interpret the C# or the VB procedural code. Equally, the designer could just as easily output C# or VB procedural code. If the argument for using XAML as an intermediate code is that a single designer can generate code for either .NET language, then it would be just as appropriate to keep its details hidden from the average programmer. The intermediate design language could have been kept hidden and could have used whatever structures it needed to capture the details of the design, rather than trying to force it all into a tree-structured markup hierarchy.
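To make the equivalence between markup and procedural code concrete, here is a toy translator written in Java. It is only a sketch under stated assumptions, not a description of how WPF actually processes XAML: the tag names (Panel, Button, Label), the setXxx and add naming conventions, and the class name MarkupToCode are all invented for illustration. It walks a small XML tree and emits one “instantiate” statement per element, one “set property” statement per attribute, and one “add to parent” statement per nested element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical toy translator: UI markup in, procedural statements out.
class MarkupToCode {
    private final List<String> lines = new ArrayList<>();
    private int counter = 0;

    /** Parses the markup and returns the generated statements in order. */
    static List<String> translate(String markup) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            markup.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            MarkupToCode translator = new MarkupToCode();
            translator.emit(root);
            return translator.lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse markup", e);
        }
    }

    /** Emits code for one element and its children; returns its variable name. */
    private String emit(Element element) {
        String type = element.getTagName();
        String variable = Character.toLowerCase(type.charAt(0))
                + type.substring(1) + (++counter);

        // Element -> instantiate a class of the same name.
        lines.add(type + " " + variable + " = new " + type + "();");

        // Attribute -> set a property (values are not escaped in this toy).
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            lines.add(variable + ".set" + attribute.getNodeName()
                    + "(\"" + attribute.getNodeValue() + "\");");
        }

        // Nested element -> build the child, then add it to this parent.
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                String child = emit((Element) children.item(i));
                lines.add(variable + ".add(" + child + ");");
            }
        }
        return variable;
    }
}
```

Given the input `<Panel><Button Text="OK"/></Panel>`, the translator produces a declaration for the panel, a declaration and a setText call for the button, and a final call adding the button to the panel. Nothing in the markup survives that could not have been written as those four statements in the first place.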
Put simply, we just don’t need yet another markup language.