
April 10, 2005

Information wants to be free (as in speech)

I know when this blog started life it was meant to be about writing software. But I write software so much that writing about writing software doesn't always have as much appeal as you might think. So I write about other stuff.

Often stuff that annoys me. Is it me, or is there no shortage of that?

From the particularly clueless department comes the US Congress (why is it parliaments keep generating so much nonsense?), which is considering enforcing a standardized DRM (digital rights management) scheme on the music industry. Why is this suddenly so pressingly important? Well, apparently because of "inconvenience for consumers". Frankly, that looks to me like a load of cobblers; consumers seem to have no trouble buying an iPod and using iTunes :-). Rather, all of Apple's competitors who have missed the boat would stand to benefit. How convenient.

Meanwhile, proprietary document formats, not least Microsoft's Office formats, show where real inconvenience (and far worse) lies, not simply for consumers, but for entire industries.

So where is the US Congress's interest in mandating that document formats used for government business, or required for public purposes (such as the common legal requirement that records be maintained for a period of time), must be in an open, published format? The US state of Massachusetts appears to be heading in this direction. Surely if moves are going to be made to enforce interoperability (which, of course, I am entirely for), then rather than addressing a minor aspect of an emerging market, it would make much more sense to address a genuinely important aspect of interoperability, namely office formats.

Why is this important? Two reasons really. One impacts on the creator of documents. In essence, when using a proprietary format, use of that content even by its creator is ultimately determined by the developer of the piece of software used to create it. Sure, the license terms of that software, or its fundamental operation, are unlikely to change from day to day. But try opening a 10-year-old Word document in the latest version of Word. Does it open an utterly faithful version of the document as created? Chances are that it won't. Is this really important? I'll let you decide for your own purposes.
And in another 5 years, will it open at all?
No problem, you might reply, I'll just run the old version of the app I used to create that document. While that might be feasible for the next 5 or 10 years, is it feasible for the next two or three decades?

Open, published, standardized formats provide a guarantee which no software developer (none, not even the very biggest) can: that your work will remain accessible (in essence, continue to exist) in perpetuity.
As documentation becomes increasingly electronic, this is not simply a luxury, it is fundamentally important. Our culture is really what we create. If we lose the ability to read, to listen to, to experience our cultural creations, we lose much of that culture.

There is a second reason why open document formats are profoundly beneficial to the entire online world, and increasingly the entire developed world. They provide a platform for innovation. By standardizing document formats, users are able to exercise genuine choice over the software they use, rather than being locked into whatever choice they or someone else made at whatever point they made it. We are all then much less dependent on the choices software developers make, which might have more to do with their strategic advantage than with our needs as users.

OK, so how, in theory, do open formats help with this? And in practice, is there any evidence that this is more than Smithian (invisible) hand waving?
Right now, if someone sends you a Word document, how do you view it? You use Word. OK, some applications will translate Word formats and make them accessible, even editable (Apple's TextEdit and Pages are obvious examples). So we have interoperability, right? Only on the most trivial level. Complex aspects of any document format, particularly one that is both unpublished and liable to change, are extremely difficult to make work properly. Even Microsoft has had reasonably well-known difficulties when upgrading file formats, and this involved the team who actually developed the file formats in the first place, with full access to the existing file format documentation.
Stable, published document formats allow much greater and more reliable interoperability between applications like OpenOffice and your Office documents. They also enable smaller developers to specialize in providing particular functionality, say a useful grammar checker that would work with whatever word processing application a user chose. This is not to say that this is impossible now, simply very difficult, and risky for a software developer on two fronts. One, because there is no guarantee that the underlying document format won't change at any time, and two, because the US DMCA (Digital Millennium Copyright Act) might actually make reverse engineering those file formats illegal. Regardless of your opinion of that, I'll tell you the reality for small software developers: we steadfastly avoid any situation which has even the slightest whiff of potential legal issues like this. There are several features we have not developed for Style Master precisely because of this kind of issue.

But is this not all simply hand waving and theory? Well, let's compare two significant areas of software use over the last decade.
A decade ago, the web was a niche game. While emerging in popular consciousness, few people really used it. There were few published standards, and a large number of very small developers developing browsers, editors, servers, and other web software. Microsoft launched IE with Windows 95.
Over the next decade, while we might focus on the growing and until recently seemingly complete dominance of IE as a browser, looking at the web as an entire ecosystem, we find significant flux in almost every aspect of that ecosystem. There is no one dominant server app for the web (with numerous open source and commercial applications in this space). No web editing software dominates: several applications from large companies like Macromedia, Adobe and Microsoft remain viable, while hundreds if not thousands of small developers (like us) develop viable, in many cases innovative, software which interoperates with other small developers' software and with that of the big players, because of the stable, published underlying formats like HTML and CSS.
Even among browsers, several browsers remain more than viable, despite Microsoft's seeming dominance for most of this decade. And thank goodness for that, as IE has essentially been unchanged since the release of version 6 nearly half a decade ago.

Let's compare this with the space of Office productivity software.

In 1995, while Microsoft Office was certainly a commercial success, it had far from the almost complete dominance of Office in 2005. Today, WordPerfect Office, Lotus SmartSuite and OpenOffice between them haggle over at most 10% of the Office market (accurate figures are difficult to come across). The entire decade has seen a concentration of Office's market dominance.

Why the difference? There are many reasons, from abysmal business decisions by competitors, to users' familiarity with their existing application, to Microsoft's enormous marketing budget (literally billions of dollars a year).
But above all else, once you start using Office (this is largely true of other such apps too) you are locked into the product because of all the content you create in that application's proprietary format. It is a huge risk to transition to another platform, regardless of any other benefits, for the fear of losing access to your existing documents.

So, in one field, despite the enormous marketing and R&D budgets, despite their ability to attract and keep the best and brightest people, Microsoft is one player among many (dominant in browsers, but in no other aspect of the web), while in another, they utterly dominate.
The difference? In both cases all the variables are almost identical. And you can't accuse MS of not trying to dominate the web the way it has Operating Systems and Office apps. The difference is that on the web there are open, standardized document formats (HTML/XHTML, CSS, JPEG and so on) and other open standards (protocols like HTTP), while in the Office space there are no such levelers of the playing field.

There is a third reason, possibly more important than either of these, why open document formats are profoundly important.
The network effect.

Google alone indexes in excess of 8 billion web pages. Increasingly, Google indexes not just HTML-based pages but many other kinds of format, among them Office formats. But with closed formats like Office, this indexing faces problems similar to those faced by developers of applications seeking interoperability with Office documents. Formats can change at any time, and Google must reverse engineer them to gain access to their content (my guess is they most likely index the plain text available in Office formats, with reference to little if any structural markup, like headings). This diminishes the value of content in such formats, and will do so increasingly as search engines get smarter at using document semantics to help boost result relevance.
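
To make the point about document semantics a little more concrete, here is a minimal sketch of how easily structure can be pulled out of an open, published format like HTML. It uses only Python's standard library. The names `HeadingExtractor`, `extract_structure` and `SAMPLE` are mine, purely for illustration, and this is in no way a description of how Google or any other search engine actually works.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingExtractor(HTMLParser):
    """Hypothetical helper: collects heading text separately from other text."""

    def __init__(self):
        super().__init__()
        self.headings = []   # (level, text) pairs, in document order
        self.body = []       # text found outside headings
        self._level = None   # level of the heading we are inside, if any
        self._buffer = []    # text pieces of the current heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, "".join(self._buffer).strip()))
            self._level = None

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)
        elif data.strip():
            self.body.append(data.strip())


def extract_structure(html):
    """Return (headings, body) for a fragment of HTML."""
    parser = HeadingExtractor()
    parser.feed(html)
    parser.close()
    return parser.headings, parser.body


# Made-up sample document, for illustration only.
SAMPLE = """
<h1>Open formats</h1>
<p>Why published standards matter.</p>
<h2>The network effect</h2>
<p>Search engines can read the structure.</p>
"""

if __name__ == "__main__":
    headings, body = extract_structure(SAMPLE)
    print(headings)
    print(body)
```

Because the meaning of `h1` to `h6` is published, anyone can write something like this in a few dozen lines and, say, weight heading text more heavily than body text. Doing the equivalent for an unpublished binary format means reverse engineering it first, and doing so again whenever it changes.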

If, like me, you believe "information wants to be free", then this is about much more than the ability of markets to function as they should to provide us with solutions. Using proprietary formats constrains the freedom and value of information in a way that is at odds with the world of the web.

So what can you do if you care?

Simple really. Use open, published document formats as much as is feasible. Don't lock others into your file format choices by sending them documents in proprietary formats (this goes for any document format, not simply Microsoft's). Where feasible, send any documents in proprietary formats back to whoever sent them to you, asking that they simply send them as plain text, or in the appropriate open format.
Investigate alternatives to your current software which will read and write open standard formats.
Make this an issue when making decisions in future about system choices.

Many of us however have significant investments in proprietary formats, so making the change will require an effort, there is no doubt. But the benefits to us individually, and more importantly to the world of information will be profound.

Plus it is going to happen. Depend on it. You may as well get on the train early and get a good seat.





1. You meant DMCA, not DCMA.

2. It will surprise many standardistas to learn that PDF is a published open format. It isn't true that Adobe owns it completely (though they're the ones who write new versions, e.g., versions that include tags for accessibility). In fact, at least three variations of PDF are winding their way through standards bodies for publication and ratification. HTML+CSS is just great most of the time, but when it isn't great, it's horrifically ungreat, as with complex documents containing footnotes and endnotes, neither of which exists at all in HTML. It is at least possible to specify those in PDF, for example.

Posted by: Joe Clark | Apr 11, 2005 3:16:38 AM

A very good post!! Thank you!

1. Surely you still don't have the deluded notions of politicians working for the betterment of the people rather than their own interests ? I know it's hard to accept the truth, but it is the truth. If you can prove me wrong, please do so, but I'd rather have you enjoy your life than waste it on a futile exercise. ;)

2. The point you made about (X)HTML/CSS vs M$ Word docs was very clear and precise. I personally think that any digital format that reaches a particular majority usage point (60+ %) in the market should by default become an openly published format, controlled by a standards body like HTML is, that any developer can implement in their apps without restrictions. That would have forced M$ to deal with the competition on an open and fair basis, and I'm certain many of the irritating bugs inside M$ Word would have been solved a long time ago.

3. As for iTunes and DRM formats. Well, I'm just waiting for the day the music biz wakes up and realises that their DRM model will not work in the long run, and therefore needs to change. The creative people - like you guys @ WestCiv - deserve to get paid for your/our work, but for a monopoly cartel of distribution companies to earn many times more than the artists is just sick and disgusting! The iTMS (& others) is great, but with the artist getting less than 5-10% of the sales price/track, I'm not interested. If they got 40-70% of SP then I would happily buy their stuff there. In the meanwhile I'll continue being part of the 'silent' revolution called P2P, which I know hurts the artists, but should hurt the old exec farts even more! ;) Sooner or later the message has to come through to both artists/producers and record/media companies that it's time for change, and the sooner they change the better for the artists/producers !!

A very good example of this has already happened in the photography & photographic agency business. Less than 5-6 years ago a photographer had to be a member of an agency to have their pictures in an image library available for stock sales. The agency (A) then had distribution partners (= other agencies) in other countries/territories that sold the pictures from A as well as from 10 more agencies. If a picture sold for $100 through another agency (B), then B generally took 50% of the sales price in commission; agency A then got $50 from B and took their cut, generally 50%, which left the photographer with $25 of the original sale. This was very much the norm in a business focused on shifting physical content within a limited physical area. With the digital image format and the internet, this outdated business model was doomed. Smart photographers joined online agencies like Alamy.com, Corbis, etc. and got a fixed price from each sale to any territory that was much better than their previous payments. So agencies were no longer such an attractive proposition to photographers, and many previously successful agencies began to wither and are now slowly fading away.

The sad fact is that iTMS could be the salvation for the music artists, but the monopoly interests of the music distributors (aka record companies) prevents this freedom from existing or at least flourishing. And that is the true crime that should be investigated by the authorities !!

Posted by: Anonymous Coward | Apr 11, 2005 9:24:49 PM

Joe, as always you are right :-)

I get confused about SCMA vs DMCA, I think it is a combination of

1. YMCA, by the Village People of course
2. Run DMC
3. The DMC (Disco Mix Club), which used to be like *the* DJs' organization back in the 80s and 90s. I think it is still around.
4. DCMs, this so-bad-it-was-good nightclub on Oxford Street in Sydney in the 80s (it may still be there, but...)

so I get this musical confusion thing happening

Indeed you are right about PDF, and while not mentioning it explicitly, I have that in mind as a kind of model for open "proprietary" formats. After all, I never open Acrobat to read PDF files (Mac OS X Preview is very fast for that), and indeed PDF is rather intimately part of Mac OS X.


Posted by: John Allsopp | Apr 12, 2005 6:40:39 AM


I largely agree with you (though I try to be as fair as possible to artists, yet I agree, they are still getting royally screwed).

I think this is in part the artists' fault. Recently I saw an article which likened the music industry to a lottery, with a small number of artists who win big, and many many who don't. It's even worse than that because the lottery is stacked in favour of those who appeal to teenage girls. Yet time after time, artists follow the tried and tested route of seeking recording contracts, recording albums at a huge loss, seeking airtime on Clear Channel stations...

Wake up guys. Do something different. Use the disruptive technologies of blogging, inexpensive professional digital recording (15 years ago or so, when I was doing a little of this stuff, we would have murdered for something like GarageBand, let alone the "pro" gear out there), mp3s, podcasting, whatever. As a wise man once said, "if you always do what you always did, you always get what you always got".


Posted by: John Allsopp | Apr 12, 2005 6:49:34 AM
