The Web of Alexandria (follow-up)

Bret Victor / May 26, 2015

A follow-up (disambiguation? expansion? context?) regarding The Web of Alexandria.

* * *

In The Web of Alexandria, I suggested that some very stable and reliable media, DNA and print, owe their stability and reliability to replication and retention -- every reader gets a copy, and every reader keeps their copy. The web, on the other hand, follows the strategy used for books before the printing press -- put a single copy in an institution, allow readers to come visit, hope it doesn't go up in smoke.

Whenever the ephemerality of the web is mentioned, two opposing responses tend to surface. Some people see the web as a conversational medium, and consider ephemerality to be a virtue. And some people see the web as a publication medium, and want to build a "permanent web" where nothing can ever disappear.

Neither position is mine. If anything, I see the web as a bad medium, at least partly because it invites exactly that conflict, with disastrous effects on both sides.

* * *

In "As We May Think", Vannevar Bush is concerned explicitly with the "common record" -- humanity's grand accumulation of art and knowledge:

Science has provided the swiftest communication between individuals; it has provided a record of ideas and has enabled man to manipulate and to make extracts from that record so that knowledge evolves and endures throughout the life of a race rather than that of an individual.

When he gets specific about what exactly the record consists of, he appears to identify it with what we might call "published material":

If the human race has produced since the invention of movable type a total record, in the form of magazines, newspapers, books, tracts, advertising blurbs, correspondence, having a volume corresponding to a billion books...

It is this common record of public thought -- the "great conversation" -- whose stability and persistence are crucial, both for us alive today and for those who will come after.

Jill Lepore: The Cobweb

For the law and for the courts, link rot and content drift, which are collectively known as “reference rot,” have been disastrous. In providing evidence, legal scholars, lawyers, and judges often cite Web pages in their footnotes; they expect that evidence to remain where they found it as their proof, the way that evidence on paper—in court records and books and law journals—remains where they found it, in libraries and courthouses. But a 2013 survey of law- and policy-related publications found that, at the end of six years, nearly fifty per cent of the URLs cited in those publications no longer worked. According to a 2014 study conducted at Harvard Law School, “more than 70% of the URLs within the Harvard Law Review and other journals, and 50% of the URLs within United States Supreme Court opinions, do not link to the originally cited information.”

The overwriting, drifting, and rotting of the Web is no less catastrophic for engineers, scientists, and doctors. Last month, a team of digital library researchers based at Los Alamos National Laboratory reported the results of an exacting study of three and a half million scholarly articles published in science, technology, and medical journals between 1997 and 2012: one in five links provided in the notes suffers from reference rot. It’s like trying to stand on quicksand.

To forget the past is to destroy the future. This is where Dark Ages come from.

However...

* * *

Photos from your friend's party are not part of the common record.

Nor are most casual conversations. Nor are search histories, commercial transactions, "friend networks", or most things that might be labeled "personal data". These are not deliberate publications like a bound book; they are not intended to be lasting contributions to the public discourse.

The fact that they can persist -- that the medium is constructed such that this data accidentally travels further and lasts longer than anyone intended, and that this data is easily retained and exploited by intermediaries -- is an enormous and terrifying problem of its own.

Maciej Cegłowski: The Internet With A Human Face

I've come to believe that a lot of what's wrong with the Internet has to do with memory. The Internet somehow contrives to remember too much and too little at the same time, and it maps poorly on our concepts of how memory should work.

In our elementary schools in America, if we did something particularly heinous, they had a special way of threatening you. They would say: "This is going on your permanent record".

It was pretty scary. I had never seen a permanent record, but I knew exactly what it must look like. It was bright red, thick, tied with twine. Full of official stamps.

The permanent record would follow you through life, and whenever you changed schools, or looked for a job or moved to a new house, people would see the shameful things you had done in fifth grade.

How wonderful it felt when I first realized the permanent record didn't exist. They were bluffing! Nothing I did was going to matter! We were free!

And then when I grew up, I helped build it for real.

* * *

It doesn't make sense to make blanket statements like "content on the web should be persistent" or "content on the web should be ephemeral". Instead, we need to recognize that this "web" thing is conflating two very different forms of discourse, forms that used to be clearly and deliberately distinct.

The "web" is not a part of nature. It was not discovered; we don't have to just accept it. The "web" is an infrastructural system that was built by people, and it was built very recently and very sloppily. It currently has the property that it forgets what must be remembered, and remembers what must be forgotten. It manages to screw up both the sacredness of the common record and the sacredness of private interaction.

For people who have grown up with HTTP and URLs, it can be hard to see anything wrong with them. The tendency is to blame individual behavior -- "You should have mirrored that data!" "You shouldn't have put those photos online!" But the technical properties of a medium shape social practice, and if the resulting social practice is harmful, it's the medium that is at fault.

* * *

If your goal is to fix this broken medium, please consider that, historically, people have relied on different media for different social purposes, and have relied on a clear understanding of how the technical properties of each medium determine the social and temporal scope of its messages.

Think about speech, letters, newspapers, books, smoke signals... Each medium serves only a particular subset of social purposes, and each medium is technically transparent enough that people can understand what's happening when they use it.

The web is a single, increasingly complex infrastructure which has been adopted for mutually incompatible purposes. And almost nobody has a clear understanding of what the hell is actually happening when they type into a box on their screen.

Well-meaning efforts are currently under way to improve the web, each for some particular purpose, and they generally have the effect of making the overall system more complex, less transparent, and more hostile to other purposes. For example, people who see the web as a publication medium have proposed persistent, content-addressed schemes such as IPFS, with (as far as I can see) little provision for the necessary ephemerality of personal and conversational data. And people who see the web as a conversational medium or application platform have proposed policies such as mandatory encryption, which could have detrimental effects on a stable and accessible long-term common record.

It might be that the notion of "the" web -- in the sense of a single complex protocol supporting all human communicative purposes -- isn't a very good idea.