Re: destabilizing core technologies: was Re: An RDF wishlist
Dan, A somewhat longer response with references to some of the discussion on the list yesterday.

On 7/1/2010 6:30 AM, Dan Brickley wrote: Hi Patrick, snip

I don't know what else to call the US Department of Defense mandating the use of SGML for defense contracts. That is certainly real-world, and it seems hard to step on an economic map of the US without stepping in defense contracts of one sort or another.

Yes, you are right. It is fair and interesting to bring up this analogy and associated history. SGML even got a namecheck in the original announcement of the Web, see http://groups.google.com/group/alt.hypertext/msg/395f282a67a1916c and even today HTML is not yet re-cast in terms of XML, much less SGML. Many today are looking to JSON rather than XML, perhaps because a lack of courage/optimism amongst XML's creators saddled it with more SGML heritage than it should now be carrying. These are all reasons for chopping away more bravely at things we might otherwise be afraid of breaking. But what if we chop so much the original is unrecognisable? Is that so wrong? What if RDF's biggest adoption burden is the open-world triples model?

Without pre-judging what needs to be changed (if anything), consider the history of HTML 3.2, possibly the most successful markup language of all time. HTML 3.2 did not:

1) Have puff pieces extolling its virtues in Scientific American
2) Have a cast of thousands of researchers publishing papers and dissertations
3) Have anywhere near the investment of millions of dollars in various software components
4) Take ten years, only to have people lamenting its lack of adoption

HTML 3.2 did have:

1) *A need perceived by users as needing to be met*
2) *A method to meet that need that was acceptable to users*

Note: for purposes of the rest of this post I concede that I *don't know* the answer to either of those questions for any semantic integration technology.
If I did, that answer would have been on the first lines of this post. What I am suggesting is that we need to examine those two questions instead of deciding that RDF was right all along and users are simply failing to do their part.

Clinging to decisions that seemed right at the time they were made is a real problem. It is only because we make decisions that we have the opportunity to look back and wish we had decided differently. That is called experience. If we don't learn from experience, well, there are other words to describe that. :)

So, I wouldn't object to a new RDF Core WG, to cleanups including e.g. 'literals as subjects' in the core data model, or to seeing the formal semantics modernised/simplified according to the latest wisdom of the gurus. I do object to the idea that the proposed changes are the kinds of thing that will make RDF significantly easier to deploy. The RDF family of specs is already pretty layered. You can do a lot without ever using or encountering rdf:Alt, or reification, or OWL DL reasoning, or RIF. Or reading a W3C spec. The basic idea of triples is pretty simple and even sometimes strangely attractive, however many things have been piled on top. But simplicity is a complex thing! Having a simple data model, even simple, easy-to-read specs, won't save RDF from being a complex-to-use technology.

The proposed changes *may not* be the ones that are required. On the other hand, saying that we have invested millions of dollars in RDF as-is, so it has to work, should result in a return of false from any reasoning engine, including human ones. The amount of effort expended in trying to stop the tides may be enormous, but that isn't a justification of a technique or even the project.

We have, I think, a reasonably simple data model. You can't take much away from the triples story and be left with anything sharing RDF's most attractive properties. The specs could be cleaner and more accessible.
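To make the "reasonably simple data model" under discussion concrete, here is an illustrative sketch in plain Python, with no RDF library: a statement is a (subject, predicate, object) triple and a graph is just a set of them. The example URIs are invented (the FOAF property URIs are real), so treat this as a sketch of the model, not anyone's implementation.

```python
# The RDF triple model in miniature: a graph is a set of
# (subject, predicate, object) tuples.
graph = {
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/bob"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/name", "Bob"),
}

def objects(graph, subject, predicate):
    """Return all objects stated for a given subject/predicate pair."""
    return {o for s, p, o in graph if s == subject and p == predicate}

# Merging data from two sources is just set union -- the property that
# makes the triple model attractive for data integration.
more = {("http://example.org/bob", "http://xmlns.com/foaf/0.1/knows",
         "http://example.org/alice")}
merged = graph | more
```

The point of the sketch is how little machinery "the core" actually is, which is Dan's argument: there isn't much left to throw out.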
But I know plenty of former RDF enthusiasts who knew the specs and the tech inside out, and still ultimately abandoned it all. Making RDF simpler to use can't come just from simplifying the specs; when you look at the core, and it's the core that's the problem, there just isn't much left to throw out.

It may not be a problem with RDF at all. It could be that RDF is an entirely consistent and useful model, but just not for any problem that users see. That is really the question for adoption, isn't it? Do users see the same problem? If they don't, building systems for problems they don't see seems like a poor strategy. Moreover, knowing they don't see it and continuing with the same activity, well, you know what repeating the same thing and expecting a different result is.

Some of the audience for these postings will remember that the result of intransigence on the part of the SGML community was XML. XML was a giant gamble. It's instructive to look back at what happened, and
Re: destabilizing core technologies: was Re: An RDF wishlist
Patrick, Without disputing your wider point that HTML hit the sweet spot of usability and utility, I will dispute the following:

HTML 3.2 did have: 1) *A need perceived by users as needing to be met*

Did users really know they wanted to link documents together to form a world wide web? I spent much of the late nineties persuading companies and individuals of the merits of being part of this new web thing and then gritting my teeth when it came to actually showing them how to get a page online - it was a painful confusion of text editors (no, you can't use WordPerfect), fumbling in the dark (no WYSIWYG), dialup (you mean I have to pay?) and FTP! When MS FrontPage came along the users loved it because all that pain went away, but they could not understand why so many people laughed at the results.

I think we all have short memories. The advantage that HTML had was that people were able to use it before creating their own, i.e. they were already reading websites so could at some point say: I want to make one of those. The problem RDF is gradually overcoming is this bootstrapping stage. It has a harder time because, to be frank, data is dull. But now people are seeing some of the data being made available in browseable form, e.g. at data.gov.uk or dbpedia, and saying: I want to make one of those.

Ian
Re: destabilizing core technologies: was Re: An RDF wishlist
Ian, On 7/2/2010 5:25 AM, Ian Davis wrote:

Patrick, Without disputing your wider point that HTML hit the sweet spot of usability and utility, I will dispute the following: HTML 3.2 did have: 1) *A need perceived by users as needing to be met* Did users really know they wanted to link documents together to form a world wide web? I spent much of the late nineties persuading companies and individuals of the merits of being part of this new web thing and then gritting my teeth when it came to actually showing them how to get a page online - it was a painful confusion of text editors (no, you can't use WordPerfect), fumbling in the dark (no WYSIWYG), dialup (you mean I have to pay?) and FTP! When MS FrontPage came along the users loved it because all that pain went away, but they could not understand why so many people laughed at the results.

Well, possibly. I am not sure that is how users saw the need. That's the rub; I think it is hit or miss. In the publishing area where I worked when the web came along, it was a question of being able to make low-return material available to a wider audience for less distribution cost. Not so much being part of a linked web as making material accessible. How many users saw it that way I cannot say.

I think we all have short memories. The advantage that HTML had was that people were able to use it before creating their own, i.e. they were already reading websites so could at some point say: I want to make one of those. The problem RDF is gradually overcoming is this bootstrapping stage. It has a harder time because, to be frank, data is dull. But now people are seeing some of the data being made available in browseable form, e.g. at data.gov.uk or dbpedia, and saying: I want to make one of those.

Good point. But the basic tools to handle data have been around for a long time. Why so long to get to the place where users can say: I want to make one of those? Which I agree is a very good strategy.

Hope you are having a great day!
Patrick

--
Patrick Durusau
patr...@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau
Re: destabilizing core technologies: was Re: An RDF wishlist
On 2 Jul 2010, at 11:39, Patrick Durusau wrote: Good point. But the basic tools to handle data have been around for a long time.

The web could only get going in the '90s when:

1) Windows 95 (a GUI) became widely deployed and relatively stable and had support for threads
2) modems were cheap and available
3) the Soviet Union had fallen, so the fear mongers had no security buttons to press

In 1997 the SSL layer (https) gave an extra boost as it made commerce possible.

Why so long to get to the place where users can say: I want to make one of those?

There are many reasons, but most of all is that people don't in fact understand hypertext as being linked information. Or the people in charge of data don't think of it that way easily. Engineers have for 50 years been educated in closed-world systems; every programming language, including Prolog and Lisp, has local naming conventions that don't scale globally, and database people make a fortune with SQL. These people interact only very lightly with the web. Usually there is a layer of Web Monkeys between them and the web. So when you ask those engineers to build a global distributed information system, they come up with the closest to what they know - which is remote method calls - and they invent XML-RPC, which leads to SOAP. So it is not easy to get the knowledgeable people on board. The Web Monkeys are not very good at modelling, and the back-end engineers don't understand the web. Finally, the business people have problems understanding abstract concepts such as the network effect.

It just took time then to do a few demos, which the University of Berlin put together, slowly getting other people on board. It just takes time to rewire the brains of millions of people.

Henry
Re: destabilizing core technologies: was Re: An RDF wishlist
Hi Ian,

But now people are seeing some of the data being made available in browseable form e.g. at data.gov.uk or dbpedia and saying, I want to make one of those.

I don't really believe that people would say, after browsing dbpedia, I want to make one of those. That's not the user experience users expect to get. Please remember the Semantic-Web-UI discussion last time. People tend to use/experience richer visualisations of the data/knowledge/information in the background. I hear often, especially of late, the term 'story telling' - and that's it, I think.

Cheers, Bob
Re: destabilizing core technologies: was Re: An RDF wishlist
Henry, On 7/2/2010 5:58 AM, Henry Story wrote:

On 2 Jul 2010, at 11:39, Patrick Durusau wrote: Good point. But the basic tools to handle data have been around for a long time. The web could only get going in the '90s when 1) Windows 95 (a GUI) became widely deployed and relatively stable and had support for threads 2) modems were cheap and available 3) the Soviet Union had fallen, so the fear mongers had no security buttons to press. In 1997 the SSL layer (https) gave an extra boost as it made commerce possible.

Err, you are omitting one critical fact - the one that led to TBL's paper being rejected from the hypertext conference. Links could fail. Reportedly, one of the critical failings of early hypertext systems was that links could not be allowed to fail. That blocked any sort of global scaling. Hmmm, wonder what happens when links fail with RDF, considering that it requires the yet-to-be-implemented 303 solution?

Why so long to get to the place where users can say: I want to make one of those? There are many reasons, but most of all is that people don't in fact understand hypertext as being linked information. Or the people in charge of data don't think of it that way easily.

Sorry, I don't understand what you mean by: that people in fact don't understand hypertext, as being linked information. ???

Engineers have for 50 years been educated in closed-world systems, every programming language including Prolog and Lisp has local naming conventions that don't scale globally, and database people make a fortune with SQL. These people interact only very lightly with the web. Usually there is a layer of Web Monkeys between them and the web.

The reason I don't understand your earlier point or this one is that users for hundreds of years have been well familiar with texts making references to other texts, which to my mind qualifies as hypertext, even if it did not have the mechanical convenience of HTML. What else do you think hypertext would be?
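The "303 solution" Patrick jabs at is the httpRange-14 convention: a URI that names a non-document thing should answer GET with 303 See Other, redirecting to a document about the thing, while a failed link is simply a 404 the Web tolerates. A rough, illustrative sketch of that dereferencing logic in Python (the status codes are real HTTP; the classification labels are invented for illustration):

```python
# Sketch of the httpRange-14 "303 solution": classify what a
# dereferenced URI's HTTP status code tells a Linked Data client.
def classify_dereference(status: int) -> str:
    if 200 <= status < 300:
        return "information resource: the URI names a retrievable document"
    if status == 303:
        return "see other: the URI names a thing; follow Location for a description of it"
    if status in (301, 302, 307, 308):
        return "redirect: retry at the new location"
    if status == 404:
        return "broken link: the Web tolerates this; what should an RDF consumer do?"
    return "other failure"
```

The last branch is exactly Patrick's question: HTML browsing degrades gracefully on a 404, whereas it is much less obvious what a reasoner consuming triples should conclude from one.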
So when you ask those engineers to build a global distributed information system, they come up with the closest to what they know - which is remote method calls - and they invent XML-RPC, which leads to SOAP. So it is not easy to get the knowledgeable people on board. The Web Monkeys are not very good at modelling, and the back-end engineers don't understand the web. Finally, the business people have problems understanding abstract concepts such as the network effect. It just took time then to do a few demos, which the University of Berlin put together, slowly getting other people on board. It just takes time to rewire the brains of millions of people.

Well, I am not so sure that we need to rewire the brains of millions of people so much as we need to have our technologies adapt to them. Yes? Granting that consumers can and do adapt to some technologies, the more consistent a technology is with how people think and work, the easier its adoption. Yes?

Hope you are looking forward to a great weekend!

Patrick

--
Patrick Durusau
patr...@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau
Re: destabilizing core technologies: was Re: An RDF wishlist
Henry, On 7/2/2010 7:11 AM, Henry Story wrote: snip

Well, I am not so sure that we need to rewire the brains of millions of people so much as we need to have our technologies adapt to them. Yes?

When it was discovered that the earth was round, the brains of everyone on earth had to be rewired. Of course, people only selectively did that. Those who believed it and understood the implications rewired the brains of just enough people so they could make fortunes, colonise whole continents and rule the world for centuries.

Actually, the discovery that the world was round - I assume you mean by Columbus - is a myth. It was well known that the earth was round long before then. The disagreement was about how large the earth was. As a matter of fact, Columbus had seriously underestimated the distance of the voyage. Nor did he or any of the opponents of his voyage expect to discover a new continent. The most accessible account of the myth about the world being flat is Umberto Eco's Serendipities: Language and Lunacy (1998) - see the chapter The Force of Falsity.

Hope you are having a great day!

Patrick

--
Patrick Durusau
patr...@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau
Re: destabilizing core technologies: was Re: An RDF wishlist
Bob Ferris wrote:

Hi Ian, But now people are seeing some of the data being made available in browseable form e.g. at data.gov.uk or dbpedia and saying, I want to make one of those. I don't really believe that people would say after browsing dbpedia I want to make one of those.

s/people/organizations/g. No, they don't say: I want to make one of those; they say things like: I would like to have one of those. Very similar to organizations (and people) saying: I want a Web Site. The key to any tech adoption (in the real world) ultimately comes down to making opportunity costs palpable. It's always ultimately about tangible (rather than hypothetical) value.

That's not the User Experience users expect to get. Please remember the Semantic-Web-UI discussion last time.

UI is not the issue; that's such a misconception. Netscape, Google, Amazon, eBay, Yahoo! etc. started off with what many would call darn ugly Web Sites. The key to their success was using HTML to construct short paths to:

1. Value Discovery
2. Opportunity Cost Palpability

Linked Data is ultimately about loose coupling of Information and Data (which aren't the same thing) - basically, enabling us to free ourselves of the inherent subjectivity of all projected information via access to The Data Sources Behind The Information. We simply need user interaction patterns that build on the burgeoning Linked Data substrate. For those who continue to be confused about Web 2.0 (a realm that emerged fundamentally as a contemptuous response to the hypothesis-heavy RDF), look at how it came to be:

1. Feeds
2. Feed Syndication
3. Pingers
4. Friending

All of the items above represent patterns for social interaction via the Web.

People are tending to use/experience richer visualisations of the data/knowledge/information in the background. I hear often, especially of late, the term 'story telling' - and that's it, I think.

Story telling also works well.
That said, RDF's story remains one of the very worst ever told, IMHO.

Kingsley

--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Re: destabilizing core technologies: was Re: An RDF wishlist
Bob Ferris wrote:

Hi Ian, Am 02.07.2010 12:26, schrieb Ian Davis: On Fri, Jul 2, 2010 at 11:13 AM, Bob Ferris z...@elbklang.net wrote: Hi Ian, But now people are seeing some of the data being made available in browseable form e.g. at data.gov.uk or dbpedia and saying, I want to make one of those. I don't really believe that people would say after browsing dbpedia I want to make one of those. That's not the User Experience users expect to get. Please remember the Semantic-Web-UI discussion last time. People are tending to use/experience richer visualisations of the data/knowledge/information in the background. I hear often, especially of late, the term 'story telling' - and that's it, I think.

Actually there is a class of people that do say that. They want to be the dbpedia of X, whatever X is. No matter how much we can criticise dbpedia for its appearance or data quality, we have to applaud the fact that it defined a new category of service.

You are right; I welcomed it also when people say, after they have browsed dbpedia: I want to make one of those. However, I believe also that the number X of people saying this is much smaller than the number Y of people wanting a richer User Experience. Cheers, Bob

Bob,

Just as the DBpedia node led to the LOD Cloud, there is a similar movement (vertical and horizontal) re. organizations seeking their private and/or service-specific variants. You would be quite surprised at the number of DBpedia (and other LOD cloud node) variants already operating as private, lookup-oriented data spaces within organizations. This train left the station a long time ago. People want the kind of valuable experience that dense lookup meshes like DBpedia (and the rest of LOD) accord. What is Google when all is said and done? A huge Table (geographically splintered across a massive physical data storage complex). People want to Find Stuff with Precision.
That's one example of what Linked Data ultimately delivers without the underlying costs of a Google-style data complex. We just need to continue to orient ourselves (Linked Data technology vendors) towards better user interaction patterns that align with problems that have reached breaking point with users. Another example is an Open Social Web. Privacy matters, and there's lots of stuff from the Linked Data realm (e.g. WebIDs, FOAF+SSL, ACLs, Cloud Storage etc.) that will make this happen too.

BTW - thanks for veering this conversation toward the practical rather than the theoretical!

--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Re: destabilizing core technologies: was Re: An RDF wishlist
Patrick Durusau wrote:

Henry, On 7/2/2010 5:58 AM, Henry Story wrote: On 2 Jul 2010, at 11:39, Patrick Durusau wrote: Good point. But the basic tools to handle data have been around for a long time. The web could only get going in the '90s when 1) Windows 95 (a GUI) became widely deployed and relatively stable and had support for threads 2) modems were cheap and available 3) the Soviet Union had fallen, so the fear mongers had no security buttons to press. In 1997 the SSL layer (https) gave an extra boost as it made commerce possible. Err, you are omitting one critical fact. The one that led to TBL's paper being rejected from the hypertext conference. Links could fail.

And there lies the critical misconception. 404 is the feature that debunked the misconception and led to the Web scaling. How about that? What happened to the Hypertext Conference? What happened to the Web, and the zillion Web-themed conferences later? The ingenuity of the 404 is that it maps reality. We are all imperfect; our imperfections are features rather than bugs! TimBL flipped the script. More flipping to come re. the business model too, but you need to grok the Magic of Being You! first.

Reportedly, one of the critical failings of early hypertext systems was that links could not be allowed to fail. That blocked any sort of global scaling. Hmmm, wonder what happens when links fail with RDF, considering that it requires the yet-to-be-implemented 303 solution?

See my comments above. The pursuit of globally perfect statements in the RDF realm is one of its many flaws. If I recall, Pat Hayes once said to me: Take DBpedia down! Just because of some erroneous data. My reply to him was: No, I can fix the records without taking DBpedia down; the data is in a DBMS where it is partitioned using Named Graphs, etc. As of today, he even suggests I stop using RDF (truly LOL!!).
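Kingsley's fix-in-place argument rests on named graphs: a quad store keys each triple by the graph (source) it came from, so one source's bad data can be replaced without touching the rest. An illustrative toy sketch (graph names and triples invented; real stores such as the one he describes use SPARQL and persistent storage):

```python
# Toy quad store: partition triples by named graph so one source
# can be corrected in place while other graphs stay untouched.
class QuadStore:
    def __init__(self):
        self.graphs = {}  # graph name -> set of (s, p, o) triples

    def add(self, graph, triple):
        self.graphs.setdefault(graph, set()).add(triple)

    def replace_graph(self, graph, triples):
        """Fix a single source's data without taking the store down."""
        self.graphs[graph] = set(triples)

    def triples(self):
        for g in self.graphs.values():
            yield from g

store = QuadStore()
store.add("urn:graph:dbpedia", ("ex:Berlin", "ex:population", "wrong"))
store.add("urn:graph:other", ("ex:Paris", "ex:population", "2.1e6"))
# Correct only the erroneous graph; the other graph is untouched.
store.replace_graph("urn:graph:dbpedia",
                    [("ex:Berlin", "ex:population", "3.4e6")])
```

The design point is that the graph name acts as a fourth column, which is what turns "take it all down" into "swap one partition".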
[SNIP]

--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Re: destabilizing core technologies: was Re: An RDF wishlist
Hi Patrick,

On Thu, Jul 1, 2010 at 11:39 AM, Patrick Durusau patr...@durusau.net wrote: Dan, Just a quick response to only one of the interesting points you raise:

It's clear that many workshop participants were aware of the risk of destabilizing the core technologies just as we are gaining some very promising real-world traction. That was a relief to read. For those who have invested time and money in helping us get this far, and who had the resources to participate, this concern was probably enough to motivate participation.

It might be helpful to recall that destabilizing the core technologies was exactly the approach that SGML took when its little annoyances [brought] friction and frustration to those working with [SGML]... There was ...promising real-world traction. I don't know what else to call the US Department of Defense mandating the use of SGML for defense contracts. That is certainly real-world, and it seems hard to step on an economic map of the US without stepping in defense contracts of one sort or another.

Yes, you are right. It is fair and interesting to bring up this analogy and associated history. SGML even got a namecheck in the original announcement of the Web, see http://groups.google.com/group/alt.hypertext/msg/395f282a67a1916c and even today HTML is not yet re-cast in terms of XML, much less SGML. Many today are looking to JSON rather than XML, perhaps because a lack of courage/optimism amongst XML's creators saddled it with more SGML heritage than it should now be carrying. These are all reasons for chopping away more bravely at things we might otherwise be afraid of breaking. But what if we chop so much the original is unrecognisable? Is that so wrong? What if RDF's biggest adoption burden is the open-world triples model?

Clinging to decisions that seemed right at the time they were made is a real problem. It is only because we make decisions that we have the opportunity to look back and wish we had decided differently.
That is called experience. If we don't learn from experience, well, there are other words to describe that. :)

So, I wouldn't object to a new RDF Core WG, to cleanups including e.g. 'literals as subjects' in the core data model, or to seeing the formal semantics modernised/simplified according to the latest wisdom of the gurus. I do object to the idea that the proposed changes are the kinds of thing that will make RDF significantly easier to deploy. The RDF family of specs is already pretty layered. You can do a lot without ever using or encountering rdf:Alt, or reification, or OWL DL reasoning, or RIF. Or reading a W3C spec. The basic idea of triples is pretty simple and even sometimes strangely attractive, however many things have been piled on top. But simplicity is a complex thing! Having a simple data model, even simple, easy-to-read specs, won't save RDF from being a complex-to-use technology.

We have, I think, a reasonably simple data model. You can't take much away from the triples story and be left with anything sharing RDF's most attractive properties. The specs could be cleaner and more accessible. But I know plenty of former RDF enthusiasts who knew the specs and the tech inside out, and still ultimately abandoned it all. Making RDF simpler to use can't come just from simplifying the specs; when you look at the core, and it's the core that's the problem, there just isn't much left to throw out.

Some of the audience for these postings will remember that the result of intransigence on the part of the SGML community was XML. XML was a giant gamble. It's instructive to look back at what happened, and to realise that we don't need a single answer (a single gamble) here. Part of the problem I was getting at earlier was of dangerously elevated expectations... the argument that *all* data in the Web must be in RDF. We can remain fans of the triple model for simple factual data, even while acknowledging there will be other useful formats (XMLs, JSONs).
Some of us can gamble on let's use RDF for everything. Some can retreat to the original, noble and neglected metadata use case, and use RDF to describe information but leave the payload in other formats; others (myself at least) might spend their time trying to use triples as a way of getting people to share the information that's inside their heads rather than inside their computers.

I am not advocating in favor of any specific changes. I am suggesting that clinging to prior decisions simply because they are prior decisions doesn't have a good track record. Learning from prior decisions, on the other hand, such as the reduced (in my opinion) feature set of XML, seems to have a better one. (Other examples left as an exercise for the reader.)

So, I think I'm holding an awkward position here:

* massive feature change (i.e. not using triples, URIs etc); or rather focus change: become a data sharing in the Web community, not a doing stuff with triples community
* cautious
Re: destabilizing core technologies: was Re: An RDF wishlist
Dan and all, hello.

On 2010 Jul 1, at 11:30, Dan Brickley wrote: Yes, you are right. It is fair and interesting to bring up this analogy and associated history. SGML even got a namecheck in the original announcement of the Web, [...] So, I think I'm holding an awkward position here: * massive feature change (ie. not using triples, URIs etc); or rather focus change: become a data sharing in the Web community not a doing stuff with triples community * cautious feature change (tweaking the triple model doesn't have many big wins; it's simple already)

This is a very thought-provoking argument. There's a third position: no feature change, but change our goals.

Myself, I finally 'got' SGML (and by implication XML) when I struggled through the HyTime spec (anyone remember that?), and finally really properly understood the extent to which it was the _abstraction_ of structured information that was the point of it, and nothing to do with what the tags looked like, or which parser you were using, or any of that mess. It may be HyTime that the RDF data model is analogous to, and not XML after all. Because of the many-fold annoyingnesses of RDF, it's far too easy for us to be distracted by issues which are to some greater or lesser extent syntactical (which arguably includes the 'literals as subjects' argument).

One big point about RDF, for me, is that if you can prove (perhaps only to yourself) that you can round-trip information from its source, into triples, and back, then you have proven something useful and interesting about the RDF model in question, _even if_ you never actually publish those triples. If you can serialise those (abstract) triples into an EXIF header, a FITS header (my game), JSON, XML, HTML, or whatever, then you completely know the semantics of that serialisation, and that there isn't anything missing.
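Norman's round-trip test can be made concrete: lower a record into triples and lift it back; if the lifted record equals the original, the triple model captured everything. A toy illustration in Python (the record shape, subject URN, and field names are invented; real cases like FITS or EXIF headers need richer handling of types and repeated keys):

```python
# Toy version of the round-trip test: source record -> triples -> record.
def to_triples(subject, record):
    """Lower a flat record into (subject, predicate, object) triples."""
    return {(subject, key, value) for key, value in record.items()}

def from_triples(subject, triples):
    """Lift a record back out of the triples for one subject."""
    return {p: o for s, p, o in triples if s == subject}

record = {"title": "M51", "exposure": "300s", "filter": "H-alpha"}
triples = to_triples("urn:obs:42", record)
assert from_triples("urn:obs:42", triples) == record  # round-trip holds
```

If the final assertion holds for your real data, you have shown, in Norman's sense, that the triple model loses nothing, even if you then publish only JSON or RDFa.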
The analogy with HyTime is therefore that it helps the understanding or discipline of a designer or standards author, not that it's anything anyone else would want to see. (HyTime was addressed to DTD authors, not SGML authors.)

Thus I can imagine setting up a service which stores its 'knowledge' as RDF-style triples (because that's a good fit to the heterogeneous nature of the information in question), but makes it available, and documented, only as JSON or microformats/RDFa. It's fundamentally an RDF application, but there isn't a single triple visible from the outside.

Thus we might not have to change the spec, just drop our expectation that anyone who isn't an 'information architect'[1] will ever read it. Changing who we regard as the 'users' might also release us from some obsessing about creating user-accessible tools.

Best wishes,

Norman

[1] O! Hats off!

--
Norman Gray : http://nxg.me.uk