Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Jul 9, 2009, at 2:25 AM, Hugh Glaser h...@ecs.soton.ac.uk wrote: On 09/07/2009 00:38, Toby A Inkster t...@g5n.co.uk wrote: snip hash URI discussion, quoted in full below Hash URIs are very valuable in linked data, precisely *because* they can't be directly requested from a server - they allow us to bypass the whole HTTP 303 issue. Mind you, it does mean that you should make sure that you don't put too many LD URIs in one document. If dbpedia decided to represent all the RDF in one document, and then use hash URIs, it would be somewhat problematic. Could you explain why??? -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 9 Jul 2009, at 07:44, Juan Sequeda wrote: On Jul 9, 2009, at 2:25 AM, Hugh Glaser h...@ecs.soton.ac.uk wrote: Mind you, it does mean that you should make sure that you don't put too many LD URIs in one document. If dbpedia decided to represent all the RDF in one document, and then use hash URIs, it would be somewhat problematic. Could you explain why??? The very practical problem of file sizes. Your hash URIs can of course be distributed across a collection of files. e.g. http://example.com/~alice/foaf.rdf#me http://example.com/~bob/foaf.rdf#me http://example.com/~carol/foaf.rdf#me -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
In message EMEW3|b88ea541556c1ff93cb7842c018e2d08l681Q702hg|ecs.soton.ac.uk|C21D%hg @ecs.soton.ac.uk, Hugh Glaser h...@ecs.soton.ac.uk writes Hash URIs are very valuable in linked data, precisely *because* they can't be directly requested from a server - they allow us to bypass the whole HTTP 303 issue. Mind you, it does mean that you should make sure that you don't put too many LD URIs in one document. If dbpedia decided to represent all the RDF in one document, and then use hash URIs, it would be somewhat problematic. One aspect of this that puzzles me is how you do the "deliver a human-readable or machine-processable version depending on the Accept header" trick when the actual resource is a single RDF document containing hash-referenced assertions. Richard -- Richard Light
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hashed URIs can bring other problems. For example, if I have a service http://mydata.org/uri that takes a URI and returns what it knows about the thing identified by that URI, and I pass it a hash URI, e.g. http://ci.nii.ac.jp/naid/110006281382#article , my browser will trim #article and send http://ci.nii.ac.jp/naid/110006281382 to my client. But http://ci.nii.ac.jp/naid/110006281382 is NOT the URI of interest (it's a web page; http://ci.nii.ac.jp/naid/110006281382#article identifies the article itself). So, I have to fuss with URL encoding and Apache mod_rewrite to get http://ci.nii.ac.jp/naid/110006281382#article past the browser and web server and to my client. It's little things like this that make life *cough* interesting. Regards Rod - Roderic Page Professor of Taxonomy DEEB, FBLS Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK Email: r.p...@bio.gla.ac.uk Tel: +44 141 330 4778 Fax: +44 141 330 2792 AIM: rodpage1...@aim.com Facebook: http://www.facebook.com/profile.php?id=1112517192 Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
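As a rough illustration of the workaround Rod describes, the hash URI has to be percent-encoded before it goes into a query string, so that the fragment survives the trip through the browser. A minimal PHP sketch (the uri= parameter name is hypothetical, not part of Rod's actual service):

    <?php
    // Percent-encode a hash URI before passing it as a query parameter,
    // so that "#article" is not stripped client-side as a fragment.
    $uri = 'http://ci.nii.ac.jp/naid/110006281382#article';
    echo 'http://mydata.org/uri?uri=' . rawurlencode($uri);
    // prints: http://mydata.org/uri?uri=http%3A%2F%2Fci.nii.ac.jp%2Fnaid%2F110006281382%23article
    ?>

The receiving script can then urldecode() the parameter and see the full URI, fragment included.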
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 09/07/2009 07:56, Peter Ansell ansell.pe...@gmail.com wrote: 2009/7/9 Juan Sequeda juanfeder...@gmail.com: On Jul 9, 2009, at 2:25 AM, Hugh Glaser h...@ecs.soton.ac.uk wrote: snip hash URI comments Mind you, it does mean that you should make sure that you don't put too many LD URIs in one document. If dbpedia decided to represent all the RDF in one document, and then use hash URIs, it would be somewhat problematic. Could you explain why??? Does it seem reasonable to have to trawl through millions (or billions) of RDF triples, resolved from a large database that used one base URI with fragment identifiers for everything else, when 100 specific RDF triples in a compact document might have been all you needed to see? Peter As a concrete example: For dblp we split the data into year models before asserting into the triplestore, so we can serve RDF for each URI, by sort of DESCRIBEing. Paper: http://dblp.rkbexplorer.com/id/journals/expert/ShadboltGGHS04 comes from a model file: http://dblp.rkbexplorer.com/models/dblp-publications-2004.rdf which is 155MB. Using hash URIs would require files of that size to be served for every access - although if we were actually doing it that way we would of course change our model file granularity to avoid it. So there is both possible network and processing overhead, which can be got wrong. In fact large foaf files give you quite a lot of extra stuff, if all you wanted was some personal details. When you want to know about timbl, if you only wanted his blog address you don't necessarily want to download and process 30-odd KB of RDF, much of it details of the people he knows (such as Tom Ilube's URI). Just something to be aware of when serving linked data as hash URIs. And to add something else to the mix: this is another reason semantic sitemaps are so important for search engines like Sindice. Sindice can index our model file, but on receiving a request for a URI in it, without the sitemap, all it could easily do would be to point the requester at the 155MB model file. Because of the sitemap, it can much more easily work out for itself what it needs to know about the URI to point the user at the linked data URI - all without spidering our whole triplestore, which would be unacceptable. Ah, the rich tapestry of life that is linked data! Best Hugh
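For readers who haven't met them, a semantic sitemap entry for a setup like the one Hugh describes would look roughly like this. This is a hedged sketch from memory of the Semantic Sitemaps extension draft - the sc: element names and namespace should be checked against the spec before use:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:sc="http://sw.deri.org/2007/07/sitemapextension/scschema.xsd">
      <sc:dataset>
        <sc:datasetLabel>DBLP publications, 2004 slice</sc:datasetLabel>
        <!-- URIs under this prefix resolve as linked data... -->
        <sc:linkedDataPrefix>http://dblp.rkbexplorer.com/id/</sc:linkedDataPrefix>
        <!-- ...and the whole slice can be fetched in one go, instead of spidering -->
        <sc:dataDumpLocation>http://dblp.rkbexplorer.com/models/dblp-publications-2004.rdf</sc:dataDumpLocation>
      </sc:dataset>
    </urlset>

This is what lets an indexer like Sindice map an individual URI to the right dump without crawling the whole triplestore.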
Re: Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
(Discussing 303-redirect services, such as http://t-d-b.org/ or http://thing-described-by.org/ ) On Thu 09/07/09 6:12 AM , Olivier Rossel olivier.ros...@gmail.com sent: Externalizing the 303 feature is a good idea, imo. But such a service should also handle the content negotiation feature. So the 303 may redirect to different URLs depending on the content negotiated. This makes the service more complex internally but provides a very relevant service for RDF publishers (i.e. they just have to take care of one config on their server: MIME types). That sounds like an interesting idea. It would require that URIs be registered in advance with the t-d-b.org server, so that it would know where to forward, depending on the Accept header (content negotiation), in a similar way that URIs are registered in advance with the purl.org server. Correction: come to think of it, that would not be necessary. Instead the t-d-b.org server could be configured to have one or more standard recipes available for converting a generic URI to a specific URI depending on the content type requested. For example t-d-b.org might have a recipe called conneg1 such that, given a GET request for URI http://t-d-b.org/conneg1?http://example/mydata if the Accept header indicates that RDF is preferred, then the server could 303-redirect to http://example/mydata.rdf (Note that the owner of that URI would still have to ensure that the MIME type is served correctly.) But if the Accept header indicates that HTML is preferred, then the server could 303-redirect to http://example/mydata.html Another recipe, conneg2, might use a URI pattern, such that {} in the target URI is replaced by rdf or html. So for example, given a GET request for URI http://t-d-b.org/conneg2?http://example/{}/mydata if the Accept header indicates that RDF is preferred, then the server could 303-redirect to http://example/rdf/mydata But if the Accept header indicates that HTML is preferred, then the server could 303-redirect to http://example/html/mydata Note that a key advantage of this recipe-based approach is that it does not require the target URIs to be registered with the server. This is beneficial in three ways: - Easier to implement the server. - Easier for URI owners to use. - The initial HTTP request can be safely optimized away by smart clients, as described here: http://thing-described-by.org/#optimizing What kinds of recipes would be most useful to folks? Plus managing the redirect is as easy as changing the xml:base of their RDF/XML. That sounds like a somewhat different design than I sketched above. Can you describe in more detail what you mean, with an example? What would the t-d-b.org server do, and how would it know to do it? David Booth
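To make the conneg1 recipe concrete, the wire exchange would look something like this (headers illustrative; the redirect target follows the recipe described above):

    GET /conneg1?http://example/mydata HTTP/1.1
    Host: t-d-b.org
    Accept: application/rdf+xml

    HTTP/1.1 303 See Other
    Location: http://example/mydata.rdf

A client sending Accept: text/html would instead receive Location: http://example/mydata.html.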
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
I like this! :) However, some people will still be concerned about naming their resources under a domain that is not theirs. That is not only a matter of URI-prettiness, but also of relying on an external service, which may cease to exist tomorrow. However, this could easily be solved. All we would need is a PHP script that would behave just like t-d-b.org -- PHP having the advantage of working without any .htaccess fiddling, at least in most cases. So I could basically achieve the same thing with a URI like http://example.com/tdb.php?mydata.rdf or http://example.com/a_path/tdb.php?/another_path/mydata.rdf Of course, this would not prevent the use of specific recipes, like http://example.com/tdb1.php?mydata or http://example.com/tdb2.php?/{}/mydata pa Le 09/07/2009 14:43, David Booth wrote: snip recipe-based 303-redirect proposal, quoted in full above
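A minimal sketch of such a script, under the assumptions above (tdb.php and its URL layout are Pierre-Antoine's hypothetical example, not an existing implementation; a real version would also content-negotiate, as in David's recipes):

    <?php
    // tdb.php - self-hosted 303 redirector in the style of t-d-b.org.
    // http://example.com/tdb.php?mydata.rdf issues a 303 redirect to
    // http://example.com/mydata.rdf
    $doc = $_SERVER['QUERY_STRING'];
    if ($doc === '') {
        header('HTTP/1.1 400 Bad Request');
        exit('usage: tdb.php?<document-URI-relative-to-this-directory>');
    }
    // Build an absolute target URI from this script's own location,
    // as required for the Location header by RFC 2616. A leading slash
    // (as in tdb.php?/another_path/mydata.rdf) is taken as root-relative.
    $dir = rtrim(dirname($_SERVER['SCRIPT_NAME']), '/');
    $target = ($doc[0] === '/') ? $doc : $dir . '/' . $doc;
    header('HTTP/1.1 303 See Other');
    header('Location: http://' . $_SERVER['HTTP_HOST'] . $target);
    ?>

A recipe variant like tdb2.php?/{}/mydata would be the same skeleton with a string substitution on the query string before redirecting.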
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Thu, Jul 9, 2009 at 10:46 AM, Pierre-Antoine Champin swlists-040...@champin.net wrote: However, some people will still be concerned about naming their resources under a domain that is not theirs. That is not only a matter of URI-prettiness, but also of relying on an external service, which may cease to exist tomorrow. I'm switching uridirector.praxisbridge.org[1] to optionally include Accept headers in choosing a template. That should give people a quick low-effort[2] way to get up and running without having to warp their URIs to match a third-party service (and without having to commit to using the service once another option is available). It seems pretty clear that people should (a) only mint URLs in domains they control and (b) maybe think about including a sub-domain in the URIs for specific data sets (and thereby get the power of the domain name system on their side when they need to move the data later on). Note that following (a) doesn't mean you need to run your own server; it's sufficient to just register the domain. Smart-ish redirectors (third party or local) will then allow you a lot of flexibility in choosing exactly where the data is located. -cks [1] Like purl or t-d-b, only with host name header recognition so you can CNAME your own domains over and maintain complete control over your URIs, see previous email: http://lists.w3.org/Archives/Public/public-lod/2009Jul/0072.html It's not quite fully baked, but it's getting there. [2] You need to know what a CNAME is, and have access to your DNS configuration. But you're not minting URLs in domains you don't have administrative control over, are you? -- Christopher St. John c...@praxisbridge.com http://praxisbridge.com http://artofsystems.blogspot.com
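For anyone unfamiliar with the DNS side of this, the CNAME trick in [1] amounts to one zone-file line (names illustrative):

    ; alias a data subdomain you own onto the third-party redirector;
    ; the URIs stay under example.com and can be re-pointed later
    data.example.com.    IN    CNAME    uridirector.praxisbridge.org.

The redirector then inspects the Host header of each incoming request to pick the redirect template registered for your domain.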
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Externalizing the 303 feature is a good idea, imo. But such a service should also handle the content negotiation feature. So the 303 may redirect to different URLs depending on the content negotiated. This makes the service more complex internally but provides a very relevant service for RDF publishers (i.e. they just have to take care of one config on their server: MIME types). Plus managing the redirect is as easy as changing the xml:base of their RDF/XML. On Wednesday, July 8, 2009, David Booth da...@dbooth.org wrote: snip 303-redirect service discussion, quoted in full below
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
DNS trickery is the ultimate step for a fully flexible architecture. Unfortunately it requires some admin rights over your own domain - something uber difficult in companies. A workaround would be to create a top-level domain, something like .uris (or more realistically cooluris.net), with an automatic delegation of its subdomains to official owners of an existing domain name. For example the owner of datao.net would be able to get full access to the subdomain datao.net.uris (or datao.net.cooluris.net) for its URIs. And all the URIs of his or her RDF data would be in that domain. Then he would either CNAME it so it resolves to t-d-b.org, or run his/her own 303 system. And that service would then 303 to the real web servers of datao.net This would make a clean separation of contexts between URIs and URLs. I advocate the creation of a .uri top-level domain. That would be the domain of semantic data. But because a top-level domain is not something easy to get, we could consider something more classical, maybe .cooluris.net The crucial point is that this domain will delegate its subdomains to official owners of a real domain name (i.e. datao.net can claim full control of datao.net.cooluris.net, for example). Given that these services (CNAME to 303 + 303 to web data) have default behaviour that makes things simple for beginners, we would have an efficient infrastructure. On Thursday, July 9, 2009, Christopher St John ckstj...@gmail.com wrote: snip CNAME redirector discussion, quoted in full above
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Google has just changed the wording of the documentation: http://knol.google.com/k/google-rich-snippets/google-rich-snippets/32la2chf8l79m/1# The mention of the cloaking risk has been removed. While this is not final clearance, it is a nice sign that our concerns are heard. Best Martin Martin Hepp (UniBW) wrote: Dear all: FYI - I am in contact with Google as for the clarification of what kind of empty div/span elements are considered acceptable in the context of RDFa. It may take a few days to get an official statement. Just so that you know it is being taken care of... Martin Mark Birbeck wrote: Hi Martin, b) download RDFa snippet that just represents the RDF/XML content (i.e. such that it does not have to be consolidated with the presentation-level part of the Web page). By coincidence, I just read this: Hidden div's -- don't do it! It can be tempting to add all the content relevant for a rich snippet in one place on the page, mark it up, and then hide the entire block of text using CSS or other techniques. Don't do this! Mark up the content where it already exists. Google will not show content from hidden div's in Rich Snippets, and worse, this can be considered cloaking by Google's spam detection systems. [1] Regards, Mark [1] http://knol.google.com/k/google-rich-snippets/google-rich-snippets/32la2chf8l79m/1# -- -- martin hepp e-business web science research group universitaet der bundeswehr muenchen e-mail: mh...@computer.org phone: +49-(0)89-6004-4217 fax: +49-(0)89-6004-4620 www: http://www.unibw.de/ebusiness/ (group) http://www.heppnetz.de/ (personal) skype: mfhepp twitter: mfhepp Check out the GoodRelations vocabulary for E-Commerce on the Web of Data! Webcast: http://www.heppnetz.de/projects/goodrelations/webcast/ Talk at the Semantic Technology Conference 2009: Semantic Web-based E-Commerce: The GoodRelations Ontology http://tinyurl.com/semtech-hepp Tool for registering your business: http://www.ebusiness-unibw.org/tools/goodrelations-annotator/ Overview article on Semantic Universe: http://tinyurl.com/goodrelations-universe Project page and resources for developers: http://purl.org/goodrelations/ Tutorial materials: Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009
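As a small illustration of the "mark up the content where it already exists" advice, RDFa attributes ride on the visible HTML rather than on a hidden duplicate block (FOAF is used here purely for familiarity; a GoodRelations offer would be marked up the same way):

    <!-- the user-visible text itself carries the RDFa; nothing is hidden -->
    <div xmlns:foaf="http://xmlns.com/foaf/0.1/" about="#alice" typeof="foaf:Person">
      <span property="foaf:name">Alice Example</span> blogs at
      <a rel="foaf:weblog" href="http://example.com/blog/">example.com/blog</a>.
    </div>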
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Jul 5, 2009, at 10:16 AM, Hugh Glaser wrote: OK, I'll have a go :-) Why did I think this would be fun to do on a sunny Sunday morning that has turned into afternoon? Here are the instructions: And here is why I cannot follow them. 1. Create a web-accessible directory, let's say foobar, with all your .rdf, .ttl, .ntriples and .html files in it. 2. Copy lodpub.php and path.php into it. OK so far... 3. Access path.php from your web server. I can see this file, but I cannot access it. Attempting to do so gives me the message Can not open file .htaccess Reason: Could not download file (403:HTTP/1.1 403 forbidden) I have checked with my system admin, and they tell me, Yes that is correct. You cannot access your .htaccess file. You cannot modify it or paste anything into it. Only we have access to it. No, we will not change this policy for you, no matter how important you think you are. Although they do not say it openly, the implicit message is, we don't give a damn what the W3C thinks you ought to be able to do on our website. Now, has anyone got any OTHER ideas? An idea that does not involve changing any actual code, and so can be done using a text editor on an HTML text file, would be a very good option. Pat Hayes snip rest of Hugh's instructions and explanation, quoted in full below
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Pat, I have checked with my system admin, and they tell me, Yes that is correct. You cannot access your .htaccess file. You cannot modify it or paste anything into it. Only we have access to it. No, we will not change this policy for you, no matter how important you think you are. Although they do not say it openly, the implicit message is, we don't give a damn what the W3C thinks you ought to be able to do on our website. I agree that this seems to be getting like Groundhog Day. :) The original point of this thread seemed to me to be saying that if .htaccess is the key to the semantic web, then it's never going to happen. I.e., .htaccess is a major bottleneck. The initial discussion around that theme was then followed by all sorts of discussions about how people could create scripts that would choose between different files, and deliver the correct one to the user. But the fact remained -- as you rightly point out here -- that you still need to modify .htaccess. Now, has anyone got any OTHER ideas? An idea that does not involve changing any actual code, and so can be done using a text editor on an HTML text file, would be a very good option. :) Did I mention RDFa? Regards, Mark -- Mark Birbeck, webBackplane mark.birb...@webbackplane.com http://webBackplane.com/mark-birbeck webBackplane is a trading name of Backplane Ltd. (company number 05972288, registered office: 2nd Floor, 69/85 Tabernacle Street, London, EC2A 4RR)
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Mark, disclaimer: I have nothing against the RDFa solution; I just don't think that one size fits all :) ok, the solutions proposed here (by myself and others) still involve editing the .htaccess. However, compared to configuring HTTP redirections using mod_rewrite, they have two advantages: - they are shorter and hopefully easier to adapt - they are more likely to be allowed for end users So I think it is progress. Furthermore, some of the recipes may work without even touching the .htaccess file, provided that - executable files are automatically considered as CGI scripts - index.php is automatically considered as a directory index One size does not fit all; that is why we should provide several simple recipes in which people may find the one that works for them. This is why I'm asking (again) IIS users and other-httpd users to provide non-Apache recipes as well. Of course, the publish it in RDFa recipe is a perfectly legal one! pa Le 08/07/2009 15:13, Mark Birbeck wrote: snip (quoted in full above)
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Wed, 2009-07-08 at 15:13 +0100, Mark Birbeck wrote: The original point of this thread seemed to me to be saying that if .htaccess is the key to the semantic web, then it's never going to happen. It simply isn't the key to the semantic web though. .htaccess is a simple way to configure Apache to do interesting things. It happens to give you a lot of power in deciding how requests for URLs should be translated into responses of data. If you have hosting which allows you such advanced control over your settings, and you can create nicer URLs, then by all means do so - and not just for RDF, but for all your URLs. It's a Good Thing to do, and in my opinion, worth switching hosts to achieve. But all that isn't necessary to publish linked data. If you own example.com, you can upload foaf.rdf and give yourself a URI like: http://example.com/foaf.rdf#alice (Or foaf.ttl, foaf.xhtml, whatever.) No, that's not as elegant as http://example.com/alice with a content-negotiated 303 redirect to representations in various formats, but it does work, and it won't break anything. Let's not blow this all out of proportion. -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
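For completeness, the file behind such a URI can be tiny and needs no server support at all for the fragment - a minimal sketch in Turtle, with illustrative names:

    # foaf.ttl - uploaded as a plain static file; #alice is resolved
    # by the client against this document, never sent to the server
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .

    <#alice> a foaf:Person ;
        foaf:name "Alice" ;
        foaf:homepage <http://example.com/> .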
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Wednesday, July 8, 2009, Toby Inkster t...@g5n.co.uk wrote: On Wed, 2009-07-08 at 15:13 +0100, Mark Birbeck wrote: The original point of this thread seemed to me to be saying that if .htaccess is the key to the semantic web, then it's never going to happen. It simply isn't the key to the semantic web though. .htaccess is a simple way to configure Apache to do interesting things. It happens to give you a lot of power in deciding how requests for URLs should be translated into responses of data. If you have hosting which allows you such advanced control over your settings, and you can create nicer URLs, then by all means do so - and not just for RDF, but for all your URLs. It's a Good Thing to do, and in my opinion, worth switching hosts to achieve. But all that isn't necessary to publish linked data. If you own example.com, you can upload foaf.rdf and give yourself a URI like: http://example.com/foaf.rdf#alice (Or foaf.ttl, foaf.xhtml, whatever.) This just works and is how the HTML web grew. Write a document and save it into a public space. Fancy stuff like pretty URIs needs more work but is not at all necessary for linked data or the semantic web. Let's not blow this all out of proportion. Hear hear! -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Wed, 2009-07-08 at 15:50 +0100, Pierre-Antoine Champin wrote: [ . . . ] ok, the solutions proposed here (by myself and others) still involve editing the .htaccess. Once again, use of a 303-redirect service such as http://thing-described-by.org/ or http://t-d-b.org/ does not require *any* configuration or .htaccess editing. It does not address the problem of setting the content type correctly, but it *does* provide an easy way to generate 303 redirects, in conformance with Cool URIs for the Semantic Web: http://www.w3.org/TR/cooluris/#r303gendocument Hmm, I thought the use of a 303-redirect service was mentioned in Cool URIs for the Semantic Web, but in looking back, I see it was in Best Practice Recipes for Publishing RDF Vocabularies: http://www.w3.org/TR/swbp-vocab-pub/#redirect Maybe it should be mentioned in a future version of the Cool URIs document as well. -- David Booth, Ph.D. Cleveland Clinic (contractor) Opinions expressed herein are those of the author and do not necessarily reflect those of Cleveland Clinic.
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Sorry to hear that, Pat. On 08/07/2009 14:51, Pat Hayes pha...@ihmc.us wrote: snip Hugh's instructions 3. Access path.php from your web server. I can see this file, but I cannot access it. Attempting to do so gives me the message Can not open file .htaccess Reason: Could not download file (403:HTTP/1.1 403 forbidden) Just a clarification, which probably doesn't help you, but just might. When you try to access path.php, you should either get some text in which the string htaccess appears (success), or some indication that you cannot access path.php or run PHP. I see no reason why you would get the message above trying to access path.php. (Unless somehow the attempt to run PHP has resulted in an attempt to access .htaccess because of a local issue, in which case the system is badly configured in its error reporting.) I guess that what you have seen is the result of creating a file called .htaccess on your local machine, and then trying to upload it to the server, using some sort of web-based upload facility? Best Hugh snip rest of Pat's message and the original instructions, quoted in full elsewhere in this thread
Re: Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Wed 08/07/09 5:08 PM , Olivier Rossel olivier.ros...@gmail.com sent: Do you mean that all dereferenceable URIs of an RDF document should have their domain name end with t-d-b.org, so their resolution leads to the TDB server, which redirects to the final location? No, I'm not suggesting that *all* dereferenceable RDF URIs should use t-d-b.org. I'm just pointing out that it is an alternative if you cannot configure your own server to do 303 redirects. Using it does require putting "http://t-d-b.org?" at the beginning of your URI, so if you do not want to do that then you should use a different approach. To be clear, if you use this approach, then instead of writing a URI such as http://example/mydata.rdf you would write it as http://t-d-b.org?http://example/mydata.rdf and if that URI is dereferenced, the 303-redirect service will automatically return a 303 redirect to http://example/mydata.rdf David Booth On Wednesday, July 8, 2009, David Booth da...@dbooth.org wrote: snip 303-redirect service discussion, quoted in full above
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 8 Jul 2009, at 19:58, Seth Russell wrote: Is it not true that everything past the hash (#alice) is not transmitted back to the server when a browser clicks on a hyperlink? If that is true, then the server would not be able to serve anything different if a browser clicked upon http://example.com/foaf.rdf or if they clicked upon http://example.com/foaf.rdf#alice . Indeed - the server doesn't see the fragment. If that is true, and it probably isn't, then is not the Semantic Web crippled from using that technique to distinguish between resources and at the same time hyperlinking between those different resources? Not at all. Is the web of documents crippled because the server can't distinguish between requests for http://example.com/document.html and http://example.com/document.html#part2 ? Of course it isn't - the server doesn't need to distinguish between them - it serves up the same web page either way and lets the user agent distinguish. Hash URIs are very valuable in linked data, precisely *because* they can't be directly requested from a server - they allow us to bypass the whole HTTP 303 issue. -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
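To see this on the wire: a client dereferencing http://example.com/foaf.rdf#alice keeps the fragment to itself and sends only (headers illustrative):

    GET /foaf.rdf HTTP/1.1
    Host: example.com
    Accept: application/rdf+xml

The #alice part never leaves the user agent, which resolves it against whatever document comes back.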
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 09/07/2009 00:38, Toby A Inkster t...@g5n.co.uk wrote: snip hash URI discussion, quoted in full above Hash URIs are very valuable in linked data, precisely *because* they can't be directly requested from a server - they allow us to bypass the whole HTTP 303 issue. Mind you, it does mean that you should make sure that you don't put too many LD URIs in one document. If dbpedia decided to represent all the RDF in one document, and then use hash URIs, it would be somewhat problematic. -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Martin, all, I would like to point to something that might be useful for RDF data publishing. The ReDeFer RDF2HTML service (http://rhizomik.net/redefer/) renders input RDF/XML data as HTML for user interaction (e.g. as used in http://rhizomik.net/rhizomer/). Now, it also embeds RDFa that facilitates retrieving the source RDF back. I've tested it with a pair of GoodRelations examples: http://rhizomik.net/redefer-services/rdf2html?rdf=http://www.heppnetz.de/projects/goodrelations/minimalExampleGoodRelations.owl http://rhizomik.net/redefer-services/rdf2html?rdf=http://www.heppnetz.de/projects/goodrelations/goodrelationsExamplesPrimerFinalOWL.owl I've been able to check that it works for these examples by comparing the triples generated by RDFa Distiller and RDFa Bookmarklet from the previous HTML+RDFa pages to those generated by any23 and Triplr from the original OWL files. The generated HTML+RDFa can then be used in order to publish RDF just by cut and paste, e.g. using an online editor like FCKEditor. This has been the procedure followed in order to publish the RDF in http://rhizomik.net/redefer/rdf2html/minimalExampleGoodRelations/ The HTML+RDFa view might be customised using CSS and made more usable if the source RDF contains rdfs:labels for the involved resources, which are used instead of the last part of the URIs if available. In any case, if it is not to be shown to the user, it is easier to just model triples using hidden spans instead of using this service... Best regards, Roberto García http://rhizomik.net/~roberto PS: Caution, this is work in progress. Feedback appreciated :-) On Wed, Jul 8, 2009 at 12:59 PM, Martin Hepp (UniBW) martin.h...@ebusiness-unibw.org wrote: snip (quoted in full above)
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
In message 4a50ad9f.9030...@champin.net, Pierre-Antoine Champin swlists-040...@champin.net writes PS: any IIS user volunteering to translate those recipes to IIS configuration? I have implemented the 303 redirection strategy in IIS, but using a custom 404 page not found error handler. Is that relevant to this discussion? Richard -- Richard Light
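For the recipe collection, the IIS 7 equivalent of Apache's ErrorDocument trick is a web.config along these lines. This is a hedged sketch, not Richard's actual configuration: the handler path is illustrative and the syntax should be checked against the IIS documentation (on IIS 6 the same effect is reached through the Custom Errors tab in the management console):

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
      <system.webServer>
        <httpErrors errorMode="Custom">
          <remove statusCode="404" />
          <!-- hand every 404 to a script that inspects Accept and issues the 303 -->
          <error statusCode="404" path="/redirect303.aspx" responseMode="ExecuteURL" />
        </httpErrors>
      </system.webServer>
    </configuration>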
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Le 05/07/2009 13:54, Toby A Inkster wrote: On 5 Jul 2009, at 01:52, Pierre-Antoine Champin wrote: I guess a PHP version would not even require that .htaccess, but sorry, I'm not fluent in PHP ;) The situation with PHP should be much the same, though I suppose web hosts might be more likely to set index.php in the DirectoryIndex as a default. this was my intuition as well. However, I actually have to add the DirectoryIndex directive to have index.php taken into account on my server. PHP has another advantage over CGI (and WSGI): you can usually run a PHP script from any directory of your hosted space, while CGI scripts are usually confined to a special directory. Anyway, I've done a quick port of your code to PHP. (I stripped out your content negotiation code and replaced it with my own, as I figured it would be faster to paste in the ConNeg class I'm familiar with rather than do a line-by-line port of the Python to PHP.) Here it is, same license - LGPL 3. great :) We should start a repository somewhere of useful code for serving linked data. I agree. I note that your implementation uses absolute URIs for redirection. This has two main advantages over mine: - it complies with the RFC (I had missed that part ;) - it still works when you append path elements after the script name (which messes up the relative URI in my script) pa
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
We should start a repository somewhere of useful code for serving linked data. I agree. (I raise my hand) If I am not wrong, this thread has produced 4 different implementations for serving linked data. I mentioned before that I wanted to post this on linkeddata.org. I will work out the logistics with Tom Heath, so we can upload the code examples, hopefully this week! Juan Sequeda
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
OK, I'll have a go :-) Why did I think this would be fun to do on a sunny Sunday morning that has turned into afternoon? Here are the instructions: 1. Create a web-accessible directory, let's say foobar, with all your .rdf, .ttl, .ntriples and .html files in it. 2. Copy lodpub.php and path.php into it. 3. Access path.php from your web server. 4. Follow the instruction to paste that text into .htaccess 5. You can remove path.php if you like, it was only there to help you get the .htaccess right. That should be it. The above text and files are at http://www.rkbexplorer.com/blog/?p=11 Of course, I expect that you can tell me all sorts of problems/better ways, but I am hoping it works for many. Some explanation: We use a different method, and I have tried to extract the essence, and keep the code very simple. We trap all 404 (File not Found) errors in the directory, and then any requests coming in for non-existent files will generate a 303 with an extension added, depending on the Accept header. Note that you probably need the leading / followed by the full path from the domain root, otherwise it will just print out the text lodpub.php; (That is not what the Apache specs seem to say, but it is what seems to happen). If you get Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request., then it means that the web server is not finding your ErrorDocument. Put the file path.php in the same directory and point your browser at it - this will tell you what the path should be. Note that the httpd.conf (in /etc/httpd/conf) may not let you override, if your admins have tied things down really tight. Mine says: AllowOverride All Finally, at the moment, note that I think that the Apache default does not put the correct MIME type on .rdf files, but that is a separate issue, and it makes no difference to whether the 303 happens. Best Hugh On 05/07/2009 01:52, Pierre-Antoine Champin swlists-040...@champin.net wrote: Le 03/07/2009 15:14, Danny Ayers wrote: 2009/7/2 Bill Roberts b...@swirrl.com: I thought I'd give the .htaccess approach a try, to see what's involved in actually setting it up. I'm no expert on Apache, but I know the basics of how it works, I've got full access to a web server and I can read the online Apache documentation as well as the next person. I've tried similar, even stuff using PURLs - incredibly difficult to get right. (My downtime overrides all, so I'm not even sure if I got it right in the end) I really think we need a (copy & paste) cheat sheet. Volunteers? (raising my hand) :)* Here is a quick Python script that makes it easier (if not completely immediate). It may still require a one-liner .htaccess, but one that (I think) is authorized by most webmasters. I guess a PHP version would not even require that .htaccess, but sorry, I'm not fluent in PHP ;) So, assuming you want to publish a vocabulary with an RDF and an HTML description at http://example.com/mydir/myvoc, you need to: 1. Make `myvoc` a directory at the place where your HTTP server will serve it at the desired URI. 2. Copy the script in this directory as 'index.cgi' (or 'index.wsgi' if your server has WSGI support). 3. In the same directory, put two files named 'index.html' and 'index.rdf' If it does not work now (it didn't for me), you have to tell your HTTP server that the directory index is index.wsgi. In Apache, this is done by creating (if not present) a `.htaccess` file in the `myvoc` directory, and adding the following line:: DirectoryIndex index.cgi (or `index.wsgi`, accordingly) There are more docs in the script itself. I think the more recipes (including for other httpds) we can provide with the script, the more useful it will be. So feel free to propose other ones. enjoy pa attachment: path.php attachment: lodpub.php
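For readers who want to see the shape of the thing before downloading it: the .htaccess that path.php helps you write is a single ErrorDocument line, and lodpub.php is in essence a 404 handler that content-negotiates a 303. The sketch below is a reconstruction from Hugh's description above, not his actual code - the directory path and the Accept test are illustrative:

    # .htaccess - the one line path.php helps you get right
    ErrorDocument 404 /foobar/lodpub.php

    <?php
    // lodpub.php (sketch) - Apache runs this for every 404 in the directory;
    // the originally requested path arrives in $_SERVER['REDIRECT_URL'].
    $request = $_SERVER['REDIRECT_URL'];
    $accept  = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';

    // Crude conneg: RDF clients get the .rdf file, everyone else the .html
    $ext = (strpos($accept, 'application/rdf+xml') !== false) ? '.rdf' : '.html';

    header('HTTP/1.1 303 See Other');
    header('Location: http://' . $_SERVER['HTTP_HOST'] . $request . $ext);
    ?>

So a request for http://example.com/foobar/mydoc with an RDF-preferring Accept header is 303-redirected to http://example.com/foobar/mydoc.rdf, which matches the explanation above.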
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
yay!! more easy-lod goodness! more incentive to get this up on linkeddata.org this week! do we have any volunteers for ruby? Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org
On Sun, Jul 5, 2009 at 5:16 PM, Hugh Glaser h...@ecs.soton.ac.uk wrote: OK, I'll have a go :-) [...]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 03/07/2009 15:14, Danny Ayers wrote: 2009/7/2 Bill Roberts b...@swirrl.com: I thought I'd give the .htaccess approach a try, to see what's involved in actually setting it up. I'm no expert on Apache, but I know the basics of how it works, I've got full access to a web server and I can read the online Apache documentation as well as the next person. I've tried similar, even stuff using PURLs - incredibly difficult to get right. (My downtime overrides all, so I'm not even sure if I got it right in the end) I really think we need a (copy-paste) cheat sheet. Volunteers? (raising my hand) :)* Here is a quick Python script that makes it easier (if not completely immediate). It may still require a one-line .htaccess, but one that (I think) is authorized by most webmasters. I guess a PHP version would not even require that .htaccess, but sorry, I'm not fluent in PHP ;) So, assuming you want to publish a vocabulary with an RDF and an HTML description at http://example.com/mydir/myvoc, you need to: 1. Make `myvoc` a directory at the place where your HTTP server will serve it at the desired URI. 2. Copy the script into this directory as 'index.cgi' (or 'index.wsgi' if your server has WSGI support). 3. In the same directory, put two files named 'index.html' and 'index.rdf'. If it does not work now (it didn't for me), you have to tell your HTTP server that the directory index is index.cgi (or index.wsgi). In Apache, this is done by creating (if not present) a `.htaccess` file in the `myvoc` directory and adding the following line: DirectoryIndex index.cgi (or `index.wsgi`, accordingly). There are more docs in the script itself. I think the more recipes (including for other httpds) we can provide with the script, the more useful it will be. So feel free to propose other ones. enjoy pa

#!/usr/bin/env python
# EasyPub: easy publication of RDF vocabulary
# Copyright (C) 2009 Pierre-Antoine Champin pcham...@liris.cnrs.fr
#
# EasyPub is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published
# by the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# EasyPub is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with EasyPub. If not, see http://www.gnu.org/licenses/.

"""
This is a drop-in CGI/WSGI script for publishing an RDF vocabulary.

Quick start
===========

Assuming you want to publish the vocabulary http://example.com/mydir/myvoc,
the recipe with the best chance of working is the following:

1. Make `myvoc` a directory at a place where your HTTP server will serve it
   at the desired URI.
2. Copy the script in this directory as 'index.cgi' (or 'index.wsgi' if your
   server has WSGI support).
3. In the same directory, put two files named 'index.html' and 'index.rdf'.

At this point, it may work (if you are lucky), or you may have to tell your
HTTP server that the directory index (i.e. the file to serve for the bare
directory) is index.cgi. In Apache, this is done by creating (if not
present) a `.htaccess` file in the `myvoc` directory, and adding the
following line::

    DirectoryIndex index.cgi

(or `index.wsgi`, accordingly). Fortunately, this option is allowed to
end-users by most webmasters.
More generally
==============

The script will redirect, according to the Accept HTTP header, to a file
with the same name but a different extension. The file may have no extension
at all, so the following layout would work as well::

    mydir/myvoc        (the script)
    mydir/myvoc.html
    mydir/myvoc.rdf

However, the tricky part is to convince the HTTP server to consider `myvoc`
(an extension-less file) as a CGI script (a thing in which I didn't succeed
for the moment...). The interesting feature of such a config is that it
would support slash-based vocabularies: for example,
http://example.com/mydir/myvoc/MyTerm would still redirect to the html or
rdf file. This would not work with the `index.cgi` recipe.

The script can be configured to serve different files or support other mime
types by altering the `MAPPING` constant below.
"""

# the list below maps mime-types to redirection URL; %s is to be replaced by
# the script name (without its extension); note that the order may be
# significant (when matching */*)
MAPPING = [
    ("text/html", "%s.html"),
    ("application/rdf+xml", "%s.rdf"),
    ## uncomment the following if applicable
    #("application/turtle", "%s.ttl"),
    #("text/n3", "%s.n3"),
]

HTML_REDIRECT = """<html>
<head><title>Non-Information Resource</title></head>
<body>
<h1>Non-Information Resource</h1>
You should be redirected to <a href="%s">%s</a>.
</body>
</html>"""

HTML_NOT_ACCEPTABLE
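To see the core mechanism at a glance, here is a heavily simplified sketch of the redirect logic - hypothetical code, not the real script: it uses crude substring matching and ignores q-values:

    #!/usr/bin/env python
    # minimal CGI sketch of the Accept-header-based 303 redirect
    import os
    import sys

    MAPPING = [
        ("text/html", "%s.html"),
        ("application/rdf+xml", "%s.rdf"),
    ]

    def best_target(accept, script_name):
        base = script_name.rsplit(".", 1)[0]  # strip the .cgi/.wsgi extension
        for mime, pattern in MAPPING:
            # crude matching; a real implementation parses q-values
            if mime in accept or "*/*" in accept:
                return pattern % base
        return None

    target = best_target(os.environ.get("HTTP_ACCEPT", "*/*"),
                         os.environ.get("SCRIPT_NAME", "/index.cgi"))
    if target:
        # 303 See Other, pointing at the concrete representation
        sys.stdout.write("Status: 303 See Other\r\nLocation: %s\r\n\r\n" % target)
    else:
        sys.stdout.write("Status: 406 Not Acceptable\r\n\r\n")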
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
2009/7/2 Bill Roberts b...@swirrl.com: I thought I'd give the .htaccess approach a try, to see what's involved in actually setting it up. I'm no expert on Apache, but I know the basics of how it works, I've got full access to a web server and I can read the online Apache documentation as well as the next person. I've tried similar, even stuff using PURLs - incredibly difficult to get right. (My downtime overrides all, so I'm not even sure if I got it right in the end) I really think we need a (copy paste) cheat sheet. Volunteers? Cheers, Danny. -- http://danny.ayers.name
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Discussion on this seems to have died down. I've tried to follow this thread but do not have enough SW and RDF knowledge to understand all that was said. But I'd like to learn by being able to publish RDF versions of knowledge in a way that is discoverable and usable by others in LOD fashion. Could someone summarise this thread in a single (unbiased?) post, please? With the main points being: a) what is/are the blocks on LOD via RDF; b) how does RDFa help and what are its own failings; c) what are the recipes for making data discoverable, linkable and usable if i) one has full access to a server; ii) one has only user directory access to a server; iii) one does not know or care what a server is. Many thanks, from me and I'm sure many others, to anyone who can satisfy these requests. Cheers, Tony. -- Tony Linde Project Manager Department of Physics & Astronomy University of Leicester From: Martin Hepp (UniBW) martin.h...@ebusiness-unibw.org Reply-To: martin.h...@ebusiness-unibw.org Date: Wed, 1 Jul 2009 16:51:15 +0100 To: Mark Birbeck mark.birb...@webbackplane.com Cc: public-lod@w3.org, semantic-web at W3C semantic-...@w3c.org Subject: Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation Dear all: FYI - I am in contact with Google regarding the clarification of what kind of empty div/span elements are considered acceptable in the context of RDFa. It may take a few days to get an official statement. Just so that you know it is being taken care of... Martin Mark Birbeck wrote: Hi Martin, b) download RDFa snippet that just represents the RDF/XML content (i.e. such that it does not have to be consolidated with the presentation-level part of the Web page). By coincidence, I just read this: "Hidden div's -- don't do it! It can be tempting to add all the content relevant for a rich snippet in one place on the page, mark it up, and then hide the entire block of text using CSS or other techniques. Don't do this! Mark up the content where it already exists. Google will not show content from hidden div's in Rich Snippets, and worse, this can be considered cloaking by Google's spam detection systems." [1] Regards, Mark [1] http://knol.google.com/k/google-rich-snippets/google-rich-snippets/32la2chf8l79m/1# -- martin hepp e-business web science research group universitaet der bundeswehr muenchen e-mail: mh...@computer.org phone: +49-(0)89-6004-4217 fax: +49-(0)89-6004-4620 www: http://www.unibw.de/ebusiness/ (group) http://www.heppnetz.de/ (personal) skype: mfhepp twitter: mfhepp Check out the GoodRelations vocabulary for E-Commerce on the Web of Data! Webcast: http://www.heppnetz.de/projects/goodrelations/webcast/ Talk at the Semantic Technology Conference 2009: Semantic Web-based E-Commerce: The GoodRelations Ontology http://tinyurl.com/semtech-hepp Tool for registering your business: http://www.ebusiness-unibw.org/tools/goodrelations-annotator/ Overview article on Semantic Universe: http://tinyurl.com/goodrelations-universe Project page and resources for developers: http://purl.org/goodrelations/ Tutorial materials: Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
2009/7/2 Linde, A.E. ae...@leicester.ac.uk: Could someone summarise this thread in a single (unbiased?) post, please? I'll try to answer the questions, even though I've only skimmed the thread... a) what is/are the blocks on LOD via RDF The vast majority of publication tools and supporting services are geared towards publishing HTML. While a key piece of Web architecture is the ability to publish multiple representations of a given resource (e.g. both HTML and RDF/XML format documents with a single URI through content negotiation), the mechanisms needed to do this are often unavailable from regular hosting services. Similarly, the redirect handling needed to provide a description of a resource that cannot appear directly on the Web - things, people etc. - is also not possible. Typically these would be done through using .htaccess files on Apache. b) how does RDFa help and what are its own failings; RDFa allows the RDF to be published in an HTML document, so content negotiation isn't needed. You get two representations in one. Again, tool support is a problem, although with RDFa being a new spec the situation is bound to improve. GRDDL may also be a useful alternative if the source data is available in an XML format. c) what are the recipes for making data discoverable, linkable and usable There are recipes at: http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/ though perhaps a cheat sheet would be a good idea? if i) one has full access to a server; this is pretty well documented, e.g. as above ii) one has only user directory access to a server; while this may often be the same as i), generally I'd suggest it's a case-by-case thing, depending on the web server configuration iii) one does not know or care what a server is. Depending on the nature of the data, it may be possible to use one of the semweb-enabled document-first publishing tools (a semantic wiki or CMS). Alternatively, a relational DB to RDF mapping tool may help. But the best bet right now would be to have a word with someone offering linked data publishing services - Talis or OpenLink, maybe others. I've no doubt missed a lot of points and alternative approaches, but these were top of my own mental heap :) Cheers, Danny. -- http://danny.ayers.name
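To make the content-negotiation mechanism under a) concrete, the Apache idiom usually looks something like the following - a sketch only, assuming mod_rewrite is available and a pair of files doc.html and doc.rdf sit in the same directory (the names are examples, not a drop-in recipe):

    # .htaccess - switch off Apache's own negotiation, then 303 by Accept header
    Options -MultiViews
    RewriteEngine On
    # clients asking for RDF/XML are redirected to the data file...
    RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
    RewriteRule ^doc$ doc.rdf [R=303,L]
    # ...everyone else gets the HTML page
    RewriteRule ^doc$ doc.html [R=303,L]

This is exactly the kind of thing that fails when AllowOverride is locked down, as discussed elsewhere in the thread.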
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
I thought I'd give the .htaccess approach a try, to see what's involved in actually setting it up. I'm no expert on Apache, but I know the basics of how it works, I've got full access to a web server and I can read the online Apache documentation as well as the next person. So... after an hour or so of messing around, I still couldn't get Apache-based linked data content negotiation to work properly. (Something to do with turning off MultiViews, which in turn meant fiddling with AllowOverride.) I had more pressing things to do so I gave up. Anyway, I conclude that I agree with Martin that this is not in general an easy way to set up content negotiation! And I had full access to all the Apache conf files - without that I wouldn't have got anywhere. In contrast, last year I wrote some code to do linked data content negotiation in a Ruby on Rails app, which was pretty easy. Regards Bill
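The combination Bill describes usually comes down to something like this in the main Apache configuration - an illustrative sketch with example paths, not his actual setup:

    # httpd.conf - let per-directory .htaccess files do the conneg work
    <Directory "/var/www/example.com">
        # MultiViews must be off, or Apache's own negotiation answers first
        Options -MultiViews
        # permit .htaccess to set Options, ErrorDocument and rewrite rules
        AllowOverride Options FileInfo
    </Directory>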
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Bill, Is your code to do the content negotiation in RoR available somewhere? I'm trying to come up with example code to put up (sometime soon) on the linkeddata.org site. Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org On Thu, Jul 2, 2009 at 8:19 PM, Bill Roberts b...@swirrl.com wrote: I thought I'd give the .htaccess approach a try, to see what's involved in actually setting it up. [...]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi guys, Have you looked at Best Practice Recipes for Publishing RDF Vocabularies: http://www.w3.org/2001/sw/BestPractices/VM/http-examples/2006-01-18/ Peter Juan Sequeda wrote: Hi Bill, Is your code to do the content negotiation in RoR available somewhere? [...]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi, One solution for this is for someone to create and distribute a simple to deploy Linked Data server with integrated CN that can cover common personal ( introductory ) use cases and eventually scale to enterprise demands. And maybe it could even be opensource and already packaged to be deployed via Amazon EC2. Oh wait...! Regards, A PS. And then, someone else could build an alternative, validate the market, etc. The same old story ;) -- Aldo Bucchi skype:aldo.bucchi http://www.univrz.com/ http://aldobucchi.com/ PRIVILEGED AND CONFIDENTIAL INFORMATION This message is only for the use of the individual or entity to which it is addressed and may contain information that is privileged and confidential. If you are not the intended recipient, please do not distribute or copy this communication, by e-mail or otherwise. Instead, please notify us immediately by return e-mail.
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Peter Mika wrote: Hi guys, Have you looked at Best Practice Recipes for Publishing RDF Vocabularies: http://www.w3.org/2001/sw/BestPractices/VM/http-examples/2006-01-18/ Peter Ivan (as W3C rep.), We have a W3C article titled: Best Practice Recipes for Publishing RDF Vocabularies. The abstract reads: "This document describes best practice recipes for publishing an RDFS or OWL vocabulary or ontology on the Web. The features of each recipe are clearly described, so that vocabulary or ontology creators may choose the recipe best suited to the needs of their particular situations. Each recipe contains an example configuration for use with an Apache HTTP server, although the principles involved may be adapted to other environments. The recipes are all designed to be consistent with the architecture of the Web as currently specified." I think the W3C really have to decide if this is an Apache guide or a general Web guide. Right now it's an Apache guide, so why not correct the title so it reads: Best Practice Recipes for Publishing RDF Vocabularies *using Apache*. The Web of Linked Data is simply not about Apache, and I believe you all know that. Thus, what's the value in producing collateral that -- by title and abstract -- implies an inextricable binding of the Web and Apache? Let's make things clearer; the clearer things are, the better for the Web of Linked Data, or Linked Data Web, as a whole. Kingsley Juan Sequeda wrote: Hi Bill, Is your code to do the content negotiation in RoR available somewhere? [...] -- Regards, Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen President & CEO OpenLink Software Web: http://www.openlinksw.com
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Tom: Amen. Thank you for writing this. I completely agree. RDFa has some great use cases but (like any technology) has its limitations. Let's not oversell it. We seem to agree on the observation, but not on the conclusion. What I want and suggest is using RDFa also for exchanging somewhat more complex RDF models / data, by simply using a lot of div / span or whatever elements that represent the RDF part in the SAME document BUT NOT too closely linked with the presentation level:

<body>
<h1>This is the car I want to sell</h1>
Actually, a pretty cool car, for only $1.000. Offer valid through July 31, 2009
<span>
... my whole RDF in RDFa ...
</span>
</body>

The advantages of that would be that
- you just have to maintain ONE file,
- data and metadata are close by, so the likelihood of being up to date increases, and
- at the same time, the code does not get too messy.
- Also - no problems setting up the server (*).
- Easy to create on-line tools that generate RDFa snippets for simple pasting.
- Yahoo and Google will most likely honor RDFa meta-data only.
Also note that often the literal values will be in content attributes anyway, because the string for the presentation is not suitable as meta-data content anyway (e.g. dates, country codes, ...). I think the approach sketched above would be a cheap and useful way of publishing RDF meta-data. It could work with CMS / blogging software etc. Imagine if we were able to allow eBay sellers to put GoodRelations meta-data directly into the open XHTML part of their product description. The main problem with my proposal is that there is the risk that Google considers this cloaking and may remove respective resources from their index (Mark raised that issue). If that risk was confirmed, we would really have a problem. Imagine me selling Semantic Web markup as a step beyond SEO ... and the first consequence of following my advice is being removed from the Google index. A second problem is that if the document contains nodes that have no counterpart on the presentation level (e.g. intermediate nodes for holding n-ary relations), then they will also not be dereferenceable. The same holds for URIs or nodes that are outside the scope of the actual RDFa / XHTML document - I see no simple way of serving either XHTML or RDF content for those. Best Martin Tom Heath wrote: Martin, 2009/6/27 Martin Hepp (UniBW) martin.h...@ebusiness-unibw.org: So if this hidden div / span approach is not feasible, we got a problem. The reason is that, as beautiful as the idea is of using RDFa to make a) the human-readable presentation and b) the machine-readable meta-data link to the same literals, the more problematic it is in reality once the structures of a) and b) are very different. For very simple property-value pairs, embedding RDFa markup is no problem. But if you have a bit more complexity at the conceptual level, and in particular if there are significant differences to the structure of the presentation (e.g. in terms of granularity, ordering of elements, etc.), it gets very, very messy and hard to maintain. Amen. Thank you for writing this. I completely agree. RDFa has some great use cases but (like any technology) has its limitations. Let's not oversell it. Tom.
-- martin hepp e-business web science research group universitaet der bundeswehr muenchen e-mail: mh...@computer.org phone: +49-(0)89-6004-4217 fax: +49-(0)89-6004-4620 www: http://www.unibw.de/ebusiness/ (group) http://www.heppnetz.de/ (personal) skype: mfhepp twitter: mfhepp Check out the GoodRelations vocabulary for E-Commerce on the Web of Data! Webcast: http://www.heppnetz.de/projects/goodrelations/webcast/ Talk at the Semantic Technology Conference 2009: Semantic Web-based E-Commerce: The GoodRelations Ontology http://tinyurl.com/semtech-hepp Tool for registering your business: http://www.ebusiness-unibw.org/tools/goodrelations-annotator/ Overview article on Semantic Universe: http://tinyurl.com/goodrelations-universe Project page and resources for developers: http://purl.org/goodrelations/ Tutorial materials: Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Martin, 2009/6/29 Martin Hepp (UniBW) martin.h...@ebusiness-unibw.org: Hi Tom: Amen. [...] These are exactly the reasons why I emphasise the limitations and ask that we don't oversell the capabilities of any technology, RDFa included. Tom.
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Martin Hepp (UniBW) wrote: Hi Tom: Amen. [...] Martin, If Google doesn't see invisible DIVs as cloaking, the issue vaporizes. Also, if people take the SEO + SDQ (Linked Data Expressed in RDFa) approach, they will at least remain in the Google index via the usual SEO-oriented keyword gimmickry, albeit generally suboptimal. If we make a recipe doc showcasing these issues, we will more than likely get Google to recalibrate back to the Web; especially if we can demonstrate that other search engine players -- those that support RDFa -- are not afflicted with the same cloaking myopia. Kingsley
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Tom, On Sun, Jun 28, 2009 at 11:46 PM, Tom Heath tom.he...@talis.com wrote: Martin, 2009/6/27 Martin Hepp (UniBW) martin.h...@ebusiness-unibw.org: So if this hidden div / span approach is not feasible, we got a problem. [...] Amen. Thank you for writing this. I completely agree. RDFa has some great use cases but (like any technology) has its limitations. Let's not oversell it. Mmm...you put me in a difficult position here. :) If I leap to RDFa's defence then it looks like I think it solves all the world's problems. But if I remain silent, then it looks like the problem being raised is some kind of fundamental flaw. Ah well, let's dive in... First I should say that I'd be the first to agree that RDFa has limitations. But the issue here is that I don't think the problem raised by Martin can be classed as a limitation in the way you're implying, Tom. If we go back a step, RDFa was carefully designed so that it could carry any combination of the RDF concepts in an HTML document. In the end we dropped reification and lists, because it didn't seem that the RDF community itself was clear on the future of those, but they are both easily added back if the issues were to be resolved. In short, it is possible to use HTML+RDFa to create complete RDF documents, such as RDF Schemas, OWL ontologies, and so on, and the resulting documents would be no more complex than their equivalent RDF/XML or N3 versions, with the benefit that they can be delivered using any of the many HTML publishing techniques currently available. But most of the discussion around RDFa relates to its other use, where it's possible to use it to 'sprinkle' metadata into HTML documents that are primarily aimed at human readers. By being alongside the human-readable output, it makes the metadata easier to maintain. And in addition it gives the user agent the opportunity to enhance the view of the data, by making use of the 'local' metadata. However, the point that Martin was getting at is that sometimes there is just way more data in the 'RDF view' than in the 'human view', and that makes it very difficult to make the two align. I don't think that this is a flaw in RDFa itself, and I'm not convinced that there is an easy solution in the form of another technology that would solve this. Martin's solution seems a reasonable one to me. (Although I wonder if part of the problem might be that too much information is being provided in the RDF view, rather than using links to other data that can be retrieved. Perhaps Michael could give an example.) Regards, Mark -- Mark Birbeck, webBackplane mark.birb...@webbackplane.com http://webBackplane.com/mark-birbeck webBackplane is a trading name of Backplane Ltd. (company number 05972288, registered office: 2nd Floor, 69/85 Tabernacle Street, London, EC2A 4RR)
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Yihong: I am a big fan of Codd's "one fact in one place" credo. However, in this particular case, that principle is violated anyway, since the literal values are often duplicated for presentation and meta-data purposes anyway (think of 2009-06-29 vs. June 29, 2009). Second, for dynamic Web apps, it does not really matter whether the same fact is exposed once or twice, since the authoritative copy lives in one place in the database anyway. Third, this is the only way a tool like the GoodRelations annotator [1] can create RDFa snippets for simple copy-and-paste into existing pages. Also note that in the particular case of RDFa, the principle of "one fact in one place" clashes with the separation-of-concerns principle, in particular that of keeping data and presentation separate. The textbook-style beauty and simplicity of RDFa hold for adding a dc:creator property to a string value that is the same for presentation and at the data level. Beyond that, RDFa can create code that is very hard to maintain. In fact, I know that a large software company dismissed the use of RDFa in their products because of the unmanageable mix of conceptual and presentation layers. As far as security is concerned: there is no real difference in my proposal, as the content attribute of RDFa allows serving different data to humans and to machines, and this is a needed feature anyway. Digital signatures at the document or element level and/or data provenance approaches will likely cater for that. Best Martin
Yihong Ding wrote: Hi Kingsley and Martin, A potential problem of the model Martin suggested is that the same data has to be presented at least TWICE in one document. Although the RDFa portion of the data is supposed to be automatically generated, it, however, does not prohibit anybody from manually revising it. Therefore, it leaves a huge hole for the hackers (or anybody who wants to do some deceptive job). In our imperfect world, this problem is severe. Adding an extra layer of data mapping always causes additional work on data maintenance. This time, the extra work could be a nightmare though the architecture is neat. yihong On Mon, Jun 29, 2009 at 8:03 AM, Kingsley Idehen kide...@openlinksw.com wrote: [...]
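Martin's point about literals is easy to see in markup: RDFa's content attribute lets the machine-readable value differ from the displayed string, so the "same fact" already appears twice. A one-line example (assuming the dc prefix is declared on an enclosing element):

    <span property="dc:date" content="2009-06-29">June 29, 2009</span>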
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Mon, 2009-06-29 at 13:30 +0100, Mark Birbeck wrote: If we go back a step, RDFa was carefully designed so that it could carry any combination of the RDF concepts in an HTML document. In the end we dropped reification and lists, because it didn't seem that the RDF community itself was clear on the future of those, but they are both easily added back if the issues were to be resolved. RDF reification and lists do *work* in RDFa, they're just a bit of a pain to mark up. e.g. here's a reification:

<div xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:db="http://dbpedia.org/resource/"
     typeof="rdf:Statement">
  <span property="dc:creator">Mark Birkbeck</span> says that
  <span rel="rdf:subject" resource="[db:Sky]">the sky</span>
  <span rel="rdf:predicate" resource="http://dbpedia.org/property/color">is</span>
  <span rel="rdf:object" resource="[db:Blue]">blue</span>.
</div>

And an example of a list can be found here: http://ontologi.es/rail/routes/gb/VTB1.xhtml -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
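For reference, the triples an RDFa processor should extract from that markup look roughly like this in Turtle (the reified statement is a blank node):

    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix db: <http://dbpedia.org/resource/> .

    [] a rdf:Statement ;
       dc:creator "Mark Birkbeck" ;
       rdf:subject db:Sky ;
       rdf:predicate <http://dbpedia.org/property/color> ;
       rdf:object db:Blue .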
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Toby, Yes...of course...you are right. :) I would say, too, that reification is even more long-winded than the example you have given! You don't have the actual statement "the sky is blue" in your mark-up, so you need even more RDFa. (You only have the statement "Mark says 'the sky is blue'".) But either way, you are right that the whole thing can be spelt out longhand (as can lists). The only reason I mentioned it was because for a long time in RDFa we had a much simpler construct based on occurrences of *nested* meta and link properties. However, some browsers thought they were doing us a favour by moving the meta and link elements out of the body and into the head, which meant it was not possible to implement this feature in JavaScript. (Obviously server-side RDFa parsers would have had no problem with it.) As for lists, the obvious shorthand would be ol, ul, and li, but it was not obvious what triples should be generated, so we left it. I.e., your example uses the first/next/nil technique for collections, but of course there is also the rdf:_1 technique for a list. It wasn't immediately clear which would be the more useful -- or conformant -- one to generate. Regards, Mark On Mon, Jun 29, 2009 at 2:05 PM, Toby Inkster t...@g5n.co.uk wrote: RDF reification and lists do *work* in RDFa, they're just a bit of a pain to mark up. [...] -- Mark Birbeck, webBackplane mark.birb...@webbackplane.com http://webBackplane.com/mark-birbeck webBackplane is a trading name of Backplane Ltd. (company number 05972288, registered office: 2nd Floor, 69/85 Tabernacle Street, London, EC2A 4RR)
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Mark, 2009/6/29 Mark Birbeck mark.birb...@webbackplane.com: Hi Tom, On Sun, Jun 28, 2009 at 11:46 PM, Tom Heath tom.he...@talis.com wrote: Martin, 2009/6/27 Martin Hepp (UniBW) martin.h...@ebusiness-unibw.org: So if this hidden div / span approach is not feasible, we got a problem. [...] Amen. Thank you for writing this. I completely agree. RDFa has some great use cases but (like any technology) has its limitations. Let's not oversell it. Mmm...you put me in a difficult position here. :) ;) If I leap to RDFa's defence then it looks like I think it solves all the world's problems. But if I remain silent, then it looks like the problem being raised is some kind of fundamental flaw. Just in case there's any doubt, let me clarify that this isn't an anti-RDFa position from me, just trying to unpack the issue. Ah well, let's dive in... First I should say that I'd be the first to agree that RDFa has limitations. But the issue here is that I don't think the problem raised by Martin can be classed as a limitation in the way you're implying, Tom. If we go back a step, RDFa was carefully designed so that it could carry any combination of the RDF concepts in an HTML document. In the end we dropped reification and lists, because it didn't seem that the RDF community itself was clear on the future of those, but they are both easily added back if the issues were to be resolved. In short, it is possible to use HTML+RDFa to create complete RDF documents, such as RDF Schemas, OWL ontologies, and so on, and the resulting documents would be no more complex than their equivalent RDF/XML or N3 versions, with the benefit that they can be delivered using any of the many HTML publishing techniques currently available. Absolutely agreed. I don't dispute this at all. Though it's not really my point. See below... But most of the discussion around RDFa relates to its other use, where it's possible to use it to 'sprinkle' metadata into HTML documents that are primarily aimed at human readers. By being alongside the human-readable output, it makes the metadata easier to maintain. In some cases. It depends on the publishing architecture. What effect does it have on the maintenance cost of the layout/structural markup of the page? And in addition it gives the user agent the opportunity to enhance the view of the data, by making use of the 'local' metadata. However, the point that Martin was getting at is that sometimes there is just way more data in the 'RDF view' than in the 'human view', and that makes it very difficult to make the two align. Yes, this is exactly how I understood his point. It's also exactly why I keep banging on about us not saying that x is better than y. It's not about a limitation of RDFa as a technology (apologies if it came across that way), simply a reflection of the fact that it can be challenging to deploy in some circumstances.
Again, this is context-dependent, and the best solution can only be determined by examining that context. I don't think that this is a flaw in RDFa itself, Agreed. and I'm not convinced that there is an easy solution in the form of another technology that would solve this. Well, such cases may justify the 303/conneg pattern. Martin's solution seems a reasonable one to me. (Although I wonder if part of the problem might be that too much information is being provided in the RDF view, rather than using links to other data that can be retrieved. Perhaps Michael could give an example.) Completely agreed on this point. You'll see this approach manifested in Revyu.com, where there is redundancy in data between HTML pages for the sake of presenting human users with a more complete view (without requiring them to visit multiple pages); the same is not true of the (broadly) equivalent RDF documents, where I tried to avoid redundancy, on the basis that any SW agent worth its salt should be able to dereference the referenced URIs to retrieve the data it needs. IIRC others disagree with my approach here (TimBL? Richard C?), but this speaks completely to the question of what is the appropriate interaction paradigm for apps built on the Web of Data. If we can understand the answers to this question then it may help guide our deployment strategies for RDFa. Cheers, Tom.
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Jun 28, 2009, at 6:39 PM, Tim Berners-Lee wrote: On 2009-06-25, at 13:29, Pat Hayes wrote: On Jun 25, 2009, at 11:44 AM, Martin Hepp (UniBW) wrote: Hi all: After about two months of helping people generate RDF/XML metadata for their businesses using the GoodRelations annotator [1], I have quite some evidence that the current best practices of using .htaccess are a MAJOR bottleneck for the adoption of Semantic Web technology. I agree, and raised this issue with the W3C TAG some time ago. It was apparently not taken seriously. The general consensus seemed to be that any normal adult should be competent to manipulate an Apache server. (Was yours a deliberate sarcastic misrepresentation of the TAG's consensus, or a genuine misunderstanding? A genuine misunderstanding, based on the personal feedback I got, I admit, rather than a careful perusal of the TAG's published decisions; my bad.) The TAG has expressed that the fact that Apache needs root intervention when it doesn't have the right MIME type set up is a serious bug. Well, I'm glad to hear that, and apologize for not knowing it. But as I said in my reply to Tom, that doesn't help me actually use the SWeb from out here in the one-way side roads off the information superhighway. My own company, however, refuses to allow its employees to have access to .htaccess files, and I am therefore quite unable to conform to the current best practice from my own work situation. I believe that this situation is not uncommon. So you mean you can't set up content negotiation and redirection. Right. As I discovered when I was trying to follow the http-range-14 decision and experiment with my notorious 'PatHayes' self-referential page, in order to bring it into line with the recommendations. Talk about eating dog food... But you can use foo#bar URIs like I do. True. Will the company allow a mime.types file to include application/rdf+xml? No problem there, AFAIK. Pat
Tim
IHMC (850)434 8903 or (650)494 3973
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32502 (850)291 0667 mobile
phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Jun 28, 2009, at 6:20 PM, Tom Heath wrote: Hi Pat, 2009/6/25 Pat Hayes pha...@ihmc.us: With the sincerest respect, Tom, your attitude here is part of the problem. Maybe, along with many other people, I am indeed still stuck in the mid-1990s. You have permission to be as condescending as you like. But still, here I am, stuck. Thoroughly stuck. So no amount of condescending "sooo-20th-century, my dear" chatter is going to actually enable me to get to a place where I can do what you think I should be doing. Condescension was never my intention here. My goal was to draw a comparison that might enable us to learn a lesson from the history of the Web and use that to help us move forward. As Mark described, over the course of time more and more tools became available that made it easier to publish HTML. Presumably these only arose because publishing HTML was to some degree hard. The Web community has gone through this process once already; let's learn the lessons from last time and apply them to publishing RDF so people don't have to be stuck any more. Um.. I thought that was MY point :-) Dan outlined some technical approaches to doing this sort of thing. Some domain-specific apps already exist that (hopefully) reduce the pain; it was one of the goals of Revyu.com for example. I cannot use a rewrite rule to catch incoming requests, or do whatever you are talking about here. I live in an environment where I simply do not have access at all to the workings of my server at a level that close to the metal, because it is already woven into a clever maze of PHP machinery which is too fragile to allow erks like me to mess with it. Some of the best W3C techies have taken a look, and they can't find a way through it, either. Maybe I'm in a special position, but I bet a whole lot of people, especially in the corporate world, are in a similar bind. You're talking about two very different groups here. If the right tools are created then individuals will presumably adopt some specialised SaaS analogous to say wordpress.com. Corporations are a different kettle of fish... I work for a small research company which happens to have an ambitious Webmaster and a Director who is sensitive to visual graphics and Web image issues. The result is a maze of complex PHP giving users a very nice experience, but not conducive to transparent use by its inhabitants. Just from casual Web browsing, I cannot believe that I am in a very small minority. There are a lot of 'sexy' sites out there that must be in a similar state. I know that several 'web authoring' systems produce similar PHP mazes, because I've tried using them and then editing the output they produce, an experience rather like debugging BCPL. ...but just as many built their own Web-serving infrastructure in the 90s, so they will invest in publishing data to the Semantic Web if they perceive adequate value (demonstrating that value is where we need to be working even harder!). System-level access to a server is quite a different beast than being allowed to publish HTML on a website somewhere. I can, and do, publish HTML, or indeed just about any file I like, but I don't get to insert code. So 6 lines or 600, it makes no difference. But in any case, this is ridiculous. RDF is just XML text, for goodness' sake. I need to insert lines of code into a server file, and write PHP scripts, in order to publish some RDF or HTML? That is insane. It would have been insane in the mid-1990s and it's even more insane now. No. This is incorrect.
This discussion only applies to the 303-redirect/slash URI pattern. You can avoid this completely by using the hash URI pattern, as someone mentioned (sorry for not crediting directly, getting hard to navigate this thread). Yes, of course, and I apologize for overstating the case. Still, the slash URI seems to be much more acceptable to many unsemantic Webbies, who are used to thinking of URIs as being stripped of their post-hash content at the slightest internet shiver, and so don't regard a name including a hash as something 'real'; and it is the case about which all the fuss is being made. If the published advice were: always use hash URI patterns, I would be happy. But the published advice *starts* with 303 redirects and .htaccess file modifications. IMO, it is you (and Tim and the rest of the W3C) who are stuck in the past here. Most Web users do not, and will not, write code. They will be publishing content in a cloud somewhere, even further away from the gritty world of scripts and lines of code than people - most people - are now. Most actual content providers are never going to want to even know that PHP scripts exist, let alone be obliged to write or copy one. You've over-interpreted my words here. See above. If so, I apologise. But think of what I'm saying as a cry for help. There are a lot of people like me, I suspect, who would really like
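For concreteness, the 303-redirect/slash-URI recipe being debated looks roughly like this in .htaccess form - a hedged sketch with illustrative paths and file names, not the exact recipe from any published tutorial:

# Sketch of the 303/slash pattern: a request for the 'thing' URI
# /id/alice is answered with a 303 redirect to an RDF document or an
# HTML page, chosen by inspecting the Accept header. The RewriteCond
# guards only the first rule; the second catches all other requests.
RewriteEngine On
RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
RewriteRule ^id/alice$ /data/alice.rdf [R=303,L]
RewriteRule ^id/alice$ /page/alice.html [R=303,L]

A hash URI such as http://example.com/foaf.rdf#alice needs none of this: the fragment never reaches the server, so a plain uploaded file can serve both the thing's name and its description.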
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Mon, 2009-06-29 at 01:20 +0200, Tom Heath wrote: [ . . . ] This discussion only applies to the 303-redirect/slash URI pattern. You can avoid this completely by using the hash URI pattern . . . . And as a reminder, you can also use a 303-redirect service if you cannot configure your server, such as: http://thing-described-by.org/ For example, http://thing-described-by.org?http://dbooth.org/2005/dbooth/ does a 303 redirect to http://dbooth.org/2005/dbooth/ That last one doesn't happen to serve RDF, but it certainly could. -- David Booth, Ph.D. Cleveland Clinic (contractor) Opinions expressed herein are those of the author and do not necessarily reflect those of Cleveland Clinic.
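For the curious, such a redirector service is tiny. A speculative mod_rewrite sketch, assuming mod_rewrite is available (this is not the actual code behind thing-described-by.org):

# Answer any request whose query string is an http(s) URL with a
# 303 See Other redirect to that URL. The trailing '?' in the
# substitution stops Apache from re-appending the original query
# string to the redirect target.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^(https?://.+)$
RewriteRule ^$ %1? [R=303,L]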
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Pat, OK, yelling heard loud and clear :) By way of concrete actions, I gave Ivan Herman a (probably unfairly) hard time today here at Dagstuhl to 'encourage' the authors of the Vocabs Best Practices to press on with the revision of that document that addresses the current issues. An update of the How to Publish Linked Data on the Web tutorial is also on the cards; perhaps one of the outcomes of this revision could be a greater emphasis on the hash URI pattern (and maybe also the 'health warning' you describe ;). Cheers, Tom. 2009/6/29 Pat Hayes pha...@ihmc.us: [ . . . ]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Richard, 2009/6/25 Richard Cyganiak rich...@cyganiak.de: snip/ (On the value of content negotiation in general: I think the key point is that any linked data URI intended for re-use, when put into a browser by the average person interested in linked data publishing, MUST return something human-readable. That's a hard requirement; otherwise people will never be confident about what a particular URI means, and hence they won't re-use it. That was the thinking behind the Cool URIs note when Leo and I wrote it a few years ago. In the past, the only way to get that effect was with content negotiation, so even though content negotiation is a pain, it's what we had to do. In the present, we have an alternative thanks to RDFa. Not disagreeing at all about the human-readable requirement, but just a question... in this scenario you describe, is there not a risk that Joe User will enter that URI and come to the conclusion that it identifies the document (or section thereof), rather than a thing described in the document? Interested in your thoughts :) Tom.
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Martin, 2009/6/27 Martin Hepp (UniBW) martin.h...@ebusiness-unibw.org: So if this hidden div / span approach is not feasible, we've got a problem. The reason is that, as beautiful as the idea is of using RDFa to make a) the human-readable presentation and b) the machine-readable meta-data link to the same literals, it becomes problematic in reality once the structures of a) and b) are very different. For very simple property-value pairs, embedding RDFa markup is no problem. But if you have a bit more complexity at the conceptual level, and in particular if there are significant differences in the structure of the presentation (e.g. in terms of granularity, ordering of elements, etc.), it gets very, very messy and hard to maintain. Amen. Thank you for writing this. I completely agree. RDFa has some great use cases but (like any technology) has its limitations. Let's not oversell it. Tom.
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 2009-06-25, at 13:29, Pat Hayes wrote: On Jun 25, 2009, at 11:44 AM, Martin Hepp (UniBW) wrote: Hi all: After about two months of helping people generate RDF/XML metadata for their businesses using the GoodRelations annotator [1], I have quite some evidence that the current best practices of using .htaccess are a MAJOR bottleneck for the adoption of Semantic Web technology. I agree, and raised this issue with the W3C TAG some time ago. It was apparently not taken seriously. The general consensus seemed to be that any normal adult should be competent to manipulate an Apache server. (Was yours a deliberate sarcastic misrepresentation of the TAG's consensus, or a genuine misunderstanding?) The TAG has expressed the view that it is a serious bug that Apache needs root intervention when it doesn't have the right mime type set up. My own company, however, refuses to allow its employees to have access to .htaccess files, and I am therefore quite unable to conform to the current best practice from my own work situation. I believe that this situation is not uncommon. So you mean you can't set up content negotiation and redirection. But you can use foo#bar URIs like I do. Will the company allow a mime.types file to include application/rdf+xml? Tim
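For reference, the change Tim is asking about is a one-liner. Where the global mime.types file is off-limits, the same mapping can usually go in a .htaccess file instead - a hedged sketch, assuming the host permits AddType:

# Map the .rdf extension to the RDF/XML media type; the equivalent
# entry in Apache's global mime.types is: application/rdf+xml  rdf
AddType application/rdf+xml .rdf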
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 6/28/09 6:33 PM, Tom Heath tom.he...@talis.com wrote: [ . . . ] is there not a risk that Joe User will enter that URI and come to the conclusion that it identifies the document (or section thereof), rather than a thing described in the document? Interested in your thoughts :) Tom. Tom, Of course not, if dealing with an HTTP URI deployed in line with the Linked Data meme's deployment guidelines. In short, the user will encounter a document describing the Thing identified by the URI. The issue is not the document, but what it represents (metadata) and how it comes to be associated (implicitly) with the entity (resource) it describes via the entity's URI. When all is said and done, the Linked Data meme has simply used HTTP to fix an age-old problem: implicit association of an Entity with its Metadata within the context of distributed computing, without any platform lock-in. Rewind back to pre-Web days, then ask yourself: how did programmers refer to data objects and de-reference their representations (typically a proprietary language- and platform-specific data structure)? Again, Linked Data is just about making what was platform-specific platform-independent, via HTTP, i.e., data access by reference and data manipulation by values exposed by de-referenced data structures. We really need to keep this quite simple. There are zillions of people that understand data access by reference etc.. They also understand Metadata etc.. What they don't understand is how we sometimes *inadvertently* make this whole Linked Data meme thing complex by not connecting the meme to what existed before the Web (which was actually created on a computer that already had a fully functional distributed object-based OS etc..). Kingsley
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Sat, Jun 27, 2009 at 9:21 AM, Martin Hepp (UniBW)martin.h...@ebusiness-unibw.org wrote: So if this hidden div / span approach is not feasible, we've got a problem. The reason is that, as beautiful as the idea is of using RDFa to make a) the human-readable presentation and b) the machine-readable meta-data link to the same literals, it becomes problematic in reality once the structures of a) and b) are very different. For very simple property-value pairs, embedding RDFa markup is no problem. But if you have a bit more complexity at the conceptual level, and in particular if there are significant differences in the structure of the presentation (e.g. in terms of granularity, ordering of elements, etc.), it gets very, very messy and hard to maintain. And you give up the clear separation of concerns between the conceptual level and the presentation level that XML brought about. Maybe one should tell Google that this is not cloaking if SW meta-data is embedded... But the snippet basically indicates that we should not recommend this practice. What happens if you put them in one big span tree and use the @content attribute? Martin [ . . . ]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Martin Hepp (UniBW) wrote: So if this hidden div / span approach is not feasible, we've got a problem. The reason is that, as beautiful as the idea is of using RDFa to make a) the human-readable presentation and b) the machine-readable meta-data link to the same literals, it becomes problematic in reality once the structures of a) and b) are very different. For very simple property-value pairs, embedding RDFa markup is no problem. But if you have a bit more complexity at the conceptual level, and in particular if there are significant differences in the structure of the presentation (e.g. in terms of granularity, ordering of elements, etc.), it gets very, very messy and hard to maintain. And you give up the clear separation of concerns between the conceptual level and the presentation level that XML brought about. Maybe one should tell Google that this is not cloaking if SW meta-data is embedded... Yes. Ideally, they should figure that out from the self-describing nature of the RDF-based metadata exposed by the embedded RDFa -- assuming they are doing real RDFa processing :-) Kingsley But the snippet basically indicates that we should not recommend this practice. Martin [ . . . ] -- Regards, Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen President & CEO OpenLink Software Web: http://www.openlinksw.com
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 27 Jun 2009, at 11:25, Melvin Carvalho wrote: What happens if you put them in one big span tree and use the @content attribute? view-source:http://ontologi.es/rail/routes/gb/VTB1.xhtml -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 25 Jun 2009, at 21:18, Pat Hayes wrote: If [RDF] requires people to tinker with files with names starting with a dot [...] then the entire SWeb architecture is fundamentally broken. RDF doesn't. Apache does. Many hosts do have front ends for configuring Apache, allowing redirects to be set up and content-types configured by filling in simple web forms. But there are such a variety of these tools, with different capabilities and different interfaces, that it would be difficult to produce advice suitable for them all, so .htaccess recipes are provided instead. That said, there are a couple of steps that Martin could remove from his recipe and still be promoting reasonably good practice: Step 5a - this rewrites http://example.org/semanticweb to http://example.org/semanticweb.rdf. Other than aesthetics, there's no real reason to do this. Yes, I've read timbl's old Cool URIs document, and understand about not wanting to include hints of file format in a URI. But realistically, this file is always going to include some RDF - perhaps in a non-RDF/XML serialisation, but I don't see anything inappropriate about serving other RDF serialisations using a .rdf URL, provided the correct MIME type is used. Step 5b - the default Apache mime.types file knows about application/rdf+xml, so this should be unnecessary. Perhaps instead have a GoodRelations validator which checks that the content type is correct, and only suggests this step when it is found to be otherwise. Steps 3 and 4 could be amalgamated into a single validate your RDF file step using the aforementioned validator. The validator would be written so that, upon a successful validation, it offers single-click options to ping semweb search engines, and Yahoo (via an RDF/XML-to-DataRSS converter). With those adjustments, the recipe would just be: 1. Upload your RDF file. 2. Add a rel=meta link to it. 3. Validate using our helpful tool. -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
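For readers who haven't seen the recipe, steps 5a and 5b correspond roughly to directives like these - a hedged sketch with illustrative file names, not Martin's actual published recipe:

# Step 5a (droppable, per Toby): serve the extensionless URI by
# internally rewriting it to the .rdf file.
RewriteEngine On
RewriteRule ^semanticweb$ semanticweb.rdf [L]

# Step 5b (redundant wherever the default mime.types already maps
# .rdf): declare the media type explicitly.
AddType application/rdf+xml .rdf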
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Fri, 2009-06-26 at 09:35 +0200, Dan Brickley wrote: Does every major RDF toolkit have an integrated RDFa parser already? No - and even for those that do, it's often rather flaky. Sesame/Rio doesn't have one in its stable release, though I believe one is in development for 3.0. Redland/Raptor often (for me at least) seems to crash on RDFa. It also complains a lot when named entities are used (e.g. &nbsp;) even though the XHTML+RDFa 1.0 DTD does allow them. Jena (just testing on sparql.org) doesn't seem to handle RDFa at all. Not really toolkits per se, but cwm and the current release of Tabulator don't seem to have RDFa support. (Though I think support for the latter is being worked on.) For application developers who are specifically trying to support RDFa, none of this is a major problem - it's pretty easy to include a little content-type detection and pass the XHTML through an RDFa-to-XML converter before the rest of your code gets its hands on it - but this does require specific handling, which must be an obstacle to adoption. -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 26/6/09 10:51, Toby Inkster wrote: [ . . . ] For application developers who are specifically trying to support RDFa, none of this is a major problem - it's pretty easy to include a little content-type detection and pass the XHTML through an RDFa-to-XML converter before the rest of your code gets its hands on it - but this does require specific handling, which must be an obstacle to adoption. Yep, pretty much as I feared. Also, the Google SGAPI currently only reads FOAF in RDF/XML form; it's not yet updated to use the RDFa support in Rapper. Re app developers, it depends a lot. If your app is built inside some framework - e.g. Protege - RDFa might be quite hard to integrate. Some apps also store to local disk rather than HTTP space, and so using content negotiation is tricky. RDFa files don't have any well-known file-suffix patterns either. cheers, Dan
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Dan Brickley wrote: +cc: Norm Walsh On 25/6/09 19:39, Juan Sequeda wrote: So... then from what I understand.. why bother with content negotiation, right? Just do everything in RDFa, right? We are planning to deploy the linked data version of Turn2Live.com soon. And we are in the discussion of doing the content negotiation (a la BBC). But if we can KISS, then all we should do is RDFa, right? Does every major RDF toolkit have an integrated RDFa parser already? And yep, the conneg idiom isn't mandatory. You can use # URIs, at least if the application/rdf+xml mime type is set for .rdf files. I believe that's in the Apache defaults now. At least, checking here, a fresh Ubuntu installation has application/rdf+xml rdf in /etc/mime.types (a file which I think comes via Apache, but not 100% sure). But yes - this is a major problem and headache. Not just around the conneg piece, but in general. I've seen similar results to those reported here with 'write yourself a FOAF file' exercises. Even if people use Leigh Dodds' handy foaf-a-matic webforms to author a file ... at the end of the session they are left with a piece of RDF/XML in their hands, and an instruction to upload it to their sites. Even people with blogs and facebook profiles and twitter accounts etc. can find this daunting. And not many people know what FTP is (or was). My suggestion here is that we look into something like OAuth for delegating permission tokens for uploading files. OAuth is a protocol that uses a Web/HTML flow for site A to request that some user of site B allow it to perform certain constrained tasks on site B. The canonical example is site A (a printing company) wanting to see non-public photos on site B (a photo-sharing site). I believe this model works well for writing/publishing, as well as for mediating information access. If site A is an RDF-generating site, and site B is a generic hosting site, then the idea is that we write or find a generic OAuth-enabled utility that B could use, such that the users of site B could give sites like A permission to publish documents automatically. At a protocol level, I would expect this to use AtomPub, but it could also be WebDAV or another mechanism. But how to get all those sites to implement such a thing? Well firstly, this isn't limited to FOAF. Or to any flavour of RDF. I think there is a strong story for why this will happen eventually. Strong because there are clear benefits for many of the actors: * a data-portability and user-control story: I don't want all my music profile info to be on last.fm; I want last.fm to maintain http://danbri.org/music for me. * a benefits-the-data-source story: I'm sure the marketing teams of various startups would be very happy at the ability to directly push content into 1000s of end-user sites. For the Google/link karma, traffic etc. * a benefits-the-hosts story: rather than having users share their FTP passwords, they share task-specific tokens that can be managed and rolled back on a finer-grained basis. So a sample flow might be: 1. User Alice is logged into her blog, which is now AtomPub+OAuth enabled. 2. She clicks on a link somewhere for generate a FOAF file from your music interests, which takes her to a site that asks some basic information (name, homepage) and about some music-related sites she uses. 3. That site's FOAF generator scans her public last.fm profile (after asking her username), and then does the same for her Myspace and YouTube profiles. 4. It then says OK, generated music profile! May we publish this to your site? 
It then scans her homesite, blog etc via some auto-discovery protocol(s), to see which of them have a writable AtomPub + OAuth endpoint. It finds her wordpress blog supports this. 5. Alice is bounced to an OAuth permissioning page on her blog, which says something like: The Music Profile site at example.com would like to have read and write permission for an area of your site: once/always/never or for 6 months? 6. Alice gives permission for 6 months. Some computer stuff happens in the background, and the Music site is given a token it can use to post data to Alice's site. 7. http://alice.example.com/blog/musicprofile then becomes a page (or mini-blog or activity stream) maintained entirely, or partially, by the remote site using RDFa markup sent as AtomPub blog entries, or maybe as AtomPub attachments. OK I'm glossing over some details here, such as configuration, choice of URIs etc. I may be over-simplifying some OAuth aspects, and forgetting detail of what's possible. But I think there is real potential in this sort of model, and would like a sanity check on that! Also the detail of whether different sites could/would write to the same space or feed or not. And how we can use this as a page-publishing model instead of a blog entry publishing model. I've written about this before, see
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Thu, Jun 25, 2009 at 6:44 PM, Martin Hepp (UniBW)martin.h...@ebusiness-unibw.org wrote: Hi all: After about two months of helping people generate RDF/XML metadata for their businesses using the GoodRelations annotator [1], I have quite some evidence that the current best practices of using .htaccess are a MAJOR bottleneck for the adoption of Semantic Web technology. Just some data: - We have several hundred entries in the annotator log - most people spend 10 or more minutes to create a reasonable description of themselves. - Even though they all operate some sort of Web sites, less than 30% of them manage to upload/publish a single *.rdf file in their root directory. - Of those 30%, only a fraction manage to set up content negotiation properly, even though we provide a step-by-step recipe. The effects are - URIs that are not dereferencable, - incorrect media types, and other problems. When investigating the causes and trying to help people, we encountered a variety of configurations and causes that we did not expect. It turned out that helping people just manage this tiny step of publishing Semantic Web data would turn into a full-time job for 1-2 administrators. Typical causes of problems are: - Lack of privileges for .htaccess (many cheap hosting packages give limited or no access to .htaccess) - Users without a Unix background had trouble naming a file so that it begins with a dot - Microsoft IIS requires completely different recipes - Many users have access just at a CMS level. Bottom line: - For researchers in the field, it is a doable task to set up an Apache server so that it serves RDF content according to current best practices. - For most people out there in reality, this is regularly a prohibitively difficult task, both because of a lack of skills and because of a variety of technical environments that turns what is easy at the textbook level into an engineering challenge. As a consequence, we will modify our tool so that it generates dummy RDFa code with span/div that *just* represents the meta-data without interfering with the presentation layer. That can then be inserted as code snippets via copy-and-paste to any XHTML document. Any opinions? Been thinking about this issue for the last 6 months, and I've changed my mind a few times. Inclined to agree that RDFa is probably the ideal entry point for bringing existing businesses onto GoodRelations. For a read/write web (which is the goal of commerce, right?), you're probably back to .htaccess, though, with, say, a controller that will manage POSTed SPARUL inserts. I think taking it one step at a time, in this way, seems a sensible approach, though as a community we'll need to put a bit of weight behind getting the RDFa toolset up to the state of the art. Best Martin [1] http://www.ebusiness-unibw.org/tools/goodrelations-annotator/ Danny Ayers wrote: Thank you for the excellent questions, Bill. Right now IMHO the best bet is probably just to pick whichever format you are most comfortable with (yup, it depends) and use that as the single source, transforming perhaps with scripts to generate the alternate representations for conneg. As far as I'm aware we don't yet have an easy templating engine for RDFa, so I suspect having that as the source is probably a good choice for typical Web applications. As mentioned already, GRDDL is available for transforming on the fly, though I'm not sure of the level of client engine support at present. Ditto providing a SPARQL endpoint is another way of maximising the surface area of the data. 
But the key step has clearly been taken, that decision to publish data directly without needing the human element to interpret it. I claim *win* for the Semantic Web, even if it'll still be a few years before we see applications exploiting it in a way that provides real benefit for the end user. my 2 cents. Cheers, Danny.
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Toby, Toby A Inkster wrote: On 25 Jun 2009, at 21:18, Pat Hayes wrote: If [RDF] requires people to tinker with files with names starting with a dot [...] then the entire SWeb architecture is fundamentally broken. RDF doesn't. Apache does. [ . . . ] Step 5a - this rewrites http://example.org/semanticweb to http://example.org/semanticweb.rdf. Other than aesthetics, there's no real reason to do this. Yes, I've read timbl's old Cool URIs document, and understand about not wanting to include hints of file format in a URI. But realistically, this file is always going to include some RDF - perhaps in a non-RDF/XML serialisation, but I don't see anything inappropriate about serving other RDF serialisations using a .rdf URL, provided the correct MIME type is used. Yes - while it breaks my heart, we will use URIs including the .rdf extension in the future. Comparing benefits and trouble caused, it is not worth pushing it. Step 5b - the default Apache mime.types file knows about application/rdf+xml, so this should be unnecessary. Perhaps instead have a GoodRelations validator which checks that the content type is correct, and only suggests this step when it is found to be otherwise. Well, our experience is that about 30% of the servers don't use the proper mime type by default, which causes trouble with many semweb applications. Steps 3 and 4 could be amalgamated into a single validate your RDF file step using the aforementioned validator. The validator would be written so that, upon a successful validation, it offers single-click options to ping semweb search engines, and Yahoo (via an RDF/XML-to-DataRSS converter). With those adjustments, the recipe would just be: 1. Upload your RDF file. 2. Add a rel=meta link to it. 3. Validate using our helpful tool. Yes, that would be a good option. But actually I am inclined to go for a more radical shift, which is offering just three alternative publication mechanisms: a) download an RDF/XML or N3 file (for expert users); b) download an RDFa snippet that just represents the RDF/XML content (i.e. such that it does not have to be consolidated with the presentation-level part of the Web page); c) have us publish it on our servers (this will require some techniques for validating users, and update / refresh - requires some more thought). Best Martin -- -- martin hepp e-business web science research group universitaet der bundeswehr muenchen e-mail: mh...@computer.org phone: +49-(0)89-6004-4217 fax: +49-(0)89-6004-4620 www: http://www.unibw.de/ebusiness/ (group) http://www.heppnetz.de/ (personal) skype: mfhepp twitter: mfhepp Check out the GoodRelations vocabulary for E-Commerce on the Web of Data! 
Webcast: http://www.heppnetz.de/projects/goodrelations/webcast/ Talk at the Semantic Technology Conference 2009: Semantic Web-based E-Commerce: The GoodRelations Ontology http://tinyurl.com/semtech-hepp Tool for registering your business: http://www.ebusiness-unibw.org/tools/goodrelations-annotator/ Overview article on Semantic Universe: http://tinyurl.com/goodrelations-universe Project page and resources for developers: http://purl.org/goodrelations/ Tutorial materials: Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Jun 26, 2009, at 3:03 AM, Toby A Inkster wrote: On 25 Jun 2009, at 21:18, Pat Hayes wrote: If [RDF] requires people to tinker with files with names starting with a dot [...] then the entire SWeb architecture is fundamentally broken. RDF doesn't. Apache does. I should have said, if the process of getting RDF published requires people... Pat [ . . . ] IHMC, 40 South Alcaniz St., Pensacola, FL 32502. (850)434 8903 or (650)494 3973; (850)202 4416 office; (850)202 4440 fax; (850)291 0667 mobile. phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Melvin Carvalho wrote: On Thu, Jun 25, 2009 at 6:44 PM, Martin Hepp (UniBW)martin.h...@ebusiness-unibw.org wrote: [ . . . ] Inclined to agree that RDFa is probably the ideal entry point for bringing existing businesses onto GoodRelations. For a read/write web (which is the goal of commerce, right?), you're probably back to .htaccess, though, with, say, a controller that will manage POSTed SPARUL inserts. I think taking it one step at a time, in this way, seems a sensible approach, though as a community we'll need to put a bit of weight behind getting the RDFa toolset up to the state of the art. .htaccess is a sad and unnecessary technical detail that assumes we have an Apache mono-culture, and that said mono-culture is immutable. For GoodRelations-based product, services, and offerings descriptions, the workflow should be as follows: 1. Describe your products and services using terms from GR (ontology-bound annotators help here, irrespective of source and location); 2. Get an HTML doc as output from #1 (with embedded RDFa for the product and services description data); 3. Optionally, publish the doc from #2 to your public Web Server; 4. Optionally, notify the broader Web via pinger services (PTSW, Sindice, etc.). 
If you couldn't publish docs to your Web Server before you encountered GoodRelations, RDFa, and Linked Data, then we are dealing with a totally different matter, one that isn't specific to Linked Data deployment. Martin: I think having a third party relay inaccurate opening and closing hours is a feature re. the GoodRelations, RDFa, Linked Data, and pinger services combo; it makes the opportunity cost of not putting the RDFa-embellished HTML doc (from #3) on the server palpable :-) Thus, we end up with a closed loop that simply lets the Web do the REST (including social and political cajoling re. doc publishing). Kingsley [ . . . ]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Martin, b) download RDFa snippet that just represents the RDF/XML content (i.e. such that it does not have to be consolidated with the presentation-level part of the Web page). By coincidence, I just read this: Hidden div's -- don't do it! It can be tempting to add all the content relevant for a rich snippet in one place on the page, mark it up, and then hide the entire block of text using CSS or other techniques. Don't do this! Mark up the content where it already exists. Google will not show content from hidden div's in Rich Snippets, and worse, this can be considered cloaking by Google's spam detection systems. [1] Regards, Mark [1] http://knol.google.com/k/google-rich-snippets/google-rich-snippets/32la2chf8l79m/1# -- Mark Birbeck, webBackplane mark.birb...@webbackplane.com http://webBackplane.com/mark-birbeck webBackplane is a trading name of Backplane Ltd. (company number 05972288, registered office: 2nd Floor, 69/85 Tabernacle Street, London, EC2A 4RR)
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Kingsley- On Fri, Jun 26, 2009 at 11:40 AM, Kingsley Idehenkide...@openlinksw.com wrote: Mark: Should we be describing our docs for Google, fundamentally? I really think Google should actually recalibrate back to the Web etc.. The correct question to ask, and the one that I believe Mark is addressing, is should we be asking people to describe their content in a way that may be at cross-purposes to their efforts to monetize it? Bradley P. Allen http://bradleypallen.org +1 310 951 4300
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Martin, On 25.06.2009, at 17:44, Martin Hepp (UniBW) wrote: Hi all: After about two months of helping people generate RDF/XML metadata for their businesses using the GoodRelations annotator [1], I have quite some evidence that the current best practices of using .htaccess are a MAJOR bottleneck for the adoption of Semantic Web technology. Just some data: - We have several hundred entries in the annotator log - most people spend 10 or more minutes to create a reasonable description of themselves. - Even though they all operate some sort of Web sites, less than 30% of them manage to upload/publish a single *.rdf file in their root directory. - Of those 30%, only a fraction manage to set up content negotiation properly, even though we provide a step-by-step recipe. These are interesting statistics; maybe you want to blog about them or publish them in some other way? [ . . . ] For the cases where people still want to serve RDF documents, it would be neat if various CMSes had a simple way of handling content negotiation. What I'm thinking of is e.g. a module for Drupal which would allow the Drupal admin to specify that, if RDF/XML for node X (a page) is requested, RDF document Y is served. The content negotiation would be handled by PHP code in the module, hence no fiddling with .htaccess required. As a consequence, we will modify our tool so that it generates dummy RDFa code with span/div that *just* represents the meta-data without interfering with the presentation layer. That can then be inserted as code snippets via copy-and-paste to any XHTML document. I like it! It's similar to what our Shift tool [2] does for other kinds of data. However, this might lead to other problems: many CMSes only allow a subset of HTML in their input forms, so some of the RDFa could get lost. I remember this was a problem with Blogger in the past (not sure if this problem persists). Cheers, Knud [2] http://kantenwerk.org/shift/ Any opinions? Best Martin [1] http://www.ebusiness-unibw.org/tools/goodrelations-annotator/ [ . . . ]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Jun 25, 2009, at 11:44 AM, Martin Hepp (UniBW) wrote: Hi all: After about two months of helping people generate RDF/XML metadata for their businesses using the GoodRelations annotator [1], I have quite some evidence that the current best practices of using .htaccess are a MAJOR bottleneck for the adoption of Semantic Web technology. I agree, and raised this issue with the W3C TAG some time ago. It was apparently not taken seriously. The general consensus seemed to be that any normal adult should be competent to manipulate an Apache server. My own company, however, refuses to allow its employees to have access to .htaccess files, and I am therefore quite unable to conform to the current best practice from my own work situation. I believe that this situation is not uncommon. Pat Hayes [ . . . ] [1] http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
So... then from what I understand... why bother with content negotiation, right? Just do everything in RDFa, right?

We are planning to deploy the linked data version of Turn2Live.com soon, and we are discussing the content negotiation (a la BBC). But if we can KISS, then all we should do is RDFa, right?

Juan Sequeda, Ph.D Student
Dept. of Computer Sciences
The University of Texas at Austin
www.juansequeda.com
www.semanticwebaustin.org

On Thu, Jun 25, 2009 at 7:29 PM, Pat Hayes pha...@ihmc.us wrote: [SNIP: Pat's message and Martin's original posting, quoted in full above]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Juan Sequeda wrote: So... then from what I understand... why bother with content negotiation, right?

No, it means content negotiation is an option, albeit a tough one when .htaccess and Apache are ground zero.

Just do everything in RDFa, right?

Of course, if it works for your circumstances :-)

Basically, we need to tweak the Linked Data Best Practices guides and general messaging by adding RDFa to the conversation -- as an *option* for Linked Data Deployment. I believe I expressed this sentiment a while back.

Kingsley

[SNIP: remainder of Juan's message and the full quote of Pat's and Martin's messages above]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Jeff Finkelstein, Customer Paradigm wrote: Martin - I agree that the .htaccess file is a big stumbling block for many people with low-cost hosting. Would a lightweight PHP-based application that could write to the .htaccess / create the RDF file work to solve this easily?

Sorry, it won't. The issue is actual access to the .htaccess file. Thus, you have to move the metadata expressed in RDF into the (X)HTML docs that are being published based on the existing .htaccess config. Even when the above is done, you will need RDFa processors within user agents (or standalone) for the Linked Data deployment to fully materialize.

Kingsley

[SNIP: remainder of Jeff's quoted message, including his signature and Martin's original posting]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Giovanni Tummarello wrote: That can then be inserted as code snippets via copy-and-paste to any XHTML document. Any opinions? Great, why bother with any other solution? Even talking about any other solution is extraordinarily bad for the public perception of the semantic web community. Giovanni

Giovanni,

We don't need mutual exclusivity re. Linked Data deployment. There's nothing wrong with an array of options that cover a broad range of Linked Data deployment circumstances. HTTP is the essence of the Web (what makes it what it is), and Content Negotiation is intrinsic to HTTP. Don't throw the baby out with the bathwater, really.

--
Regards,
Kingsley Idehen
Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO OpenLink Software
Web: http://www.openlinksw.com
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Just confirming. I really want to start getting things done!

On Thu, Jun 25, 2009 at 8:08 PM, Kingsley Idehen kide...@openlinksw.com wrote: Basically, we need to tweak the Linked Data Best Practices guides and general messaging by adding RDFa to the conversation -- as an *option* for Linked Data Deployment. I believe I expressed this sentiment a while back.

I agree. I think I had this discussion with Peter Mika and Tom Heath before. Don't take me literally, but the conclusion was that RDFa is Linked Data once it shows up in the best practices and people know how to do it. But oh my... it's already here: http://ld2sd.deri.org/lod-ng-tutorial/

Thanks Michael and Richard!

[SNIP: full quote of Kingsley's, Pat's, and Martin's messages above]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On 25.06.2009, at 19:11, Kingsley Idehen wrote: Sorry, it won't. The issue is actual access to the .htaccess file. Thus, you have to move the metadata expressed in RDF into the (X)HTML docs that are being published based on the existing .htaccess config.

What I meant in my earlier mail is that we can have content negotiation even without manipulating .htaccess (as far as I understand, content negotiation through .htaccess is imperfect anyway). It can be done in code, e.g. in PHP. The SW Dog Food site [1] uses a third-party PHP class [2] for this; I think Neologism [3] uses the same. Of course, a solution based on this would still require being able to upload files to the server.

Knud

[1] http://data.semanticweb.org
[2] http://ptlis.net/source/php-content-negotiation/
[3] http://neologism.deri.ie

[SNIP: remainder of Kingsley's message and the full quote of Martin's original posting]
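[Editor's illustration: a toy version of the kind of in-code negotiation Knud means. This is not the ptlis class; it ignores wildcards such as */* and other Accept-header subtleties, and every name in it is hypothetical.]

    <?php
    // Hypothetical helper: choose between the two representations we can
    // serve, honouring q-values in the Accept header. Wildcards and
    // unsupported types are deliberately ignored to keep the sketch short.
    function preferred_type($accept) {
        $scores = array('application/rdf+xml' => 0.0, 'text/html' => 0.0);
        foreach (explode(',', $accept) as $range) {
            $params = array_map('trim', explode(';', $range));
            $type = array_shift($params);   // media type before any parameters
            $q = 1.0;                       // default quality per RFC 2616
            foreach ($params as $p) {
                if (strpos($p, 'q=') === 0) { $q = (float) substr($p, 2); }
            }
            if (isset($scores[$type])) {
                $scores[$type] = max($scores[$type], $q);
            }
        }
        arsort($scores);       // highest q-value first, keys preserved
        return key($scores);   // the best type we can actually serve
    }
    ?>

[A real implementation, like the third-party class cited above, would also need to handle wildcards and ties.]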
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Juan Sequeda wrote: Just confirming. I really want to start getting things done!

So get going :-)

[SNIP]

I agree. I think I had this discussion with Peter Mika and Tom Heath before. Don't take me literally, but the conclusion was that RDFa is Linked Data once it shows up in the best practices and people know how to do it. But oh my... it's already here: http://ld2sd.deri.org/lod-ng-tutorial/

Yep, so you found the nugget :-)

Kingsley

[SNIP: full quote of the earlier messages in this thread]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Tom, Is there a place in the ESW wiki where people can find these simple tools/scripts to do the rewriting, like the one you did and [1]? I'm sure there must be others. This would be a good resource to have!

[1] http://ptlis.net/source/php-content-negotiation/

The easiest pattern I've found is to use a RewriteRule to catch all incoming requests and pass them through a small PHP script that examines the Accept header and sends back 303s (or 200s) as appropriate. The code is about 6 lines; I'll publish it somewhere if I didn't already.
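[Editor's illustration: a minimal sketch of the pattern Tom describes. The file names, paths, and URIs are hypothetical, and this is not Tom's actual script. A catch-all rule in .htaccess hands resource requests to a dispatcher:]

    RewriteEngine On
    RewriteRule ^id/(.*)$ conneg.php?res=$1 [L,QSA]

[and a short PHP script 303-redirects based on the Accept header:]

    <?php
    // conneg.php -- hypothetical dispatcher for /id/* resource URIs
    $res = basename($_GET['res']);  // crude sanitisation of the resource name
    $accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
    header('HTTP/1.1 303 See Other');
    if (strpos($accept, 'application/rdf+xml') !== false) {
        header("Location: /data/$res.rdf");   // machine-readable description
    } else {
        header("Location: /page/$res.html");  // human-readable page
    }
    ?>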
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Juan, Not AFAICT, but feel free to search and create one if not. Wherever it lives, this would be a great resource to link to from linkeddata.org.

Cheers, Tom.

2009/6/25 Juan Sequeda juanfeder...@gmail.com: [SNIP: quote of Juan's message above]

--
Dr Tom Heath
Researcher, Platform Division
Talis Information Ltd
T: 0870 400 5000
W: http://www.talis.com/
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
As mostly, recently ;-), I agree with Kingsley - I did not want to say that proper usage of HTTP is bad or obsolete. But it turned out to be unfeasible for broad adoption by owners of small Web sites. For huge data sources and for vocabularies, the current recipes are fine. But I want every single business in the world to use GoodRelations for publishing at least their opening hours - 19 million companies in Europe alone. I cannot explain to every single one of them how to configure their server.

Another thing that might have been lost in the discussion: even though we knew the recipes, helping the site owners was difficult, because we experienced hundreds of different environments - preexisting .htaccess files, MS IIS, hoster-specific scenarios, etc. So the problem is really that such a low-level technique is not feasible if you face so much diversity in the target systems.

Maybe some day a certain LOD/SW package will be installed by default on most servers. But we cannot wait till then.

BTW: We did not even require the full beauty of the LOD best practices. We simply want them to do as described here: http://www.ebusiness-unibw.org/wiki/GoodRelations_Recipe_8

Best
Martin

Kingsley Idehen wrote: [SNIP: exchange with Giovanni, quoted in full above]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi John: We also thought of hosting the metadata for the users, but I don't like that, because I want the shop operators to feel ownership of the data: if the opening hours expressed in RDF are wrong but on the personal Web page of that restaurant, anybody facing closed doors will blame the restaurant. If the outdated opening hours in RDF are on my SW server, the unlucky customer will blame the Semantic Web for having crappy data. So maybe the snippet solution in RDFa is the best.

Best
Martin

John Graybeal wrote: This is a principal reason MMI decided to offer a vocabulary server for its community. The idea that 1000 different providers would all develop a level of web competency (for which there is evidence at only a minority of providers) for serving their RDF and OWL content -- let alone the capability to do versioning, adopt best practices, learn SKOS, and whatever other nuances are called for -- seemed like a non-starter. This is not exactly the same problem you're facing, but something to consider (if the model allows it) is creating a way to serve the annotations from another place than the host institution. The institution can refer to those served files from their own sites, and even update them remotely, but not have to incur all the management overhead as standards improve, files change, authorship changes, etc. (Which is not to disagree with your plan either. That sounds fine.) One other delivery model could be for them to give you an existing HTML page, and you give them back the modified HTML (saves them cutting and pasting steps?). I'm a little ignorant of your tools and processes, so apologies if these are non-starters. John

On Jun 25, 2009, at 9:44 AM, Martin Hepp (UniBW) wrote: [SNIP: Martin's original posting, quoted in full above]
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Tom, all,

On 25/06/2009, at 20:30, Tom Heath wrote: Are you referring to the best practices at [1]? Unfortunately the recipes in that document that use .htaccess and mod_rewrite for conneg no longer count as best practices, precisely because mod_rewrite and .htaccess are not adequate for the conneg/303-redirects pattern. This has been a known issue since WWW2007 at least, and documented at [2] in July 2007. As far as I know, that recipes document hasn't yet been updated/deprecated :( (please someone correct me if I'm wrong).

A revision of the Recipes is coming down the pipe. The new version will slightly improve how this issue is tackled, but it won't provide a complete solution. I believe the document is fair wrt this point: it acknowledges the problem and provides best-effort recipes that partially implement conneg using exclusively Apache directives (as opposed to implementing it in code, e.g. in PHP).

I would like to emphasize that there are two different issues here: 1) the Recipes do not implement conneg correctly in its full extent, and 2) the Recipes are difficult to deploy because of the inability to access the .htaccess file. From my POV, both issues are serious, and consequently I also think that RDFa is a much more (1) correct and (2) user-friendly way to publish RDF data.

Best,
Diego.

The easiest pattern I've found is to use a RewriteRule to catch all incoming requests and pass them through a small PHP script that examines the Accept header and sends back 303s (or 200s) as appropriate. The code is about 6 lines; I'll publish it somewhere if I didn't already. Admittedly, this doesn't solve the problem of access to .htaccess files. This bottleneck sounds to me like someone circa mid-1990s saying my sysadmins won't let me have access to space on the web server. I guess we need to use lessons learned from that era to address the problems of this one. Anyway, fancy doing a Linked Data for Sysadmins tutorial at a sysadmin conference?

Cheers, Tom.

[1] http://www.w3.org/TR/swbp-vocab-pub/
[2] http://lists.w3.org/Archives/Public/public-swbp-wg/2007Jul/0001.html

--
Diego Berrueta
R&D Department - CTIC Foundation
E-mail: diego.berru...@fundacionctic.org
Phone: +34 984 29 12 12
Parque Científico Tecnológico Gijón-Asturias-Spain
www.fundacionctic.org
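[Editor's illustration: the Apache-directives-only approach Diego mentions might look roughly like this. A hypothetical sketch, not taken from the Recipes document:]

    # Hypothetical .htaccess sketch: 303-redirect a vocabulary URI based on
    # the Accept header, using mod_rewrite directives alone.
    RewriteEngine On
    RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
    RewriteRule ^vocab$ /vocab.rdf [R=303,L]
    RewriteRule ^vocab$ /vocab.html [R=303,L]

[The naive substring match on the Accept header, with no q-value handling, is exactly the kind of shortcoming that makes such recipes best-effort rather than complete.]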
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Martin Hepp (UniBW) wrote: [SNIP: Martin's message, quoted in full above]

Martin,

So for this particular Linked Data deployment and consumption scenario, we are going to end up with the following:

1. Content creators: describe your stuff (you, your offerings, your needs, etc.) using RDFa within (X)HTML, and use your current publishing workflow to publish your RDFa-embellished docs.
2. User agents (browsers, crawlers, etc.): get RDFa-enabled (i.e., become an RDFa processor) so you can do something with Linked Data for Web users that enriches their overall experience, courtesy of #1 above.

Re. Virtuoso, you've always been able to SPARQL against any (X)HTML resource using Virtuoso's SPARQL processor (which leverages the Sponger middleware and its RDFa Cartridge); simply use the RDFa-embellished document's URL as the Named Graph IRI in your SPARQL query.

Examples:
1. http://linkeddata.uriburner.com/sparql
2. http://lod2.openlinksw.com/sparql
3. http://bbc.openlinksw.com/sparql
4. Any other Virtuoso instance that enables Sponging.

Kingsley

[SNIP: earlier exchange with Giovanni, quoted in full above]
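[Editor's illustration: a query in the style Kingsley describes, assuming the Virtuoso instance has already sponged the page; the page URL is a hypothetical placeholder.]

    # Hypothetical query: the RDFa page's own URL serves as the named graph IRI
    SELECT ?s ?p ?o
    FROM NAMED <http://example.com/shop.html>
    WHERE { GRAPH <http://example.com/shop.html> { ?s ?p ?o } }
    LIMIT 20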
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Jeremy/Pat,

On Thu, Jun 25, 2009 at 10:16 PM, Jeremy Carroll jer...@topquadrant.com wrote: Pat Hayes wrote: RDF should be text, in documents. One should be able to use it without knowing about anything more than the RDF spec and the XML spec. If it requires people to tinker with files with names starting with a dot, or write code, or deploy scripts, then the entire SWeb architecture is fundamentally broken. Largely agreeing with you Pat, I think I would want to go a step further and say that you should be able to use RDF without knowing anything about the RDF spec or the XML spec, or any other spec. Web users are not required to read the specs. Using RDF includes publishing it. The infrastructure, whatever that is, should achieve the ability to publish my data in an appropriate way.

I guess what you're getting at here is the more general point about tools hiding RDF, but I'd like to add some comments on HTML more specifically.

HTML publishing really took off when it became easy for anyone to publish. Originally that was word processors that converted their output to HTML, but you still had to deploy your document to a server. Then it was CMS systems that could be used as easily by small businesses and schools as by large corporations, but you still needed to have a server. Then it became blogs, where someone else did the server install and you just typed in the content. Then it was wikis -- ditto. Now it's Google Docs, Facebook pages, Tweets, and more.

In other words, publishing HTML just gets easier and easier, and that's the infrastructure that's important -- the HTML publishing infrastructure. So to publish RDF, we should simply be leveraging that enormous infrastructure. This theme was one of the major motivations for the creation of RDFa (née RDF/XHTML), and I would say that it's an even more important theme today; so much so that I made it the core of a presentation I did at SemTech last week, 'RDFa: The Semantic Web's Missing Link' [1].

Regards,
Mark

[1] http://webbackplane.com/mark-birbeck/blog/2009/06/slides-for-semtech2009-talk-on-rdfa

--
Mark Birbeck, webBackplane
mark.birb...@webbackplane.com
http://webBackplane.com/mark-birbeck
webBackplane is a trading name of Backplane Ltd. (company number 05972288, registered office: 2nd Floor, 69/85 Tabernacle Street, London, EC2A 4RR)
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Hi Kingsley,

If you are comfortable producing (X)HTML documents, then simply use RDFa and terms from relevant vocabularies to describe yourself, your needs, your offerings, and other things, clearly. Once you've done that, simply leave the Web to do the REST :-) Everything else is a technical detail (imho).

But that isn't the discussion we're having, IMHO. We're not talking about how you or I might do it -- people comfortable with .htaccess files, server configuration, and so on. My understanding of the discussion that was going on is that whilst we all want to see the semantic web succeed (even if we all have a different view of what the semantic web is), we're asking how exactly it is that we can achieve it. And for years, the solutions proposed have been somewhat mysterious: RDF/XML, SPARQL end-points, N3, content negotiation, 303s, and so on.

You have to ask yourself at some point: do we want the data, or don't we -- do we want people to publish stuff that we 'semwebbers' can use? And if we do want it, then let's help them publish it.

I may be biased because I've had my nose pressed up against it for too many years, but I believe that in this regard, RDFa is a game-changer. It's not GRDDL, which says 'publish whatever the hell you like and we'll convert it'. It's not microformats, which says 'here are a handful of centralised vocabularies, for use on a decentralised web'. And it's not RDF/XML, which requires you to take apart your server and put it back together again. It's HTML. And everyone knows at least one way to publish HTML, don't they?

In the years that I've been involved with the RDFa work, the mental model I have always had is of someone using Blogger or Drupal or something just as simple to publish RDF. That's now possible with RDFa, and what's even more exciting, Yahoo! and Google will pick it up.

I realise I'm sounding like an evangelist (no doubt because I am one). :) But my suggestion would be that we have a window of opportunity here, to create a semantic infrastructure that is indistinguishable from the web itself; the more metadata we can get into HTML-space, the more likely we are to bring about a more 'semantic' web... before anyone notices. ;)

Regards,
Mark
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
Just because it's on your server doesn't mean the visitor to the restaurant's web page has to know that. (Does it?) Hmm, maybe that takes us back to the .htaccess argument.

I agree the shop owner has to feel ownership. So whatever solution you choose, the shop owner has to have access to the tool which enables its easy use, in their language and context. I mention this because I don't know if the snippet solution will pass that test. It will be cool if it does. (Please let us know how it turns out; you are the cutting-edge research here! I find what you are doing very exciting.)

John

On Jun 25, 2009, at 12:26 PM, Martin Hepp (UniBW) wrote: [SNIP: Martin's and John's earlier messages, quoted in full above]