Alternatively, you could put that data in an RDF store and just serve up the fragments using a wrapped CONSTRUCT query.

That's what we do for qdos.com, e.g.
  http://qdos.com/user/Steve-Harris/18b6f60b41e05aaa418565ebfe901d6b/rdfxml
and it's pretty efficient, more efficient than storing 1000 separate files as XML.

The downside is that the RDF is not very pretty to look at, but it could be with a better RDF/XML serialiser.
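
A rough sketch of what such a wrapped CONSTRUCT query could look like, in Python. This is not the qdos.com implementation: the function name construct_query_for is hypothetical, and it assumes the fragment for a resource is simply every triple with that resource as its subject. The resulting string would be sent to whatever SPARQL endpoint sits in front of the store.

```python
import re

# Characters that may not appear inside a SPARQL IRI reference (<...>).
_FORBIDDEN_IN_IRI = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def construct_query_for(subject_uri: str) -> str:
    """Build a CONSTRUCT query for all triples whose subject is subject_uri.

    Hypothetical helper for illustration only; the name and the
    "subject-only" notion of a fragment are assumptions.
    """
    # Refuse URIs that could break out of the <...> and alter the query.
    if _FORBIDDEN_IN_IRI.search(subject_uri):
        raise ValueError("not a valid IRI: %r" % subject_uri)
    return (
        "CONSTRUCT { <%s> ?p ?o } WHERE { <%s> ?p ?o }"
        % (subject_uri, subject_uri)
    )


if __name__ == "__main__":
    print(construct_query_for(
        "http://openean.kaufkauf.net/id/EanUpc_0001067792600"))
```

The web front end would then only need to map each requested URI to one such query and return the serialised result.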

- Steve

On 20 May 2009, at 14:59, Martin Hepp (UniBW) wrote:

Hi Steve,
as I replied to Libby (but did not include all mailing lists): The whole data set is currently served from 100 smaller files, which will be broken down into 1000 files shortly. For various reasons, however, we don't want to serve one file per element, because that would create a huge overhead: the individual data sets are rather small (a few triples per item), and one million micro-files would be hard to manage. Also, since we want to stay within OWL DL, we would have to duplicate the proper ontology header metadata a million times.

Thus, we use a (rather large) set of rules in the .htaccess file to serve the part of the data set that contains the element you are actually looking for. You will receive a few more triples than you need, but you can simply discard those ;-)
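
On the client side, discarding the extra triples could be as simple as the following Python sketch. It assumes the returned file has already been parsed into (subject, predicate, object) tuples by an RDF library of your choice; the function name triples_about is hypothetical, and the second EAN and the example.org predicate in the demo are made up for illustration.

```python
def triples_about(triples, subject_uri):
    """Keep only the triples whose subject is subject_uri.

    triples is any iterable of (subject, predicate, object) tuples.
    Triples about other items in the same file are dropped.
    """
    return [t for t in triples if t[0] == subject_uri]


if __name__ == "__main__":
    wanted = "http://openean.kaufkauf.net/id/EanUpc_0001067792600"
    # Made-up sample data: the second subject and the predicate are
    # placeholders, not taken from the real data set.
    other = "http://openean.kaufkauf.net/id/EanUpc_0000000000000"
    name = "http://example.org/name"
    parsed = [
        (wanted, name, "item A"),
        (other, name, "item B"),
    ]
    for triple in triples_about(parsed, wanted):
        print(triple)
```

Note that this keeps only triples with the requested URI as subject, so ontology header triples and anything hanging off blank nodes would need separate handling.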

Martin

Steve Harris wrote:
Very cool resource.

On 20 May 2009, at 10:18, Libby Miller wrote:
Individual commodity descriptions can be retrieved as follows:

http://openean.kaufkauf.net/id/EanUpc_<UPC/EAN>

Example:

http://openean.kaufkauf.net/id/EanUpc_0001067792600

This seems to give me multiple product descriptions - am I misunderstanding?

Yeah, looks like it returns the entire document that the particular EAN appears in.

Not very linked data friendly (you'll end up with a large proportion of repeated triples in identical graphs, with different graph URIs), but certainly better than nothing.

- Steve


--
--------------------------------------------------------------
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail: mh...@computer.org
phone:  +49-(0)89-6004-4217
fax:    +49-(0)89-6004-4620
www:    http://www.unibw.de/ebusiness/ (group)
        http://www.heppnetz.de/ (personal)
skype:  mfhepp

Check out the GoodRelations vocabulary for E-Commerce on the Web of Data!
========================================================================

Webcast explaining the Web of Data for E-Commerce:
-------------------------------------------------
http://www.heppnetz.de/projects/goodrelations/webcast/

Tool for registering your business:
----------------------------------
http://www.ebusiness-unibw.org/tools/goodrelations-annotator/

Overview article on Semantic Universe:
-------------------------------------
http://www.semanticuniverse.com/articles-semantic-web-based-e-commerce-webmasters-get-ready.html

Project page and resources for developers:
-----------------------------------------
http://purl.org/goodrelations/

Upcoming events:
---------------
Full-day tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey

http://www.eswc2009.org/program-menu/tutorials/70

Talk at the Semantic Technology Conference 2009: Semantic Web-based E-Commerce: The GoodRelations Ontology

http://www.semantic-conference.com/session/1881/


--
Steve Harris
Garlik Limited, 2 Sheen Road, Richmond, TW9 1AE, UK
+44(0)20 8973 2465  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD

