I guess I asked the question wrong. The linked open data project currently identifies a specific set of data resources that are linked together, so this "entity" is definable. I didn't mean to ask how big the whole Semantic Web is; I meant how many triples are in this particular group, the set described on http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData

I've been able to download pictures of this graph every few months or so, and you can see the number of datasets growing. But the last published number of triples for the thing (as stated on that page) is from over a year ago, and a whole bunch of stuff has been added since, and some of the datasets have grown a lot. So we have a publicly shared, large-scale RDF data resource that can be used for benchmarking, trying different interfaces and new technologies, etc. It would be really nice to get a number every now and then so we could plot growth, better explain to people what is in it, etc.

I know, I know, I know all the technical reasons this is relatively meaningless, but I gotta tell you, when I hear someone say "20 billion triples," it causes people to pay attention. The problem is that I would like to use a number that has some validity before I start quoting it...

On Nov 20, 2008, at 5:12 AM, Michael Hausenblas wrote:

My 2c in order to capture this for others as well:

http://community.linkeddata.org/MediaWiki/index.php?HowBigIsTheDangedThing

Cheers,
        Michael

----------------------------------------------------------
Dr. Michael Hausenblas
DERI - Digital Enterprise Research Institute
National University of Ireland, Lower Dangan,
Galway, Ireland
----------------------------------------------------------

Jim Hendler wrote:
So I've been to a number of talks lately where the size of the current (Sept 08 diagram) Linked Open Data cloud, in triples, has been stated, with numbers that vary quite widely. The esw wiki says 2B triples as of 2007, which isn't very useful given the growth we've seen in the past year. I've also seen the various blog posts and mail threads saying why we shouldn't cite meaningless numbers and such, but frankly, I've recently been on a bunch of panels with DB guys, and I'd love to have a reasonable number to quote. Anyone have a good estimate of the size of the danged thing? (Number of triples in the whole as an RDF graph would be nice.) It would also be nice for general audiences, where big numbers tend to impress, and for research purposes (for example, we know how far we can compress the triples for an in-memory approach we are playing with, but we want to figure out how much memory we need for the whole cloud; we want to know if we need to shell out for the 16G iphone). Anyway, if anyone has a decent estimate, or even a smart educated guess, I'd love to hear it.
JH
"If we knew what we were doing, it wouldn't be called research, would it?" - Albert Einstein
Prof James Hendler                http://www.cs.rpi.edu/~hendler
Tetherless World Constellation Chair
Computer Science Dept
Rensselaer Polytechnic Institute, Troy NY 12180
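
On the memory question in the message above (how much memory the whole cloud would need once the triples are compressed), a back-of-envelope sketch may help. Only the 2B triple count (the 2007 wiki figure) and the 16G capacity come from the thread; the bytes-per-triple values and the function names below are hypothetical placeholders, not measured compression results.

```python
GB = 10**9  # decimal gigabyte; an assumption about how "16G" is counted


def estimate_memory_bytes(num_triples: int, bytes_per_triple: float) -> float:
    """Estimated in-memory size: triple count times average bytes per triple."""
    if num_triples < 0 or bytes_per_triple <= 0:
        raise ValueError("num_triples must be >= 0 and bytes_per_triple > 0")
    return num_triples * bytes_per_triple


def fits(num_triples: int, bytes_per_triple: float, capacity_bytes: float) -> bool:
    """True if the estimated size does not exceed the given capacity."""
    return estimate_memory_bytes(num_triples, bytes_per_triple) <= capacity_bytes


if __name__ == "__main__":
    triples = 2_000_000_000   # "2B triples as of 2007" from the wiki
    capacity = 16 * GB        # the "16G" device mentioned in the thread
    # Hypothetical average compressed sizes per triple, for illustration only.
    for bytes_per_triple in (4, 8, 12):
        size_gb = estimate_memory_bytes(triples, bytes_per_triple) / GB
        verdict = "fits" if fits(triples, bytes_per_triple, capacity) else "does not fit"
        print(f"{bytes_per_triple} bytes/triple: {size_gb:.1f} GB, {verdict} in 16 GB")
```

The estimate scales linearly with the triple count, so any updated count for the cloud can be substituted for the 2007 figure without other changes.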