Re: New LOD Cloud - Please send us links to missing data sources

2009-03-05 Thread Anja Jentzsch

Hi all,

thanks for all your input.

The LOD Cloud as of March 2009 is final and online.

You can find it at
http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
along with a version colored by topic and various other formats.


I will update the dataset table and put a linkage table on the dataset
page later today. It would be extremely useful to keep these tables up
to date.


Anja

Anja Jentzsch wrote:

Hi all,

we are currently updating the LOD cloud. Find the draft here: 
http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png


We have already added:

1. the RKBExplorer cloud
2. the Bio2RDF cloud
3. the LODD cloud
4. GeoSpecies
5. LIBRIS

Statistics on triple and link count (as well as target sources) are 
missing for the following sources:

- Faviki
- RDFohloh
- OpenCalais
- LIBRIS

Did we forget any new data sources or links between data sources?

Keep in mind: A data source qualifies for the cloud if the data is
available via dereferenceable URIs and if the data source is interlinked
with at least one other source (meaning it references URIs within the
namespace of the other source).
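
The two criteria above can be sketched as a small check. This is only an illustration, assuming a data source is available as plain (subject, predicate, object) string tuples; the function names, the example.org namespace and the sample triples are hypothetical, and whether the URIs actually dereference is passed in as a flag rather than tested.

```python
def outbound_links(triples, own_namespace):
    """Collect object URIs that point outside the source's own namespace."""
    return {
        obj
        for _subject, _predicate, obj in triples
        if obj.startswith(("http://", "https://"))
        and not obj.startswith(own_namespace)
    }


def qualifies_for_cloud(triples, own_namespace, is_dereferenceable):
    """Apply both criteria: dereferenceable URIs and at least one outbound link."""
    return is_dereferenceable and len(outbound_links(triples, own_namespace)) > 0


# Hypothetical sample data, for illustration only.
OWN_NS = "http://example.org/people/"
SAMPLE_TRIPLES = [
    (OWN_NS + "alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    (OWN_NS + "alice", "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Alice"),
]

if __name__ == "__main__":
    print(qualifies_for_cloud(SAMPLE_TRIPLES, OWN_NS, is_dereferenceable=True))
```

Only object positions are inspected here, so a source that merely reuses another vocabulary's predicates would not count as interlinked under this sketch.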


Anja





Re: AW: New LOD Cloud - Please send us links to missing data sources

2009-03-03 Thread mis...@garlik

Hello,

Having just seen the bigger and better LOD cloud [1], I was wondering
whether QDOS should have a link to DBpedia, as we emit owl:sameAs
relationships from celebs to DBpedia resources: see Barack Obama's
turtle file [2]. QDOS currently has around 100,000 people, about half
of which are linked to DBpedia resources.


Cheers,

Mischa

[1] http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png
[2] http://qdos.com/celeb/Barack-Obama/6050a1f1a818bf959f70afbd46273dea/turtle

On 28 Feb 2009, at 09:03, Chris Bizer wrote:


Hi Kingsley,


You have MySpace and Flickr Wrappers but still don't include all the
Virtuoso Sponger Cartridges (which are wrappers) to the Cloud [1] ?

The last time I checked the cartridges, I had the impression that they were
not very much interlinked with the rest of the LOD cloud and about half of
them were down.

The links to DBpedia were rather strange. For instance, the first link I
found owl:sameAs'ed a Yahoo Finance balance sheet with the DBpedia bed sheet,
which, even for me as a big supporter of owl:sameAs links, is a bit too much
of a semantic gap.

Did you improve the quality of the external links in the meantime and do the
cartridges regularly deliver data?

Another problem is that many of the sources are not really Open Data as
various license restrictions apply.

Cheers

Chris


-Original Message-
From: public-lod-requ...@w3.org [mailto:public-lod-requ...@w3.org] On Behalf Of Kingsley Idehen
Sent: Saturday, 28 February 2009 00:18
To: Anja Jentzsch
Cc: public-lod@w3.org
Subject: Re: New LOD Cloud - Please send us links to missing data sources


Anja Jentzsch wrote:

Hi all,

we are currently updating the LOD cloud. Find the draft here:
http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png

We have already added:

1. the RKBExplorer cloud
2. the Bio2RDF cloud
3. the LODD cloud
4. GeoSpecies
5. LIBRIS

Statistics on triple and link count (as well as target sources) are
missing for the following sources:
- Faviki
- RDFohloh
- OpenCalais
- LIBRIS

Did we forget any new data sources or links between data sources?

Keep in mind: A data source qualifies for the cloud if the data is
available via dereferenceable URIs and if the data source is
interlinked with at least one other source (meaning it references URIs
within the namespace of the other source).

Anja



Anja,

You have MySpace and Flickr Wrappers but still don't include all the
Virtuoso Sponger Cartridges (which are wrappers) to the Cloud [1] ?

Also, the LODD data sets page should be linked to:
http://esw.w3.org/topic/DataSetRDFDumps, so we can track down the dumps
with ease re. the Virtuoso LOD hosting instance.

links:

1. http://virtuoso.openlinksw.com/images/sponger-cloud.html

--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com









___
Mischa Tuffield
Email: mischa.tuffi...@garlik.com
Homepage - http://mmt.me.uk/
FOAF URI - http://mmt.me.uk/foaf.rdf#mischa





RE: AW: New LOD Cloud - Please send us links to missing data sources

2009-03-03 Thread John Goodwin

I think you should be - esp. as you also connect to the BBC linked data with 
stuff like:

<lastfm:favouriteArtist rdf:resource="http://www.bbc.co.uk/music/artists/e60e1f0b-1e8c-45e7-9d4a-222db9cb34f7#artist"/>

in the FOAF profiles you generate.

John



This email is only intended for the person to whom it is addressed and may 
contain confidential information. If you have received this email in error, 
please notify the sender and delete this email which must not be copied, 
distributed or disclosed to any other person.

Unless stated otherwise, the contents of this email are personal to the writer 
and do not represent the official view of Ordnance Survey. Nor can any contract 
be formed on Ordnance Survey's behalf via email. We reserve the right to 
monitor emails and attachments without prior notice.

Thank you for your cooperation.

Ordnance Survey
Romsey Road
Southampton SO16 4GU
Tel: 08456 050505
http://www.ordnancesurvey.co.uk




Re: AW: New LOD Cloud - Please send us links to missing data sources

2009-03-03 Thread Marko A. Rodriguez
Hi everyone,

I was looking at the linked data cloud visualization the other day when
Chris Bizer posted it to this mailing list - http://tinyurl.com/b4vfbq . I
started to gather all the structural communities that I found by eye into
different sets. If you are interested, here is a blog entry containing the
sets:

http://tinyurl.com/cr3xj5

Then I decided to do a graph analysis computationally. The graph analysis
article can be found here:

http://arxiv.org/abs/0903.0194

Someone on this mailing list mentioned how difficult it is to decipher the
connectivity of the PNG visualization. For me, it was very tedious and
painstaking to turn the visualization into a graph data structure. My
attempt at completeness is visualized on page 7 of the article.
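
As a rough illustration of what "turning the visualization into a graph data structure" can look like, here is a minimal sketch. It is not the method used in the article; the helper names are hypothetical and the edge list uses placeholder dataset names, not the real linkage of the cloud.

```python
from collections import defaultdict


def build_graph(edges):
    """Build a directed adjacency map from (source, target) pairs."""
    graph = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)
        graph[target]  # touch the key so datasets with no out-links still appear
    return dict(graph)


def in_degree(graph):
    """Count how many datasets link to each dataset."""
    counts = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


# Placeholder edges, for illustration only: "A links to B", and so on.
TOY_EDGES = [("A", "B"), ("C", "B"), ("B", "A")]

if __name__ == "__main__":
    print(in_degree(build_graph(TOY_EDGES)))
```

A dataset with a high in-degree is one that many other bubbles point to, which is one simple way to look for a hub in the diagram.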

Take care,
Marko A. Rodriguez

http://markorodriguez.com




Analyzing the success of LOD (was: New LOD Cloud - Please send us links to missing data sources)

2009-03-02 Thread Matthias Samwald


Andraz:

That the bubbles continue to grow is, however, a sociologically
interesting phenomenon :-)
And a good sign that something has gone right :)


Giovanni:

Maybe :-) but people do things for many reasons other than that they're right.


I think the LOD project is a great success. It is a very lively community, 
there has been significant progress over the last year (amount of data, 
quality of underlying technologies such as Virtuoso). However, the community 
should take some time to analyze WHY it is successful, and why it is more 
successful than attempts of using RDF/OWL before 2007. Some thoughts on 
this:


* The main ingredient to the success of LOD is that it is relatively 
centralized. It would not work without DBpedia serving as the 'nucleus' of 
the cloud. It would not work without someone dedicated to drawing the cloud
diagram that everyone is happy to show on Powerpoint slides. It would not
work without this mailing list that serves as an open platform for the
community. However, I have the impression that some key persons in the LOD 
community might not be happy about this reason for success at all. For them, 
the LOD project is a mere testing ground for the next generation of the 
entire web, and showing that linked data works in a decentralized way is a 
crucial aspect of this vision. The fact that the current LOD cloud was 
actually produced in a rather centralized process, and that most of the 
valuable data sources in the LOD cloud are actually under the control of a 
very small number of stakeholders, is seen as a transient blemish, at best.
However, I think that this is a problematic situation, and we should embrace 
the semi-centralized nature of the LOD project, rather than hiding it away. 
Having a close-knit group of stakeholders that contribute to a partly 
distributed, partly centralized knowledge base might actually be a very 
interesting endeavor -- and it might be a way to provide a clear incentive 
to participate. LOD could be a novel type of open-source project, one that 
is not only concerned with code, but also with the underlying data. The 
products of this open source project could then be used in various kinds of 
projects, some of them with commercial focus. In such a scenario, being the 
main stakeholder for a certain subset of LOD might become profitable, and 
give incentive to improve the data provided and controlled by each 
stakeholder. This business model could be similar to that of successful open 
source content management systems such as Typo3 or Drupal, where the code is 
free, but providing consulting and customization for certain commercial 
users is based on financial support.
I know that this idea of a 'LOD brand' counters the main motivation of most 
people in the community, but it might be the key to creating an incentive 
structure for providing linked data, improving data quality and actually 
getting people to use the data. With the current philosophy, I see the 
danger of LOD staying a permanent 'proof of concept'. The concept has been 
proved by now.


* A good point by Giovanni is that mere interlinking of datasets was 
possible since 1999 by re-using URIs, and that post-hoc mapping between 
datasets was possible since 2004, when owl:sameAs was invented. The linked 
data movement 'only' added the consensus that HTTP URIs should be used, and 
that an HTTP GET request should yield a small RDF subgraph, listing the RDF
triples about the resource. Surely, this is a very practical thing for many 
reasons, but was it instrumental for the success of LOD? At the moment, it 
seems that most *useful* applications of LOD data are based on a central 
triple store created by the aggregation of some or all LOD data sources. In 
that case, one might ask whether the dereferenceable URIs are really an 
essential ingredient to the success of LOD, or just a 'good to have', but
not essential, feature.
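
For readers unfamiliar with the dereferencing convention described above, a minimal sketch follows. It assumes the server supports content negotiation and returns RDF/XML when asked for application/rdf+xml; the helper names and the example.org URI are hypothetical, and real servers may redirect or offer other serializations.

```python
import urllib.request

RDF_XML = "application/rdf+xml"


def build_rdf_request(uri, accept=RDF_XML):
    """Prepare an HTTP GET for a resource URI, asking for an RDF serialization."""
    return urllib.request.Request(uri, headers={"Accept": accept})


def fetch_description(uri, timeout=10):
    """Dereference the URI and return the raw bytes of the RDF description."""
    with urllib.request.urlopen(build_rdf_request(uri), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical URI; building the request does not touch the network.
    request = build_rdf_request("http://example.org/resource/Thing")
    print(request.get_method(), request.full_url, request.get_header("Accept"))
```

An application built on a central triple store would call something like fetch_description ahead of time for each URI, whereas a "follow your nose" client would call it at query time.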



Giovanni:

An alternative explanation i like is
http://inamidst.com/whits/2008/technobunkum


This is the second time I have seen this link on this mailing list. He makes some
very good points about the importance of focusing on providing solutions to 
problems, instead of becoming too tangled up in technicalities. I also read 
his other text on http://inamidst.com/whits/2008/ambient which gives a lot 
of insight into why he has abandoned Semantic Web technologies. I guess the 
problems he likes to see solved are too trivial to require a paradigmatic 
change (such as a global trend towards RDF/OWL and linked data). However, I
would not generalize this experience to yield the conclusion that the 
Semantic Web is a huge case of 'Technobunkum' (what a silly term, by the 
way). The fact that not every tiny little problem on the web might be in 
need of Semantic Web technologies does not mean that these technologies are 
worthless. There are plenty of real use cases in important business segments 
and companies where there is dire need for such new technologies -- life 

Re: New LOD Cloud - Please send us links to missing data sources

2009-03-02 Thread Kingsley Idehen

Andraz Tori wrote:

On Mon, 2009-03-02 at 01:54 +0100, Giovanni Tummarello wrote:
  

Hi Andraz :-)

I don't see the difference between the LOD model and the data (including
links) itself. At least to us at Zemanta it is immensely helpful to have
a lot of those links done. It brings down the cost of doing really
innovative stuff to us and I believe to many others too.

We don't dereference them in real-time, but ahead-of-time to produce
specialized datasets, but I don't think that makes a difference.



At the risk of being pedantic...

Well, the idea from day 0 of the semantic web was that entities would
be interlinked by the reuse of the same URI. Linking or putting a
sameAs is the same thing, so nothing new.

The new part of LOD would be the publishing of a dataset as many
individual RDF descriptions corresponding to the resolution of each
URI/URL, and that part is not being useful to you.



I wasn't here at its conception, so I can only talk about the current
situation. Currently this is only a part of the bargain, but not the whole
thing. The other part is actually having some concrete set of federated
generally-interesting datasets that are interconnected instead of
an abstract vision or one-off efforts.

So LOD is actually people working 'together' to get something practical
out of the larger SW idea.

I also think that LOD has started to cause a network effect. Every new
dataset makes the others more usable.

  

 That the bubbles continue to grow is, however, a sociologically
 interesting phenomenon :-)


And a good sign that something has gone right :)



Maybe :-) but people do things for many reasons other than that they're
right. An alternative explanation I like is
http://inamidst.com/whits/2008/technobunkum 



I disagree that LOD is this kind of beast. Services like ours are proving
LOD is important for 'mainstream'. While it is a stretch to say our
service is (already) mainstream or that LOD is the main enabler, it is
definitely meant for mainstream and LOD plays a certain part in making
it achievable (cheaply enough).


  

I think the LOD achievement is enormous and this is only the start. :)

Said picture might have helped to get a lot of RDF data online. This
is undoubtedly a great achievement.
But sustainability and real growth come only if we can prove real
reasons for people to publish this data, and in this way. While we
haven't yet seen this, this doesn't mean that some application might not
exist.



We're on exactly the same page here! Sustainability and growth depend on
working incentives for publishers to publish the data. And there is just
one thing (commercial) publishers care about - direct and indirect
traffic. 


If a good incentive is getting more links from bloggers (which in turn
brings traffic and increases different rankings), there you have one in
Zemanta. Ok, actually it is only a potential, since we include datasets
on a one-by-one basis. But when a company approaches us to incorporate
their links into our suggestion pool the first question we ask is: Do
you have proper connections into LOD? It makes everything so much
easier.

To date bloggers have created more than a million permanent hyperlinks
with Zemanta. For some links the ability to suggest them came from the
fact that the LOD data and links were available.

I am sure there must be some other LOD related services that in the 'end
consequence' bring traffic to the publishers. We are working on at least
another one - Simple Semantic Tagging.

bye
andraz
  

Andraz,

I agree with your points and general sentiment.

One issue we need to address is the LOD cloud. Personally, I think there are
vital aspects of the big picture that it completely misses.


1. UMBEL - this is the data dictionary aspect of the Linked Data Web and 
it meshes disparate ontologies (one aspect) and also provides a concept 
scheme (its other aspect)
2. Dynamic Linked Data - our sponger cartridges perform lookups and
joins against other data providers (as you can see with the Zemanta cartridge).


When you combine the main cloud, #1, and #2, you basically moot any 
questions about the state of the Linked Data Web. Instead, we move 
over to more interesting and important issues such as quality of data 
within the Linked Data Web.


Quality is valuable and an opportunity for any Linked Data Web player 
to innovate.


Show me an entrepreneur and I will show you someone who is knowingly or 
unknowingly using DBMS technology to provide unique value to his/her 
customer base via queries, lookups, and joins :-)


Kingsley
  

Giovanni









--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com








FW: New LOD Cloud - Please send us links to missing data sources

2009-03-01 Thread John Goodwin

 

Some missing links... My pub guide (which I still need to tidy up) is linked to
Revyu and vice versa (OK, it is only 2 links). The pub guide is currently linked
to the OS Admin Geography data in RKBExplorer and not GeoNames (GeoNames links
may be forthcoming). I also have links from my FOAF profiles to OS Admin
Geography in RKBExplorer and to the BBC music linked data.

cheers

John

-Original Message-
From: public-lod-requ...@w3.org on behalf of Anja Jentzsch
Sent: Fri 27/02/2009 22:58
To: public-lod@w3.org
Subject: New LOD Cloud - Please send us links to missing data sources
 
Hi all,

we are currently updating the LOD cloud. Find the draft here: 
http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png

We have already added:

1. the RKBExplorer cloud
2. the Bio2RDF cloud
3. the LODD cloud
4. GeoSpecies
5. LIBRIS

Statistics on triple and link count (as well as target sources) are 
missing for the following sources:
- Faviki
- RDFohloh
- OpenCalais
- LIBRIS

Did we forget any new data sources or links between data sources?

Keep in mind: A data source qualifies for the cloud if the data is
available via dereferenceable URIs and if the data source is interlinked
with at least one other source (meaning it references URIs within the
namespace of the other source).

Anja







Re: New LOD Cloud - Please send us links to missing data sources

2009-03-01 Thread Kingsley Idehen

Hugh Glaser wrote:

I take your point, Giovanni, but I have to say it:
http://www.rkbexplorer.com/
uses a good 30 different bubbles on the (new) diagram, without collecting them 
into a single store, and using the URI linkage.
From the LOD publicity point of view, it is unfortunate that you can't tell (I 
hope).
But that was one of the objectives:- a user of LOD should be able to be 
blissfully unaware that they are using LOD.
  
Vital point: Users should be blissfully unaware that they are using LOD. 
They should simply feel the FORCE [1] :-)



Links:

1. 
http://www.openlinksw.com/dataspace/kide...@openlinksw.com/weblog/kide...@openlinksw.com%27s%20blog%20%5b127%5d/1474 
-- post about the FORCE .


Kingsley

Cheers
Hugh

On 01/03/2009 00:30, Giovanni Tummarello g.tummare...@gmail.com wrote:


congrats and kudos to all those who've made this happen. I think the cloud 
diagrams are proving a very compelling visual for people who don't care about 
nerdy detail but understand the idea of interlinked datasets.


Yes, they're great for handwaving if the audience has never seen it; otherwise
it's likely counterproductive.

The problem is that LOD has really been stuck here for 2 years now: not a single
advance, not a single application (of the LOD model, not of the data; the data
is obviously useful and expressing it in RDF is also starting to be seen as
useful).

That the bubbles continue to grow is, however, a sociologically interesting
phenomenon :-)

On the positive side, I recently reviewed some work by someone who has a very
interesting way to create a diagram which actually helps by showing which queries can be
asked. Too bad you won't see it in action at ESWC because the demo paper was not
up to the Springer standards for legibility, according to some other reviewer.

Giovanni



  



--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com








Re: New LOD Cloud - Please send us links to missing data sources

2009-03-01 Thread Giovanni Tummarello
Hi Andraz :-)

I don't see the difference between the LOD model and the data (including
 links) itself. At least to us at Zemanta it is immensely helpful to have
 a lot of those links done. It brings down the cost of doing really
 innovative stuff to us and I believe to many others too.


 We don't dereference them in real-time, but ahead-of-time to produce
 specialized datasets, but I don't think that makes a difference.


At the risk of being pedantic...

Well, the idea from day 0 of the semantic web was that entities would be
interlinked by the reuse of the same URI. Linking or putting a sameAs
is the same thing, so nothing new.

The new part of LOD would be the publishing of a dataset as many individual
RDF descriptions corresponding to the resolution of each URI/URL, and that
part is not being useful to you.



  That the bubbles continue to grow is, however, a sociologically
  interesting phenomenon :-)

 And a good sign that something has gone right :)


Maybe :-) but people do things for many reasons other than that they're right.
An alternative explanation I like is
http://inamidst.com/whits/2008/technobunkum


I think the LOD achievement is enormous and this is only the start.
 :)


Said picture might have helped to get a lot of RDF data online. This is
undoubtedly a great achievement.
But sustainability and real growth come only if we can prove real reasons
for people to publish this data, and in this way. While we haven't yet seen
this, this doesn't mean that some application might not exist.

Giovanni


AW: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Chris Bizer
Hi Kingsley,

 You have MySpace and Flickr Wrappers but still don't include all the 
 Virtuoso Sponger Cartridges (which are wrappers) to the Cloud [1] ?

The last time I checked the cartridges, I had the impression that they were
not very much interlinked with the rest of the LOD cloud and about half of
them were down.

The links to DBpedia were rather strange. For instance, the first link I
found owl:sameAs'ed a Yahoo Finance balance sheet with the DBpedia bed sheet,
which, even for me as a big supporter of owl:sameAs links, is a bit too much
of a semantic gap.

Did you improve the quality of the external links in the meantime and do the
cartridges regularly deliver data?

Another problem is that many of the sources are not really Open Data as
various license restrictions apply.

Cheers

Chris


-Original Message-
From: public-lod-requ...@w3.org [mailto:public-lod-requ...@w3.org] On Behalf Of Kingsley Idehen
Sent: Saturday, 28 February 2009 00:18
To: Anja Jentzsch
Cc: public-lod@w3.org
Subject: Re: New LOD Cloud - Please send us links to missing data sources

Anja Jentzsch wrote:
 Hi all,

 we are currently updating the LOD cloud. Find the draft here: 
 http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png

 We have already added:

 1. the RKBExplorer cloud
 2. the Bio2RDF cloud
 3. the LODD cloud
 4. GeoSpecies
 5. LIBRIS

 Statistics on triple and link count (as well as target sources) are 
 missing for the following sources:
 - Faviki
 - RDFohloh
 - OpenCalais
 - LIBRIS

 Did we forget any new data sources or links between data sources?

 Keep in mind: A data source qualifies for the cloud if the data is
 available via dereferenceable URIs and if the data source is
 interlinked with at least one other source (meaning it references URIs
 within the namespace of the other source).

 Anja


Anja,

You have MySpace and Flickr Wrappers but still don't include all the 
Virtuoso Sponger Cartridges (which are wrappers) to the Cloud [1] ?

Also, the LODD data sets page should be linked to: 
http://esw.w3.org/topic/DataSetRDFDumps, so we can track down the dumps 
with ease re. the Virtuoso LOD hosting instance.

links:

1. http://virtuoso.openlinksw.com/images/sponger-cloud.html

-- 


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com








Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Ivan Herman

Anja,

as in previous steps, I am happy to generate an SVG version as soon as I 
get the final PDF version! Just ping me...


Cheers

Ivan

Anja Jentzsch wrote:

Hi all,

we are currently updating the LOD cloud. Find the draft here: 
http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png


We have already added:

1. the RKBExplorer cloud
2. the Bio2RDF cloud
3. the LODD cloud
4. GeoSpecies
5. LIBRIS

Statistics on triple and link count (as well as target sources) are 
missing for the following sources:

- Faviki
- RDFohloh
- OpenCalais
- LIBRIS

Did we forget any new data sources or links between data sources?

Keep in mind: A data source qualifies for the cloud if the data is
available via dereferenceable URIs and if the data source is interlinked
with at least one other source (meaning it references URIs within the
namespace of the other source).


Anja



--

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf




Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Joshua Tauberer

Not related to this thread, but this just in:

The LOD cloud was just on a slide presented at 
http://transparencycamp.org for the future of http://www.recovery.gov, 
the website for tracking spending in the U.S.'s economic recovery 
package. Very thrilling to see it being taken seriously by the U.S. 
government.


Josh



Re: AW: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Kingsley Idehen

Chris Bizer wrote:

Hi Kingsley,

  
You have MySpace and Flickr Wrappers but still don't include all the 
Virtuoso Sponger Cartridges (which are wrappers) to the Cloud [1] ?



The last time I checked the cartridges, I had the impression that they were
not very much interlinked with the rest of the LOD cloud and about half of
them were down.
  

That was a point in time.

The links to DBpedia were rather strange. For instance, the first link I
found owl:sameAs'ed a Yahoo Finance balance sheet with the DBpedia bed sheet,
which, even for me as a big supporter of owl:sameAs links, is a bit too much
of a semantic gap.
  

We have 30 RDFizers written at different points in time.

If an RDFizer is imperfect (like most of LOD) simple feedback will do. I
don't even remember the last time I looked at the RDFizer for Yahoo Finance.


Okay, one bad and very very old cartridge, 29 to go :-)

Did you improve the quality of the external links in the meantime and do the
cartridges regularly deliver data?
  
If OpenLink has one attribute, I don't think "static" would be the one
:-) These cartridges change on a daily basis depending on a myriad of
circumstances (including feedback).


As with all things, feedback would be very helpful.

Another problem is that many of the sources are not really Open Data as
various license restrictions apply.
  
The ones that concern you re. open data should be relayed to me; at
the very least, we don't want to be violating anyone's publishing rules, etc.


Kingsley


Cheers

Chris


-Original Message-
From: public-lod-requ...@w3.org [mailto:public-lod-requ...@w3.org] On Behalf Of Kingsley Idehen
Sent: Saturday, 28 February 2009 00:18
To: Anja Jentzsch
Cc: public-lod@w3.org
Subject: Re: New LOD Cloud - Please send us links to missing data sources

Anja Jentzsch wrote:
  

Hi all,

we are currently updating the LOD cloud. Find the draft here: 
http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png


We have already added:

1. the RKBExplorer cloud
2. the Bio2RDF cloud
3. the LODD cloud
4. GeoSpecies
5. LIBRIS

Statistics on triple and link count (as well as target sources) are 
missing for the following sources:

- Faviki
- RDFohloh
- OpenCalais
- LIBRIS

Did we forget any new data sources or links between data sources?

Keep in mind: A data source qualifies for the cloud, if the data is 
available via dereferencable URIs and if the data source is 
interlinked with at least one other source (meaning it references URIs 
within the namespace of the other source).


Anja




Anja,

You have MySpace and Flickr Wrappers but still don't include all the 
Virtuoso Sponger Cartridges (which are wrappers) to the Cloud [1] ?


Also, the LODD data sets page should be linked to: 
http://esw.w3.org/topic/DataSetRDFDumps, so we can track down the dumps 
with ease re. the Virtuoso LOD hosting instance.


links:

1. http://virtuoso.openlinksw.com/images/sponger-cloud.html

  



--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com








Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Dan Brickley

On 28/2/09 21:49, Joshua Tauberer wrote:

Not related to this thread, but this just in:

The LOD cloud was just on a slide presented at
http://transparencycamp.org for the future of http://www.recovery.gov,
the website for tracking spending in the U.S.'s economic recovery
package. Very thrilling to see it being taken seriously by the U.S.
government.


This is great news! And since my last couple of threads here have been 
grumbles about this or that detail, I just want to say congrats and 
kudos to all those who've made this happen. I think the cloud diagrams 
are proving a very compelling visual for people who don't care about 
nerdy detail but understand the idea of interlinked datasets.


BTW I also showed some of the clouds during a panel talk in last week's 
HURIDOCS conference on human rights documentation (Human Rights Council 
and the International Criminal Court: The New challenges for Human 
Rights Communications), http://www.huridocs.org/involved/conference/ 
... I'll post more on this after I blog the slides.


Also good to see this discussion over on the Sunlight Labs list, 
http://groups.google.com/group/sunlightlabs/browse_thread/thread/e2931af260241ed6/94be63985b224d70?lnk=gstq=rdf#94be63985b224d70


cheers,

Dan



Contd: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Kingsley Idehen

Chris Bizer wrote:

Hi Kingsley,

  
You have MySpace and Flickr Wrappers but still don't include all the 
Virtuoso Sponger Cartridges (which are wrappers) to the Cloud [1] ?



The last time I checked the cartridges, I had the impression that they were
not very much interlinked with the rest of the LOD cloud and about half of
them were down.
  

When was that? Please be specific.

The links to DBpedia were rather strange. For instance, the first link I
found owl:sameAs'ed a Yahoo Finance balance sheet with the DBpedia bed sheet,
which, even for me as a big supporter of owl:sameAs links, is a bit too much
of a semantic gap.
  
Even if this anomaly existed, how many minutes would it take to fix that 
via a constructive feedback loop, really?

Did you improve the quality of the external links in the meantime and do the
cartridges regularly deliver data?
  
You are being quite subjective here. The measure of quality is what 
though? Is the Flickr wrapper the quality measure? I am not being 
bombastic here, I need you to be a little more specific when making 
public comments.

Another problem is that many of the sources are not really Open Data as
various license restrictions apply.
  

Examples, assuming we are all in this together.

Kingsley

Cheers

Chris


-Original Message-
From: public-lod-requ...@w3.org [mailto:public-lod-requ...@w3.org] On Behalf
Of Kingsley Idehen
Sent: Saturday, 28 February 2009 00:18
To: Anja Jentzsch
Cc: public-lod@w3.org
Subject: Re: New LOD Cloud - Please send us links to missing data sources

Anja Jentzsch wrote:
  

Hi all,

we are currently updating the LOD cloud. Find the draft here: 
http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png


We have already added:

1. the RKBExplorer cloud
2. the Bio2RDF cloud
3. the LODD cloud
4. GeoSpecies
5. LIBRIS

Statistics on triple and link count (as well as target sources) are 
missing for the following sources:

- Faviki
- RDFohloh
- OpenCalais
- LIBRIS

Did we forget any new data sources or links between data sources?

Keep in mind: A data source qualifies for the cloud, if the data is 
available via dereferencable URIs and if the data source is 
interlinked with at least one other source (meaning it references URIs 
within the namespace of the other source).


Anja




Anja,

You have MySpace and Flickr Wrappers but still don't include all the 
Virtuoso Sponger Cartridges (which are wrappers) to the Cloud [1] ?


Also, the LODD data sets page should be linked to: 
http://esw.w3.org/topic/DataSetRDFDumps, so we can track down the dumps 
with ease re. the Virtuoso LOD hosting instance.


links:

1. http://virtuoso.openlinksw.com/images/sponger-cloud.html

  



--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com








Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Kingsley Idehen

Joshua Tauberer wrote:

Not related to this thread, but this just in:

The LOD cloud was just on a slide presented at 
http://transparencycamp.org for the future of http://www.recovery.gov, 
the website for tracking spending in the U.S.'s economic recovery 
package. Very thrilling to see it being taken seriously by the U.S. 
government.


Josh



Joshua,

Great!

I think eGovt. is going to be a major area that showcases the virtues of 
Linked Data.


I think GovtTrack.us and Watchdog.net remain on the vanguard of Linked 
Data in the U.S. based eGovt. realm. Hopefully,  a Linked Govt Data 
Cloud -- with your collective endeavors at the core -- will take shape 
soon :-)


--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com








Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Ted Thibodeau Jr

Hi, Anja --

On Feb 27, 2009, at 05:58 PM, Anja Jentzsch wrote:
 we are currently updating the LOD cloud. Find the draft here:
 http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png 



It remains very pretty -- but it feels like a data silo of its own.

I can't speak for anyone else, but I find it difficult to read this
cloud -- I cannot see which data sets do well at out-linking, nor
which sets are apparently good for being out-linked *to*.

Where is the data behind the graph?

What are the triple counts for each set?

What are counts for each arrow?  (When the arrow is bidirectional,
I would expect 2 counts, 1 for each arrowhead.)

In October, when I asked about these, such numbers weren't used in
drawing the cloud -- so the sizes of the nodes and the weights of
the arcs are just artistic, and/or based on gut feel.  This use of
commonly understood graphing techniques (larger nodes for larger
data sets, thicker arcs for more links between) without any actual
data behind it troubles me.

Because of this, and because I wanted to easily see which data sets
made many links out, and which data sets were linked *to* a lot,
I made an alternative graphic -- which doesn't look so pretty (it's
much less suggestive of an actual cloud), but which I think is rather
more readable, and has no potential misinterpretation about data set
sizes or inter-link intensity, as all nodes are the same size and all
arcs are the same weight.

   http://virtuoso.openlinksw.com/images/dbpedia-lod-cloud.html

This version makes plain that there are some clusters within the
cloud -- and that DBpedia, MusicBrainz, and GeoNames are clearly
nexuses for these clusters, being the targets of many inter-links.
The new edition reveals several new nexuses, in ECS Southampton,
GeneID, UniProt, DBLP RKB Explorer, CiteSeer, and others.
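
A minimal sketch of how such "nexus" data sets could be identified once
the raw link data is available, by counting distinct incoming and outgoing
links per data set. The edge list below is hypothetical and purely
illustrative (it is not the real cloud data), and the threshold of two
incoming links is an arbitrary assumption.

```python
from collections import Counter

# Hypothetical directed links (source data set -> target data set).
# These pairs are illustrative only, not the real cloud data.
links = [
    ("MusicBrainz", "DBpedia"),
    ("GeoNames", "DBpedia"),
    ("DBpedia", "GeoNames"),
    ("Jamendo", "MusicBrainz"),
    ("Jamendo", "GeoNames"),
]


def link_degrees(edges):
    """Return (out_degree, in_degree) counters over distinct directed edges."""
    out_degree, in_degree = Counter(), Counter()
    for source, target in set(edges):
        out_degree[source] += 1
        in_degree[target] += 1
    return out_degree, in_degree


out_degree, in_degree = link_degrees(links)

# Assumption: a "nexus" is any data set that is the target of at least
# two distinct inter-links.
nexuses = sorted(name for name, count in in_degree.items() if count >= 2)
```

Keeping the two directions in separate counters mirrors the request for
two counts on a bidirectional arrow, one per arrowhead.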

At first glance, FOAF appears to be such a nexus, but it is an
odd case.

On Feb 27, 2009, at 05:58 PM, Anja Jentzsch wrote:
 Keep in mind: A data source qualifies for the cloud, if the
 data is available via dereferencable URIs and if the data
 source is interlinked with at least one other source (meaning
 it references URIs within the namespace of the other source).

I don't think FOAF properly belongs on a cloud of *data sources*,
because it's more of a vocabulary than a data-set -- more
analogous to Dublin Core than to the others here.  In other words,
it's more *data dictionary* than *instance data*.

There is no FOAF SPARQL endpoint, no FOAF RDF dump, etc., in
marked contrast to the others on this graphic.  The FOAF ontology
is used by *many* data sets which aren't represented here -- which
is why I used a cloud for FOAF (and for SIOC).  *Most* uses of
FOAF are one-off pages, and I don't think those can really be
considered data sources never mind data sets for purposes of
this diagram.

My intent (as time permits) is to build separate illustrations
of the data sets that make use of these vocabularies (e.g.,
LiveJournal, Facebook, Tribe [assuming revival], etc.).

Obviously, there are many more data set inter-links in the updated
cloud than the previous edition, so there will be more intersections
in mine, too.  But if you compare mine to the edition it was based
on, you'll see that there's a lot less inter-linking in reality
than a casual glance at this original suggests --

   http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2008-09-18_blank.png 



I think exposing where the gaps are is *much* more likely to lead to
them being closed, than making them invisible.

I'd be happy to update mine -- and happier still to make the node sizes
and arc weights proportional to the data -- but I can no longer reverse
engineer the connections simply by looking at the cloud image.

Can we get the raw data, please?

Be seeing you,

Ted



--
A: Yes.  http://www.guckes.net/faq/attribution.html
| Q: Are you sure?
| | A: Because it reverses the logical flow of conversation.
| | | Q: Why is top posting frowned upon?

Ted Thibodeau, Jr.   //   voice +1-781-273-0900 x32
Evangelism & Support //mailto:tthibod...@openlinksw.com
OpenLink Software, Inc.  //  http://www.openlinksw.com/
 http://www.openlinksw.com/weblogs/uda/
OpenLink Blogs  http://www.openlinksw.com/weblogs/virtuoso/
   http://www.openlinksw.com/blog/~kidehen/
Universal Data Access and Virtual Database Technology Providers






Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Juan Sequeda
Hi all!

FYI, the slides that Joshua was talking about are here:
http://george.thomas.name/omb/

Here is a blog post about how government agencies need to report their
spending in RSS. (http://www.aaronsw.com/weblog/rssstimulus) The data is
structured!

There will be an OpenGov Ignite during SXSW in Austin and I will be on a
panel for Linked Data. Interesting things are coming up around eGov!

Juan Sequeda, Ph.D Student
Dept. of Computer Sciences
The University of Texas at Austin
www.juansequeda.com
www.semanticwebaustin.org


2009/2/28 Kingsley Idehen kide...@openlinksw.com

 Joshua Tauberer wrote:

 Not related to this thread, but this just in:

 The LOD cloud was just on a slide presented at
 http://transparencycamp.org for the future of http://www.recovery.gov,
 the website for tracking spending in the U.S.'s economic recovery package.
 Very thrilling to see it being taken seriously by the U.S. government.

 Josh


  Joshua,

 Great!

 I think eGovt. is going to be a major area that showcases the virtues of
 Linked Data.

 I think GovtTrack.us and Watchdog.net remain on the vanguard of Linked Data
 in the U.S. based eGovt. realm. Hopefully,  a Linked Govt Data Cloud -- with
 your collective endeavors at the core -- will take shape soon :-)


 --


 Regards,

 Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
 President & CEO
 OpenLink Software Web: http://www.openlinksw.com








Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Kingsley Idehen

Ted Thibodeau Jr wrote:

Hi, Anja --

On Feb 27, 2009, at 05:58 PM, Anja Jentzsch wrote:
 we are currently updating the LOD cloud. Find the draft here:
 http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png

It remains very pretty -- but it feels like a data silo of its own.

I can't speak for anyone else, but I find it difficult to read this
cloud -- I cannot see which data sets do well at out-linking, nor
which sets are apparently good for being out-linked *to*.

Where is the data behind the graph?

What are the triple counts for each set?

What are counts for each arrow?  (When the arrow is bidirectional,
I would expect 2 counts, 1 for each arrowhead.)

In October, when I asked about these, such numbers weren't used in
drawing the cloud -- so the sizes of the nodes and the weights of
the arcs are just artistic, and/or based on gut feel.  This use of
commonly understood graphing techniques (larger nodes for larger
data sets, thicker arcs for more links between) without any actual
data behind it troubles me.

Because of this, and because I wanted to easily see which data sets
made many links out, and which data sets were linked *to* a lot,
I made an alternative graphic -- which doesn't look so pretty (it's
much less suggestive of an actual cloud), but which I think is rather
more readable, and has no potential misinterpretation about data set
sizes or inter-link intensity, as all nodes are the same size and all
arcs are the same weight.

   http://virtuoso.openlinksw.com/images/dbpedia-lod-cloud.html

This version makes plain that there are some clusters within the
cloud -- and that DBpedia, MusicBrainz, and GeoNames are clearly
nexuses for these clusters, being the targets of many inter-links.
The new edition reveals several new nexuses, in ECS Southampton,
GeneID, UniProt, DBLP RKB Explorer, CiteSeer, and others.

At first glance, FOAF appears to be such a nexus, but it is an
odd case.

On Feb 27, 2009, at 05:58 PM, Anja Jentzsch wrote:
 Keep in mind: A data source qualifies for the cloud, if the
 data is available via dereferencable URIs and if the data
 source is interlinked with at least one other source (meaning
 it references URIs within the namespace of the other source).

Ted,


I don't think FOAF properly belongs on a cloud of *data sources*,
because it's more of a vocabulary than a data-set -- more
analogous to Dublin Core than to the others here.  In other words,
it's more *data dictionary* than *instance data*.

There is no FOAF SPARQL endpoint, no FOAF RDF dump, etc., in
marked contrast to the others on this graphic.  The FOAF ontology
is used by *many* data sets which aren't represented here -- which
is why I used a cloud for FOAF (and for SIOC).  *Most* uses of
FOAF are one-off pages, and I don't think those can really be
considered data sources never mind data sets for purposes of
this diagram.
This would be much clearer if the diagram simply indicated that the node 
implies a FOAF Profile cluster or Data Space, as per:

http://esw.w3.org/topic/FoafSites . The same thing applies to SIOC.



My intent (as time permits) is to build separate illustrations
of the data sets that make use of these vocabularies (e.g.,
LiveJournal, Facebook, Tribe [assuming revival], etc.).

The Dictionary/Vocabulary/Schema (TBox) side is fine, which is basically 
what drove the creation of the UMBEL cloud. Now the LOD cloud and the 
UMBEL clouds actually come together via the Sponger Cloud since it uses 
terms from the dictionaries in the UMBEL Cloud and links where 
appropriate to instance data in the LOD cloud.


Examples:

1. Crunchbase - maps on the fly to DBpedia (always has)
2. Wikipedia - which does the DBpedia extraction on the fly, against 
Wikipedia source data, when it determines a delta between Wikipedia and 
DBpedia
3. XBRL - both Joshua's stuff and WikiCompany are looked up (*note, this 
cartridge needs fixing and enhancing, but it's still a zillion times 
better than zilch*)



Obviously, there are many more data set inter-links in the updated
cloud than the previous edition, so there will be more intersections
in mine, too.  But if you compare mine to the edition it was based
on, you'll see that there's a lot less inter-linking in reality
than a casual glance at this original suggests --

   
http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2008-09-18_blank.png 

The LOD cloud is more about marketing collateral for presentations. 
Don't take it that seriously in a pure linkage sense. I am only 
commenting because of the marketing aspects of this diagram.


I think exposing where the gaps are is *much* more likely to lead to
them being closed, than making them invisible.

I'd be happy to update mine -- and happier still to make the node sizes
and arc weights proportional to the data -- but I can no longer reverse
engineer the connections simply by looking at the cloud image.

Can we get the raw data, please?

Be seeing you,

Ted






--


Regards,

Kingsley Idehen   Weblog: 

Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Kingsley Idehen

Juan Sequeda wrote:

Hi all!

FYI, the slides that Joshua was talking about are here: 
http://george.thomas.name/omb/


Here is a blog post about how government agencies need to report their 
spending in RSS. (http://www.aaronsw.com/weblog/rssstimulus) The data 
is structured!


There will be an OpenGov Ignite during SXSW in Austin and I will be on 
a panel for Linked Data. Interesting things are coming up around eGov!

Juan,

Thanks!

Great material indeed.

I've been trying to locate the actual presentation material since the 
initial posts on Twitter etc..


Kingsley


Juan Sequeda, Ph.D Student
Dept. of Computer Sciences
The University of Texas at Austin
www.juansequeda.com
www.semanticwebaustin.org


2009/2/28 Kingsley Idehen kide...@openlinksw.com


Joshua Tauberer wrote:

Not related to this thread, but this just in:

The LOD cloud was just on a slide presented at
http://transparencycamp.org for the future of
http://www.recovery.gov, the website for tracking spending in
the U.S.'s economic recovery package. Very thrilling to see it
being taken seriously by the U.S. government.

Josh


Joshua,

Great!

I think eGovt. is going to be a major area that showcases the
virtues of Linked Data.

I think GovtTrack.us and Watchdog.net remain on the vanguard of
Linked Data in the U.S. based eGovt. realm. Hopefully,  a Linked
Govt Data Cloud -- with your collective endeavors at the core --
will take shape soon :-)


-- 



Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com









--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com








Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Anja Jentzsch

Hey Ted,

Ted Thibodeau Jr wrote:

Hi, Anja --

On Feb 27, 2009, at 05:58 PM, Anja Jentzsch wrote:
  we are currently updating the LOD cloud. Find the draft here:
  http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png

It remains very pretty -- but it feels like a data silo of its own.

I can't speak for anyone else, but I find it difficult to read this
cloud -- I cannot see which data sets do well at out-linking, nor
which sets are apparently good for being out-linked *to*.

Where is the data behind the graph?



What are the triple counts for each set?


Most of that data is taken from the ESW wiki page 
http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets

or the corresponding data set pages / endpoints.


What are counts for each arrow?  (When the arrow is bidirectional,
I would expect 2 counts, 1 for each arrowhead.)


The link counts were also taken from the data set pages, and in unclear 
cases I verified the link counts (as Richard did before).

If there are bidirectional links they count as one.


In October, when I asked about these, such numbers weren't used in
drawing the cloud -- so the sizes of the nodes and the weights of
the arcs are just artistic, and/or based on gut feel.  This use of
commonly understood graphing techniques (larger nodes for larger
data sets, thicker arcs for more links between) without any actual
data behind it troubles me.


There are some very clear rules for edge and node sizes.

Concerning the edges there are three sizes:
- thin for some 100 links
- middle for some 1,000 up to 10,000 links
- thick for 100,000 and more

The nodes are divided into five sizes:
- < 10K triples
- 10K+ triples
- 500K+ triples
- 10M+ triples
- 1B+ triples
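
The sizing rules above could be sketched roughly as follows. This is an
illustration only, not the script actually used to draw the cloud. The
function names are hypothetical, the smallest node class is assumed to
mean fewer than 10K triples, and since the rules do not say how link
counts between 10,000 and 100,000 are drawn, this sketch assumes they
fall into the middle weight.

```python
def node_size_class(triples):
    """Map a triple count to one of the five node size classes (1 = smallest)."""
    # Lower bounds of classes 2..5: 10K+, 500K+, 10M+, 1B+ triples.
    thresholds = [10_000, 500_000, 10_000_000, 1_000_000_000]
    return 1 + sum(triples >= t for t in thresholds)


def edge_weight(links):
    """Map a link count to 'thin', 'middle', or 'thick'."""
    # Assumption: the range between 10,000 and 100,000 links is not
    # specified in the rules; it is treated as 'middle' here.
    if links >= 100_000:
        return "thick"
    if links >= 1_000:
        return "middle"
    return "thin"
```

For example, node_size_class(750_000) falls into the third class (500K+),
and edge_weight(300) is "thin".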


Because of this, and because I wanted to easily see which data sets
made many links out, and which data sets were linked *to* a lot,
I made an alternative graphic -- which doesn't look so pretty (it's
much less suggestive of an actual cloud), but which I think is rather
more readable, and has no potential misinterpretation about data set
sizes or inter-link intensity, as all nodes are the same size and all
arcs are the same weight.

   http://virtuoso.openlinksw.com/images/dbpedia-lod-cloud.html


Sure, different views on the cloud are useful. Clustering by subject 
might be another one.
On the one hand, the provided cloud shows the growth of the cloud; on 
the other, it tries to cluster it.



This version makes plain that there are some clusters within the
cloud -- and that DBpedia, MusicBrainz, and GeoNames are clearly
nexuses for these clusters, being the targets of many inter-links.
The new edition reveals several new nexuses, in ECS Southampton,
GeneID, UniProt, DBLP RKB Explorer, CiteSeer, and others.

At first glance, FOAF appears to be such a nexus, but it is an
odd case.

On Feb 27, 2009, at 05:58 PM, Anja Jentzsch wrote:
  Keep in mind: A data source qualifies for the cloud, if the
  data is available via dereferencable URIs and if the data
  source is interlinked with at least one other source (meaning
  it references URIs within the namespace of the other source).

I don't think FOAF properly belongs on a cloud of *data sources*,
because it's more of a vocabulary than a data-set -- more
analogous to Dublin Core than to the others here.  In other words,
it's more *data dictionary* than *instance data*.

There is no FOAF SPARQL endpoint, no FOAF RDF dump, etc., in
marked contrast to the others on this graphic.  The FOAF ontology
is used by *many* data sets which aren't represented here -- which
is why I used a cloud for FOAF (and for SIOC).  *Most* uses of
FOAF are one-off pages, and I don't think those can really be
considered data sources never mind data sets for purposes of
this diagram.

My intent (as time permits) is to build separate illustrations
of the data sets that make use of these vocabularies (e.g.,
LiveJournal, Facebook, Tribe [assuming revival], etc.).

Obviously, there are many more data set inter-links in the updated
cloud than the previous edition, so there will be more intersections
in mine, too.  But if you compare mine to the edition it was based
on, you'll see that there's a lot less inter-linking in reality
than a casual glance at this original suggests --

   
http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2008-09-18_blank.png 



I think exposing where the gaps are is *much* more likely to lead to
them being closed, than making them invisible.



I'd be happy to update mine -- and happier still to make the node sizes
and arc weights proportional to the data -- but I can no longer reverse
engineer the connections simply by looking at the cloud image.

Can we get the raw data, please?


I will provide SVG and other versions on the ESW wiki page when the 
cloud is final.


Cheers
Anja


Be seeing you,

Ted







Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Giovanni Tummarello
 congrats and kudos to all those who've made this happen. I think the cloud
 diagrams are proving a very compelling visual for people who don't care
 about nerdy detail but understand the idea of interlinked datasets.


Yes they're great for handwaving if the audience has never seen it,
otherwise it's likely counterproductive.

The problem is that LOD has really been stuck here for 2 years now: not a
single advance, not a single application (of the LOD model, not of the data;
the data is obviously useful, and expressing it in RDF is also starting to be
seen as useful).

That the bubbles continue to grow is, however, a sociologically interesting
phenomenon :-)

On the positive side, I recently reviewed some work by someone who has a
very interesting way to create a diagram which actually helps by showing
which queries can be asked.  Too bad you won't see it in action at ESWC
because the demo paper was not up to the Springer standards for
legibility, according to some other reviewer.

Giovanni


Re: New LOD Cloud - Please send us links to missing data sources

2009-02-28 Thread Dan Brickley

On 1/3/09 01:30, Giovanni Tummarello wrote:


congrats and kudos to all those who've made this happen. I think the
cloud diagrams are proving a very compelling visual for people who
don't care about nerdy detail but understand the idea of interlinked
datasets.

Yes they're great for handwaving if the audience has never seen it,
otherwise it's likely counterproductive.

The problem is that LOD has really been stuck here for 2 years now: not a
single advance, not a single application (of the LOD model, not of the
data; the data is obviously useful, and expressing it in RDF is also
starting to be seen as useful).


Well, it *is* all about the data. Don't forget that! If the cloud 
diagrams only serve to remind people that many many datasets overlap in 
scope, and can be aggregated into larger units wherever they mention 
common objects and use common vocabulary, all is well. We don't need 
intelligent mobile agents for this to pay off. Just nice big databases 
and good old fashioned code.


LOD is an elaboration and improvement on the original linking model we 
had in FOAF (back before RDFCore when the RDF spec was vague on some key 
points, like how many URIs a thing could have, how to model 
same-thing-ness, ...). The main reason for rdfweb (as I called it 
originally) was discovery. In 2000, there was basically no RDF in the 
public Web, apart from some half-hearted bits of Dublin Core. No search 
engines did anything with it then (vs today, with Yahoo, Google, Yandex, 
Nutch, Sindice, SWSE et al.). So having an information linking model for 
RDF was important: it meant we could pretty much find all the RDF in the 
public Web by starting at one FOAF file and crawling. I think LOD has 
similar value today, but the pressure to have a hypertext-based 
discovery model is somewhat reduced. Partly because dataset-level 
information is available (eg. URL templates for LiveJournal, or VOID for 
LOD sites), and these tell coders how they can get their hands on huge 
chunks of data. But also because there are more aggregators and lookup 
tools.
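
The "start at one FOAF file and crawl" discovery model described above can
be sketched as a breadth-first traversal over document links. This is a
minimal illustration under stated assumptions: the URLs and the SEE_ALSO
table are hypothetical, and a real crawler would fetch and parse each
document (following rdfs:seeAlso and similar links) rather than read from
an in-memory dict.

```python
from collections import deque

# Hypothetical link structure: each document URL maps to the URLs it
# points at (e.g. via rdfs:seeAlso). A real crawler would fetch and
# parse each document instead of reading from this dict.
SEE_ALSO = {
    "http://example.org/alice/foaf.rdf": ["http://example.org/bob/foaf.rdf"],
    "http://example.org/bob/foaf.rdf": [
        "http://example.org/alice/foaf.rdf",
        "http://example.org/carol/foaf.rdf",
    ],
    "http://example.org/carol/foaf.rdf": [],
    "http://example.org/dave/foaf.rdf": [],  # no document links here
}


def discover(seed, links):
    """Breadth-first traversal returning documents in order of discovery."""
    seen = [seed]
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for target in links.get(current, []):
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


found = discover("http://example.org/alice/foaf.rdf", SEE_ALSO)
```

In this toy data the crawl reaches Alice, Bob, and Carol, but never Dave's
file, because no crawled document links to it. That is the gap that
dataset-level descriptions and aggregators help to fill.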


Even if most apps work only with a single dataset, linking is 
worthwhile. It reduces the degree of coupling between app and dataset, 
by increasing the commonalities between datasets. And it's a nice hook 
for crawlers, who can then expose different aggregate views back as more 
bubbles.



That the bubbles continue to grow is, however, a sociologically interesting
phenomenon :-)


Nothing wrong with sociology! :)

I think as SKOS gets rolled out more seriously, linkage by topic (eg. 
LCSH, DBpedia, ...) will become worth its own custom visualisation...



On the positive side, I recently reviewed some work by someone who has
a very interesting way to create a diagram which actually helps by
showing which queries can be asked.  Too bad you won't see it in action
at ESWC because the demo paper was not up to the Springer standards for
legibility, according to some other reviewer.


The problem here imho is that too many people have forgotten that it is 
the Semantic Web project, and instead treat Semantic Web as the name 
for a research field, or for a hypothetical future version of the Web 
that may never exist.


cheers,

Dan



New LOD Cloud - Please send us links to missing data sources

2009-02-27 Thread Anja Jentzsch

Hi all,

we are currently updating the LOD cloud. Find the draft here: 
http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png


We have already added:

1. the RKBExplorer cloud
2. the Bio2RDF cloud
3. the LODD cloud
4. GeoSpecies
5. LIBRIS

Statistics on triple and link count (as well as target sources) are 
missing for the following sources:

- Faviki
- RDFohloh
- OpenCalais
- LIBRIS

Did we forget any new data sources or links between data sources?

Keep in mind: A data source qualifies for the cloud, if the data is 
available via dereferencable URIs and if the data source is interlinked 
with at least one other source (meaning it references URIs within the 
namespace of the other source).
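
The interlinking half of this criterion could be checked roughly as in the
sketch below. It is only an illustration: triples are assumed to be plain
(subject, predicate, object) string tuples, the example.org URIs are
hypothetical, and rdf:type statements are skipped on the assumption that
pointing at a vocabulary class does not count as a link to another data
source. Dereferenceability is not tested here, since that requires actual
HTTP requests.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def links_outside_namespace(triples, own_namespace):
    """Return object URIs that point outside the data set's own namespace."""
    external = set()
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            # Assumption: class references are vocabulary use, not data links.
            continue
        is_uri = obj.startswith(("http://", "https://"))
        if subject.startswith(own_namespace) and is_uri \
                and not obj.startswith(own_namespace):
            external.add(obj)
    return external


# Hypothetical example data for a data set living under example.org/data/.
triples = [
    ("http://example.org/data/obama",
     "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Barack_Obama"),
    ("http://example.org/data/obama", RDF_TYPE,
     "http://xmlns.com/foaf/0.1/Person"),
    ("http://example.org/data/obama",
     "http://xmlns.com/foaf/0.1/name", "Barack Obama"),
    ("http://example.org/data/obama",
     "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/data/biden"),
]

external = links_outside_namespace(triples, "http://example.org/data/")
qualifies_by_linking = len(external) > 0
```

Here only the owl:sameAs target in the DBpedia namespace counts as an
external link; the name literal, the class reference, and the internal
foaf:knows link do not.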


Anja



Re: New LOD Cloud - Please send us links to missing data sources

2009-02-27 Thread Kingsley Idehen

Anja Jentzsch wrote:

Hi all,

we are currently updating the LOD cloud. Find the draft here: 
http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-02-27.png


We have already added:

1. the RKBExplorer cloud
2. the Bio2RDF cloud
3. the LODD cloud
4. GeoSpecies
5. LIBRIS

Statistics on triple and link count (as well as target sources) are 
missing for the following sources:

- Faviki
- RDFohloh
- OpenCalais
- LIBRIS

Did we forget any new data sources or links between data sources?

Keep in mind: A data source qualifies for the cloud, if the data is 
available via dereferencable URIs and if the data source is 
interlinked with at least one other source (meaning it references URIs 
within the namespace of the other source).


Anja



Anja,

You have MySpace and Flickr Wrappers but still don't include all the 
Virtuoso Sponger Cartridges (which are wrappers) to the Cloud [1] ?


Also, the LODD data sets page should be linked to: 
http://esw.w3.org/topic/DataSetRDFDumps, so we can track down the dumps 
with ease re. the Virtuoso LOD hosting instance.


links:

1. http://virtuoso.openlinksw.com/images/sponger-cloud.html

--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com








Re: New LOD Cloud - Please send us links to missing data sources

2008-09-19 Thread [EMAIL PROTECTED]

On 19 Sep 2008, at 13:07, [EMAIL PROTECTED] wrote:



but by that token you could probably wipe out most of foaf and doap
space from the diagram

Most of that data is not very linky and many primary resources being
described don't have uris


Sure, I am guessing you are talking about foaf:knowing b-nodes, but
the use of rdfs:seeAlso with an IFP should satisfy the notion of
being linky, or am I wrong in thinking this?


Cheers,

Mischa




On 9/19/08, Tom Heath [EMAIL PROTECTED] wrote:


Hey Mischa,

Good to hear you :)

Just to add to what Peter said, last time I checked LiveJournal was
not very Linked Data-friendly, which is a shame, naturally, as they
were well ahead of the curve with the FOAF export.

Cheers,

Tom.



2008/9/19 Peter Ansell [EMAIL PROTECTED]:

- [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:


From: [EMAIL PROTECTED] [EMAIL PROTECTED]
To: public-lod@w3.org
Sent: Friday, September 19, 2008 1:55:07 AM GMT +10:00 Brisbane
Subject: Re: New LOD Cloud - Please send us links to missing data  
sources


Hello,


There doesn't seem to be any mention of LiveJournal or any of the
LiveJournal-powered blogging sites, such as vox, friendfeed, and hi5, to
name a few.


I think they are implicitly in the FOAF cloud, for want of a better
description of that node ;)

Cheers,

Peter

Find out more about Talis at www.talis.com
Shared InnovationTM


Any views or personal opinions expressed within this email may not be
those of Talis Information Ltd. The content of this email message and any
files that may be attached are confidential, and for the usage of the
intended recipient only. If you are not the intended recipient, then
please return this message to the sender and delete it. Any use of this
e-mail by an unauthorised recipient is prohibited.


Talis Information Ltd is a member of the Talis Group of companies and is
registered in England No 3638278 with its registered office at Knights
Court, Solihull Parkway, Birmingham Business Park, B37 7YB.

__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email
__








___
Mischa Tuffield
Email: [EMAIL PROTECTED]
Homepage - http://users.ecs.soton.ac.uk/mmt04r/
FOAF - http://users.ecs.soton.ac.uk/mmt04r/foaf.rdf



Re: New LOD Cloud - Please send us links to missing data sources

2008-09-19 Thread Ian Davis
I wonder if we could highlight those doing a great job in this space more,
e.g. I believe Opera's FOAF output is LOD.

On Fri, Sep 19, 2008 at 1:45 PM, Tom Heath [EMAIL PROTECTED] wrote:

 Sad but true. Things are improving in my experience but we still have
 some evangelism to do in this area.

 On 19/09/2008, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
  but by that token you could probably wipe out most of foaf and doap
  space from the diagram
 
  Most of that data is not very linky and many primary resources being
  described don't have uris
 
  On 9/19/08, Tom Heath [EMAIL PROTECTED] wrote:
 
  Hey Mischa,
 
  Good to hear you :)
 
  Just to add to what Peter said, last time I checked LiveJournal was
  not very Linked Data-friendly, which is a shame, naturally, as they
  were well ahead of the curve with the FOAF export.
 
  Cheers,
 
  Tom.
 
 
 
  2008/9/19 Peter Ansell [EMAIL PROTECTED]:
  - [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 
  From: [EMAIL PROTECTED] [EMAIL PROTECTED]
  To: public-lod@w3.org
  Sent: Friday, September 19, 2008 1:55:07 AM GMT +10:00 Brisbane
  Subject: Re: New LOD Cloud - Please send us links to missing data
  sources
 
  Hello,
 
 
  There doesn't seem to be any mention of LiveJournal or any of the
  LiveJournal-powered blogging sites, such as: vox, friendfeed, hi5 to
  name a few.
 
  I think they are implicitly in the FOAF cloud, for want of a better
  description of that node ;)
 
  Cheers,
 
  Peter
 
 



Re: New LOD Cloud - Please send us links to missing data sources

2008-09-18 Thread [EMAIL PROTECTED]

Hello,

There doesn't seem to be any mention of LiveJournal or any of the 
LiveJournal-powered blogging sites, such as: vox, friendfeed, hi5 to 
name a few.


And you are missing links from QDOS to DBpedia :)

Cheers,

Mischa

On 17 Sep 2008, at 15:14, Anja Jentzsch wrote:


Hi all,

thanks for all the input.

Find the updated LOD cloud attached.

We added:
6. Surge Radio
7. MySpace Wrapper
8. BBC Programmes
9. BBC Playcount Data
and several new connections between the datasets

We are still in contact with Kingsley on adding the mentioned  
wrappers to the cloud.


Anything still missing?

Anja

Chris Bizer wrote:

Hi all,
Anja and I are currently updating the LOD cloud for the ESW  
wikipage. Draft attached.

Up till now we have added:
1. CrunchBase
2. LinkedMDB
3. YAGO
4. UMBEL
5. the PubGuide
It is nice to see that fitting everything into one diagram is getting 
increasingly difficult as the cloud grows :-)

Did we forget any new data sources or links between data sources?
As discussed before: A data source qualifies for the cloud if the 
data is available via dereferenceable URIs and if the data source is 
interlinked with at least one other source (meaning it references 
URIs within the namespace of the other source).

Any feedback highly welcome.
Cheers
Chris
--
Prof. Dr. Chris Bizer
Freie Universität Berlin
Phone: +49 30 838 55509
Mail: [EMAIL PROTECTED]
Web: www.bizer.de


lod-datasets_2008-09-17.png


___
Mischa Tuffield
Email: [EMAIL PROTECTED]
Homepage - http://users.ecs.soton.ac.uk/mmt04r/
FOAF - http://users.ecs.soton.ac.uk/mmt04r/foaf.rdf



Re: New LOD Cloud - Please send us links to missing data sources

2008-09-17 Thread Paul Miller
Can I echo Tom's "Looking forward to the new diagram. If it was 
available before Friday that would be great, as I'll be giving a talk 
and would love to show the enlarged cloud ;)" ???


I don't need it till Monday, though...  ;-)

Paul

--
Paul Miller
Technology Evangelist, Talis
w: www.talis.com/  skype: napm1971
mobile/cell: +44 7769 740083

http://blogs.zdnet.com/semantic-web/

www.linkedin.com/in/pau1mi11er




On 17 Sep 2008, at 13:05, Tom Heath wrote:



Hi Chris,

A couple of additions/questions:

- The links between Revyu and the SW Conference Corpus are two-way
- Should Ontoworld be renamed SemanticWeb.org?
- Revyu and SemanticWeb.org have two-way links
- Is the BBC stuff going to be added?

Looking forward to the new diagram. If it was available before Friday
that would be great, as I'll be giving a talk and would love to show
the enlarged cloud ;)

Cheers,

Tom.


2008/9/16 Chris Bizer [EMAIL PROTECTED]:

Hi all,

Anja and I are currently updating the LOD cloud for the ESW wikipage.
Draft attached.

Up till now we have added:

1. CrunchBase
2. LinkedMDB
3. YAGO
4. UMBEL
5. the PubGuide

It is nice to see that fitting everything into one diagram is getting
increasingly difficult as the cloud grows :-)

Did we forget any new data sources or links between data sources?

As discussed before: A data source qualifies for the cloud if the data is
available via dereferenceable URIs and if the data source is interlinked
with at least one other source (meaning it references URIs within the
namespace of the other source).

Any feedback highly welcome.

Cheers

Chris




--
Prof. Dr. Chris Bizer
Freie Universität Berlin
Phone: +49 30 838 55509
Mail: [EMAIL PROTECTED]
Web: www.bizer.de
