Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices

2010-10-22 Thread Denny Vrandecic
I usually dislike commenting on such discussions, as I don't find them 
particularly productive, but 1) since the number of people pointing me to this 
thread is growing, 2) it contains some wrong statements, and 3) I feel that 
this thread has been hijacked from a topic that I consider productive and 
important, I hope you won't mind me giving a comment. I wanted to keep it 
brief, but I failed.

Let's start with the wrong statements:

First, although I take responsibility as a co-creator for Linked Open Numbers, 
I surely cannot take full credit for it. The dataset was a shared effort by a 
number of people in Karlsruhe over a few days, and thus calling the whole thing 
"Denny's numbers dataset" is simply wrong, given the effort my colleagues spent 
on it. It is fine to call it "Karlsruhe's numbers dataset" or simply 
Linked Open Numbers, but giving me sole attribution is too much of an honor.

Second, although it is claimed that Linked Open Numbers are "by design and 
known to everybody in the core community, not data but noise", being one of the 
co-designers of the system I have to disagree. It is noise by design. One of 
my motivations for LON was to raise a few points for discussion, and at the 
same time provide a dataset fully adhering to Linked Open Data principles. 
We were obviously able to get the first goal right, and we didn't do too badly 
on the second, even though we got an interesting list of bugs from Richard 
Cyganiak, which, regrettably, we still have not fixed. I am very sorry for 
that. But, to make the point very clear again, this dataset was designed to 
follow LOD principles as closely as possible, to be correct, and to have an 
implementation so simple that we are usually up, so that anyone can use LON as 
a testing ground. From a number of mails and personal communications I know 
that LON has been used in that sense, and some developers even found it useful 
for other features, like our provision of number names in several languages. 
So what is called "noise by design" here is actually an actively used dataset 
that managed to raise, as we had hoped, discussions about the point of counting 
triples, was a factor in the discussion about literals as subjects, made us 
rethink the notion of semantics and computational properties of RDF entities in 
a different way, and is involved in the discussion about the quality of LOD. 
With respect to that, in my opinion, LON has met and exceeded expectations, but 
I understand anyone who disagrees. Besides that, it was, and is, huge fun.

Now to some topics of the discussion:

On the issue of the LOD cloud diagram. I want to express my gratitude to all 
the people involved, for the effort they voluntarily put into its development 
and maintenance. I find it especially great that it is becoming increasingly 
transparent how the diagram is created and how the datasets are selected. Chris 
has referred to a set of conditions that are expected for inclusion, and before 
the creation of the newest iteration there was an explicit call on this mailing 
list to gather more information. I can only echo the sentiment that if someone 
is unhappy with that diagram, they are free to create their own and put it 
online. The data is available, the SVG is available and editable, and they use 
licenses that allow modification and republishing.

Enrico is right that a system like Watson (or Sindice), which automatically 
gathers datasets from the Web instead of using a manually submitted and managed 
catalog, will probably turn out to be the better approach. Watson used to have 
an overview with statistics on its current content, and I really loved that 
overview, but this feature has been disabled for a few months now. If it were 
available, especially in a graphical format that can be easily reused in 
slides -- for example, graphs of the growth in the number of triples, datasets, 
etc., or graphs of the change in cohesion, vocabulary reuse, etc. over time, 
within the Watson corpus -- I have no doubt that such graphs and data would be 
widely reused, and would in many instances replace the current usage of the 
cloud diagram. (I am furthermore curious about Enrico's statement that the 
Semantic Web =/= Linked Open Data and wonder what he means here, but that 
is a completely different thread.)

Finally, to what I consider most important in this thread:

I also find it a shame that this thread has been hijacked, especially since 
the original topic was so interesting. The original email by Anja was not about 
the LOD cloud, but rather about -- as the title of the thread still suggests -- 
the compliance of LOD with some best practices. Instead of the question "is X 
in the diagram?", I would much rather see a discussion on "are the selected 
quality criteria good criteria? Why are some of them so little followed? How 
can we improve the situation?" Anja has pointed to a wealth of openly available 
numbers (no pun intended), that have not been discussed at all. For example, 
only 7.5% of the data sources provide a mapping of proprietary vocabulary terms 
to other vocabulary terms. For anyone building applications to work with LOD, 
this is a real problem.

Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices

2010-10-22 Thread Denny Vrandecic

On Oct 21, 2010, at 23:43, Denny Vrandecic wrote:
 Second, although it is claimed that Linked Open Numbers are by design and 
 known to everybody in the core community, not data but noise, being one of 
 the co-designers of the system I have to disagree. It is noise by design.

Even though I reread my message before sending, I missed the quite relevant 
"not" in the second sentence. It should read: "It is not noise by design." :P

Cheers,
Denny :)


Re: Ontos links to LOD

2010-10-22 Thread Christian Ehrlich
Hugh,

I'm also not a cURL specialist, but I assume the problem is the '#' in our
GUID. Unfortunately, for historical reasons, we have the hash sign in there,
and it is a real part of the GUID, not a fragment identifier. You would have
to URL-encode our GUID, or replace '#' with '%23' manually, in order to get it
to run properly. Try doing this for the ontosearch.com URL in
http://www.sameas.org/?uri=http://dbpedia.org/resource/Berners-Lee - it will
work.
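
For anyone who wants to script this, a minimal Python 3 sketch of the same
work-around (the GUID below is just an illustrative Ontos identifier; the only
point is that urlencode() turns '#' into '%23' inside the query string):

  from urllib.parse import urlencode

  # Example Ontos GUID; in practice it contains a literal '#' that must not
  # be treated as a fragment identifier.
  guid = "http://www.ontosearch.com/2008/01/rdf/EID-2e70185c38e929aa90049982de43414c"

  # urlencode() percent-encodes the whole value, so a '#' survives as %23.
  lookup_url = "http://www.sameas.org/?" + urlencode({"uri": guid})
  print(lookup_url)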

Concerning your question about additional sameAs data, please wait for my
answer to Alex.

Best,
Christian





AW: Ontos links to LOD

2010-10-22 Thread Christian Ehrlich
Hi Alex, Hugh,

You can use our API to query the db using a DBpedia (or Freebase etc.)
identifier. Here is an example (for Barack Obama):

http://news.ontos.com/api/ontology?query={get:attrents,offset:0,limit:30,
typeFilter:http://www.ontosearch.com/2008/02/ontosminer-ns/domain/common/english#Person,
attrnames:[http://www.ontosearch.com/2008/02/ontosminer-ns/domain/common/english/dbpedia#sameAs],
attrvals:[http://dbpedia.org/resource/Barack_Obama]}

At the moment you have to know the Ontos type to do this, i.e. you have to
match DBpedia types to Ontos types yourself. We are planning to extend this
functionality to make it simpler.
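
For the curious, a rough Python 3 sketch of issuing the example query above.
The query string is taken verbatim from the example; the API-key parameter
name shown here is only a placeholder - see our documentation for the actual
name:

  from urllib.parse import quote
  from urllib.request import urlopen

  # The example query for Barack Obama, exactly as shown above.
  query = ('{get:attrents,offset:0,limit:30,'
           'typeFilter:http://www.ontosearch.com/2008/02/ontosminer-ns/'
           'domain/common/english#Person,'
           'attrnames:[http://www.ontosearch.com/2008/02/ontosminer-ns/'
           'domain/common/english/dbpedia#sameAs],'
           'attrvals:[http://dbpedia.org/resource/Barack_Obama]}')

  # Percent-encode the query and append the (placeholder) API key parameter.
  url = ('http://news.ontos.com/api/ontology?query=' + quote(query, safe='')
         + '&key=YOUR_API_KEY')
  print(urlopen(url).read()[:500])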

You can of course use our API to make queries by object name or fulltext and
use resulting objects as starting points for exploring the db, i. e. to
find additional ones. Also see Kingsley's hint.

Regards,
Christian


 -----Original Message-----
 From: Alexandre Passant [mailto:alexandre.pass...@deri.org]
 Sent: Thursday, 21 October 2010 15:27
 To: Christian Ehrlich
 Cc: public-lod@w3.org
 Subject: Re: Ontos links to LOD
 
 Hi Christian,
 
 On 20 Oct 2010, at 10:33, Christian Ehrlich wrote:
 
  Dear all,
 
  Please note that Ontos is about to integrate its news portal
  [http://news.ontos.com] into the Linked Data Cloud. Ontos' GUIDs for objects
  are now dereferenceable - the resulting RDF contains owl:sameAs attributes to
  DBpedia, Freebase and others (check e.g. the entry for Barack Obama
  [http://www.ontosearch.com/2008/01/rdf/EID-2e70185c38e929aa90049982de43414c]).
 
  Within the portal Ontos crawls news articles from diverse online sources,
  uses its cutting-edge NLP technology to extract facts (objects and relations
  between them), merges this information with existing information, and stores
  it together with references to the original news articles - all of this
  fully automatically. Facts from Ontos' portal are accessible via a RESTful
  HTTP API. Fetching data is free - in order to receive an API key, developers
  have to register (e-mail address only!) at Ontos' homepage
  [http://www.ontos.com].
 
 
 Is there a way to query the API with a DBpedia identifier rather than
 an Ontos one ?
 Or at least, is there somewhere a DBpedia 2 Ontos service (or a SPARQL
 endpoint where I can get that information) ?
 
 Thanks
 
 Alex.
 
  For humans Ontos provides a search interface at http://www.ontosearch.com.
  It allows looking up objects in the database and viewing the respective
  summaries in HTML or RDF.

  The generated RDF currently contains only a small part of the existing
  information (e.g. no article references yet), and owl:sameAs is only
  supported for Persons and Organizations. Ontos will extend the respective
  content step by step.
 
  Any tests with our API as well as comments are highly appreciated.
 
  Regards,
  Christian
 
  --
  Christian Ehrlich
  Ontos AG
 
  Telefon: +49 341 21559-10
  Telefax: +49 341 21559-11
  Mobil:   +49 173 8745000
  christian.ehrl...@ontos.com
  http://www.ontos.com
 
 
 
 
 
 
 --
 Dr. Alexandre Passant
 Digital Enterprise Research Institute
 National University of Ireland, Galway
 :me owl:sameAs http://apassant.net/alex .
 
 
 
 





AW: AW: ANN: LOD Cloud - Statistics and compliance with best practices

2010-10-22 Thread Chris Bizer
Hi Denny,

thank you for your smart and insightful comments.

 I also find it a shame that this thread has been hijacked, especially since
 the original topic was so interesting. The original email by Anja was not
 about the LOD cloud, but rather about -- as the title of the thread still
 suggests -- the compliance of LOD with some best practices. Instead of the
 question "is X in the diagram?", I would much rather see a discussion on "are
 the selected quality criteria good criteria? Why are some of them so little
 followed? How can we improve the situation?"

Absolutely. Opening up the discussion on these topics is exactly the reason
why we compiled the statistics.

In order to guide the discussion back to this topic, maybe it is useful to
repost the original link:

http://www4.wiwiss.fu-berlin.de/lodcloud/state/

A quick initial comment concerning the term "quality criteria". I think it
is essential to distinguish between:

1. The quality of the way data is published, meaning to what extent the
publishers comply with best practices (a possible set of best practices is
listed in the document).
2. The quality of the data itself. I think Enrico's comment was going in
this direction.

The Web of documents is an open system built on people agreeing on standards
and best practices.
Open system means in this context that everybody can publish content and
that there are no restrictions on the quality of the content.
This is in my opinion one of the central facts that made the Web successful.

The same is true for the Web of Data. There obviously cannot be any
restrictions on what people can/should publish (including different opinions
on a topic, but also pure SPAM). As on the classic Web, it is the job of the
information/data consumer to figure out which data it wants to believe and use
(definition of information quality = usefulness of information, which is a
subjective thing). 

Thus it also does not make sense to discuss the objective quality of the data
that should be included in the LOD cloud (objective quality just does not
exist); it makes much more sense to discuss the major issues that we are still
having with regard to compliance with publishing best practices.

 Anja has pointed to a wealth of openly available numbers (no pun intended),
 that have not been discussed at all. For example, only 7.5% of the data
 sources provide a mapping of proprietary vocabulary terms to other vocabulary
 terms. For anyone building applications to work with LOD, this is a real
 problem.

Yes, this is also the figure that scared me most.

 but in order to figure out what really needs to be done, and how the
 criteria for good data on the Semantic Web should look, we need to get back
 to Anja's original questions. I think that is a question we may try to tackle
 in Shanghai in some form; I at least would find that an interesting topic.

Same with me. 
Shanghai was also the reason for the timing of the post.

Cheers,

Chris

 -----Original Message-----
 From: semantic-web-requ...@w3.org [mailto:semantic-web-
 requ...@w3.org] On Behalf Of Denny Vrandecic
 Sent: Friday, 22 October 2010 08:44
 To: Martin Hepp
 Cc: Kingsley Idehen; public-lod; Enrico Motta; Chris Bizer; Thomas Steiner;
 Semantic Web; Anja Jentzsch; semanticweb; Giovanni Tummarello; Mathieu
 d'Aquin
 Subject: Re: AW: ANN: LOD Cloud - Statistics and compliance with best
 practices
 
 I usually dislike commenting on such discussions, as I don't find them
 particularly productive, but 1) since the number of people pointing me to
 this thread is growing, 2) it contains some wrong statements, and 3) I feel
 that this thread has been hijacked from a topic that I consider productive
 and important, I hope you won't mind me giving a comment. I wanted to keep it
 brief, but I failed.

 Let's start with the wrong statements:

 First, although I take responsibility as a co-creator for Linked Open
 Numbers, I surely cannot take full credit for it. The dataset was a shared
 effort by a number of people in Karlsruhe over a few days, and thus calling
 the whole thing "Denny's numbers dataset" is simply wrong, given the effort
 my colleagues spent on it. It is fine to call it "Karlsruhe's numbers
 dataset" or simply Linked Open Numbers, but giving me sole attribution is
 too much of an honor.

 Second, although it is claimed that Linked Open Numbers are "by design and
 known to everybody in the core community, not data but noise", being one of
 the co-designers of the system I have to disagree. It is noise by design.
 One of my motivations for LON was to raise a few points for discussion, and
 at the same time provide a dataset fully adhering to Linked Open Data
 principles. We were obviously able to get the first goal right, and we
 didn't do too badly on the second, even though we got an interesting list of
 bugs from Richard Cyganiak, which, regrettably, we still have not fixed. I am
 very sorry for that.
 But, to make the point very 

Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices

2010-10-22 Thread Martin Hepp
The Web of documents is an open system built on people agreeing on standards
and best practices.
Open system means in this context that everybody can publish content and
that there are no restrictions on the quality of the content.
This is in my opinion one of the central facts that made the Web successful.


+100


The same is true for the Web of Data. There obviously cannot be any
restrictions on what people can/should publish (including different
opinions on a topic, but also including pure SPAM). As on the classic Web,
it is a job of the information/data consumer to figure out which data it
wants to believe and use (definition of information quality = usefulness of
information, which is a subjective thing).
+100


The fact that there is obviously a lot of low quality data on the current Web
should not encourage us to publish masses of low-quality data and then
celebrate ourselves for having achieved a lot. The current Web tolerates buggy
markup, broken links, and questionable content of all types. But I hope
everybody agrees that the Web is successful because of this tolerance, not
because of the buggy content itself. Quite to the contrary, the Web has been
broadly adopted because of the wealth of commonly agreed high-quality content.

If you continue to live the "linked data landfill" style it will fall back on
you, reputation-wise, funding-wise, and career-wise. Some rules hold in
ecosystems of all kinds and sizes.


Best

Martin




AW: AW: ANN: LOD Cloud - Statistics and compliance with best practices

2010-10-22 Thread Chris Bizer
Hi Martin,

 The fact that there is obviously a lot of low quality data on the
 current Web should not encourage us to publish masses of low-quality
 data and then celebrate ourselves for having achieved a lot. The
 current Web tolerates buggy markup, broken links, and questionable
 content of all types. But I hope everybody agrees that the Web is
 successful because of this tolerance, not because of the buggy content
 itself. Quite to the contrary, the Web has been broadly adopted
 because of the lots of commonly agreed high-quality contents.

Sure, where is the problem? 

The same holds for the Web of Data: there is a lot of high quality content
and a lot of low quality content.
Which means - as on the classic Web - that the data consumer needs to decide
which content it wants to use.

If the Web has proved anything, it is that having a completely open
architecture is a crucial factor for being able to succeed on a global scale. 
The Web of Linked Data also aims at global scale. Thus, I will keep on
betting on open solutions without curation or any other bottleneck. 

 If you continue to live the linked data landfill style it will fall
 back on you, reputation-wise, funding-wise, and career-wise. Some
 rules hold in ecosystems of all kinds and sizes.

Sorry, you are leaving the grounds of scientific discussion here and I will
thus not comment.

Best,

Chris


 Best
 
 Martin





Low Quality Data (was before Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices)

2010-10-22 Thread Juan Sequeda
Martin and all,

Can somebody point me to papers, or maybe give their definition of "low
quality data" when it comes to LOD? What are the criteria for data to be
considered low quality?

Thanks

Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com


On Fri, Oct 22, 2010 at 9:01 AM, Martin Hepp 
martin.h...@ebusiness-unibw.org wrote:

  The Web of documents is an open system built on people agreeing on
 standards
 and best practices.
 Open system means in this context that everybody can publish content and
 that there are no restrictions on the quality of the content.
 This is in my opinion one of the central facts that made the Web
 successful.

 +100


 The same is true for the Web of Data. There obviously cannot be any
 restrictions on what people can/should publish (including, different
 opinions on a topic, but also including pure SPAM). As on the classic Web,
 it is a job of the information/data consumer to figure out which data it
 wants to believe and use (definition of information quality = usefulness
 of
 information, which is a subjective thing).
 +100


 The fact that there is obviously a lot of low quality data on the current
 Web should not encourage us to publish masses of low-quality data and then
 celebrate ourselves for having achieved a lot. The current Web tolerates
 buggy markup, broken links, and questionable content of all types. But I
 hope everybody agrees that the Web is successful because of this tolerance,
 not because of the buggy content itself. Quite to the contrary, the Web has
 been broadly adopted because of the lots of commonly agreed high-quality
 contents.

 If you continue to live the linked data landfill style it will fall back on
 you, reputation-wise, funding-wise, and career-wise. Some rules hold in
 ecosystems of all kinds and sizes.

 Best

 Martin




Re: Low Quality Data (was before Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices)

2010-10-22 Thread Leigh Dodds
Hi,

On 22 October 2010 15:47, Juan Sequeda juanfeder...@gmail.com wrote:

 Martin and all,
 Can somebody point me to papers or maybe give their definition of low quality 
 data when it comes to LOD. What is the criteria
 for data to be considered low quality.

I asked this question, in the context of Linked Data, on Semantic Overflow:

http://www.semanticoverflow.com/questions/1072/quality-indicators-for-linked-data-datasets

Some good discussion and pointers in there.

Cheers,

L.

--
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.do...@talis.com
http://www.talis.com



Re: Low Quality Data (was before Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices)

2010-10-22 Thread Chris Bizer
Hi Juan,

 Martin and all,

 Can somebody point me to papers or maybe give their definition of low
 quality data when it comes to LOD. What is the criteria for data to be
 considered low quality.

An overview of the literature on data quality can be found in my PhD thesis,
including the different definitions of the term and the like.

See:

http://www.diss.fu-berlin.de/diss/servlets/MCRFileNodeServlet/FUDISS_derivate_2736/02_Chapter2-Information-Quality.pdf?hosts=

also

http://www.diss.fu-berlin.de/2007/217/indexe.html

All this is from 2008. Thus, I guess there will also be newer stuff around,
but the text should properly reflect the state of the art back then.

Cheers,

Chris

Thanks


Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com



On Fri, Oct 22, 2010 at 9:01 AM, Martin Hepp
martin.h...@ebusiness-unibw.org wrote:

The Web of documents is an open system built on people agreeing on standards
and best practices.
Open system means in this context that everybody can publish content and
that there are no restrictions on the quality of the content.
This is in my opinion one of the central facts that made the Web successful.

+100


The same is true for the Web of Data. There obviously cannot be any
restrictions on what people can/should publish (including, different
opinions on a topic, but also including pure SPAM). As on the classic Web,
it is a job of the information/data consumer to figure out which data it
wants to believe and use (definition of information quality = usefulness of
information, which is a subjective thing).
+100

 

The fact that there is obviously a lot of low quality data on the current
Web should not encourage us to publish masses of low-quality data and then
celebrate ourselves for having achieved a lot. The current Web tolerates
buggy markup, broken links, and questionable content of all types. But I
hope everybody agrees that the Web is successful because of this tolerance,
not because of the buggy content itself. Quite to the contrary, the Web has
been broadly adopted because of the lots of commonly agreed high-quality
contents.

If you continue to live the linked data landfill style it will fall back on
you, reputation-wise, funding-wise, and career-wise. Some rules hold in
ecosystems of all kinds and sizes.

Best

Martin

 



Types of Data Source on the LOD Cloud

2010-10-22 Thread Leigh Dodds
Hi,

The LOD cloud analysis [1] is a really great piece of work. I wanted
to pick up on one aspect of the analysis for further discussion:
whether data is published by the data owner or a third-party.

It seems to me that there are broadly three categories into which a
dataset might fall:

* Primary -- published and maintained directly by the data owner, e.g. BBC
* Secondary -- published and maintained by a third-party, e.g. by
scraping, wrapping or otherwise converting a data source
* Tertiary -- published and maintained by a third-party, usually a
mirror or aggregation of primary/secondary sources. This might be a
direct mirror, or involve some additional creativity, e.g.
re-modelling some aspects of another dataset. Mirrors typically
provide additional services, e.g. a SPARQL endpoint where the primary
source doesn't provide one.

If we consider the different categories we can see that:

* Growth of the web of data is best served by encouraging more Primary
sources. The current community can't scale to add more Secondary
sources, so adoption is best driven by data owners

* Sustainability and usage of Linked Data is best served by
encouraging more Tertiary sources. Availability of useful, current
aggregations of data, wrapped in services will help drive more
consumption.

What do others think?

Cheers,

L.

[1]. http://www4.wiwiss.fu-berlin.de/lodcloud/state/

-- 
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.do...@talis.com
http://www.talis.com



Schema Mappings (was Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices)

2010-10-22 Thread Leigh Dodds
Hi,

On 22 October 2010 09:35, Chris Bizer ch...@bizer.de wrote:
 Anja has pointed to a wealth of openly available numbers (no pun intended),
 that have not been discussed at all. For example, only 7.5% of the data
 sources provide a mapping of proprietary vocabulary terms to other vocabulary
 terms. For anyone building applications to work with LOD, this is a real
 problem.

 Yes, this is also the figure that scared me most.

This might be low for a good reason: people may be creating
proprietary terms because they don't feel well served by existing
vocabularies and hence defining mappings (or even just reusing terms)
may be difficult or even impossible.

This also strikes me as an opportunity: someone could usefully build a
service (perhaps built on facilities in Sindice) that aggregated schema
information and provided tools for expressing simple mappings and
equivalencies. It could fill a dual role: recommending more common/preferred
terms, whilst simultaneously providing machine-readable equivalencies.
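
To make the "machine-readable equivalencies" part concrete, here is a small
sketch of the kind of statements such a service could emit, using rdflib in
Python (the ex: property names are invented for illustration):

  from rdflib import Graph, Namespace
  from rdflib.namespace import OWL, RDFS

  EX = Namespace("http://example.org/vocab#")   # invented proprietary vocabulary
  FOAF = Namespace("http://xmlns.com/foaf/0.1/")

  g = Graph()
  g.bind("ex", EX)
  g.bind("foaf", FOAF)
  # A strong mapping: the proprietary term means the same as foaf:name.
  g.add((EX.fullName, OWL.equivalentProperty, FOAF.name))
  # A weaker mapping: the proprietary term is narrower than the target term.
  g.add((EX.tradingName, RDFS.subPropertyOf, FOAF.name))

  print(g.serialize(format="turtle"))

Publishing even a handful of such triples alongside a proprietary vocabulary
would address exactly the 7.5% figure discussed above.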

I know that Uberblic provides some mapping tools in this area, allowing for
the creation of a more normalized view across the web, but I'm not sure how
much of that is resurfaced.

Cheers,

L.

-- 
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.do...@talis.com
http://www.talis.com



Concordance, Reconciliation, and shared identifiers

2010-10-22 Thread Leigh Dodds
Hi,

The announcement that the Guardian has begun cataloguing other
identifiers (e.g. ISBN, MusicBrainz) within its API [1] is a nice
illustration that the importance of cross-linking between datasets is
starting to become more generally accepted. Setting aside the debate
about what constitutes linked data, I think it's important that this
community tracks these various initiatives to help explore the
trade-offs between different approaches, as well as to build bridges
with the wider developer community.

A great project would be for someone to produce a Linked Data wrapper
for the Guardian API that allows linking *in* to their data, based on
ISBNs and MusicBrainz IDs. It's on my TODO list, but then so is a lot
of other stuff ;)

If we look back a few months we can see signs of the importance of
cross-linking appearing in other projects. Google Refine (née Freebase
Gridworks) has the notion of a reconciliation service that is used
to build and set links [2]. Yahoo meanwhile have their concordance
service [3, 4], which is basically a sameAs.org-style service for building
cross-links between geo data.

Again, it would be interesting to build bridges between different
communities by showing how one can achieve the same effects with
Linked Data, as well as integrating Linked Data into those services by
providing gateway services, e.g. implementing the same API but backed
by RDF. This is what I did for Gridworks, but the same could be
extended to other services.

Cheers,

L.

[1]. http://www.guardian.co.uk/open-platform/blog/linked-data-open-platform
[2]. 
http://www.ldodds.com/blog/2010/08/gridworks-reconciliation-api-implementation/
[3]. 
http://blog.programmableweb.com/2010/04/05/yahoos-new-geo-concordance-a-geographic-rosetta-stone/
[4]. 
http://developer.yahoo.com/geo/geoplanet/guide/api-reference.html#api-concordance

-- 
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.do...@talis.com
http://www.talis.com



Re: AW: AW: ANN: LOD Cloud - Statistics and compliance with best practices

2010-10-22 Thread Enrico Franconi
I happen to agree with Martin here.
My concern is that the naïveté of most of the research in LOD creates the 
illusion that data integration is an easily solvable problem -- while it is 
well known that it is the most important open problem in the database 
community (30+ years of research), with a huge amount of money, research, and 
resources invested in it. This will eventually backfire on us - the whole 
community, including me - since people will not trust us anymore.
Specifically, you can't deny that in practice the mythical picture gives this 
illusion; otherwise, why have it?
cheers
--e.

On 22 Oct 2010, at 16:49, Chris Bizer wrote:

 Hi Martin,
 
 The fact that there is obviously a lot of low quality data on the
 current Web should not encourage us to publish masses of low-quality
 data and then celebrate ourselves for having achieved a lot. The
 current Web tolerates buggy markup, broken links, and questionable
 content of all types. But I hope everybody agrees that the Web is
 successful because of this tolerance, not because of the buggy content
 itself. Quite to the contrary, the Web has been broadly adopted
 because of the lots of commonly agreed high-quality contents.
 
 Sure, where is the problem? 
 
 The same holds for the Web of Data: There is a lot of high quality content
 and a lot of low quality content.
 Which means - as on the classic Web - that the data consumer need to decide
 which content it wants to use.
 
 If the Web has proved anything than that having a completely open
 architecture is a crucial factor for being able to succeed on global scale. 
 The Web of Linked Data also aims at global scale. Thus, I will keep on
 betting on open solutions without curation or any other bottle neck. 
 
 If you continue to live the linked data landfill style it will fall
 back on you, reputation-wise, funding-wise, and career-wise. Some
 rules hold in ecosystems of all kinds and sizes.
 
 Sorry, you are leaving the grounds of scientific discussion here and I will
 thus not comment.
 
 Best,
 
 Chris
 
 
 Best
 
 Martin
 
 
 




Re: Correct Usage of rdfs:isDefinedBy in Vocabulary Specifications with a Hash-based URI Pattern

2010-10-22 Thread Pat Hayes

On Oct 21, 2010, at 10:05 AM, KangHao Lu (Kenny) wrote:

 Hello Martin,
 
 I don't think my argument would be very logical, but we can't wait for rule 
 engines to discuss this.
 
 Note, however, the majority of the Web vocabularies use the same URI for the 
 entity name reference and the descriptor reference, see the link provided by 
 Michael Hausenblas:
 
  http://code.google.com/p/void-impl/issues/detail?id=45
 
 and in particular the little survey by Richard Cyganiak posted on that page.
 
 I personally would argue that in the case of ontologies / vocabularies, the 
 conceptual difference between the entity and the descriptor is a lot less 
 significant than when it comes to data, since an ontology is, by definition, 
 a specification, i.e. a document.
 
 
 Basically I like this approach, that is, I don't like the fact that some 
 ontologies have '#' as the end character; there should not be one URI for an 
 ontology document and a different URI for the *conceptual* ontology.
 
 IIRC, 3 years ago Tim was very shocked by those ontologies that have '#' as 
 the end character and claimed that this is not a good idea (and he would 
 bring up this issue at the TAG or awwaw, I can't remember). The argument was 
 that the string after '#' has the meaning of 'local identifier' (so that we 
 use #I #i for WebIDs because 'I' is a 'local identifier') and identifiers 
 can't be empty strings (or this might break some systems, I guess). I somehow 
 agree with that, and Toby's use of my: to identify an Ontology makes me a 
 little bit uncomfortable. I have no idea if there's any followup after Tim 
 brought this to the TAG or awwaw.
 
 I have another argument, namely, you should distinguish the concept from the 
 document only if the following criterion is satisfied:

 - the time when the thing with the hash URI is created and the time when the 
 document is created have a *clear* difference

 So this holds for people, so people should not use document URIs. This holds 
 for organizations, because you create the website of an organization maybe 
 some years after the organization is founded. 

 The problem is 'ontology'. I don't know whether you should call the structure 
 an ontology or whether it became an ontology once it was written down, but I 
 don't think the difference in timing is very *clear*.

I agree it's not as clear as the other cases, but an argument for making the 
distinction would be that the same ontology can be encoded by different 
documents. For example, an OWL/RDF ontology *is* an RDF graph, but that graph 
can be represented/encoded/choose your word ... in a variety of different 
documents with different syntax rules. This is an old and familiar 
distinction, really, between type and token: one work called Moby Dick, many 
copies of it; one third letter of the alphabet, as many copies of the 'C' 
character as you can shake a stick at, and so on.

 A similar example is when you want to give a URI to a Python module. I would 
 not end it with '#' because I don't see why we need to distinguish the 
 'module document' from the 'module'.

That case is much blurrier, I agree. But imagine an algorithm implemented once 
in Python and elsewhere in C++. (Really, this is possible.) Same algorithm, 
very different documents. Ontologies are (arguably) more like algorithms than 
pieces of code.

Pat Hayes

 A module is a kind of document, and so is an ontology. So, owl:Ontology 
 rdfs:subClassOf foaf:Document !

 Well, this is a theory. If there's a common practice of using '#'-ending URIs 
 for ontologies, maybe we should accept it.
 
 No strong opinion. Wasn't this discussed at AWWAW? Just curious.
 
 Cheers,
 
 --
 Kenny
 WebID: http://dig.csail.mit.edu/People/kennyluck#I
 What is WebID: http://esw.w3.org/WebID


IHMC (850)434 8903 or (650)494 3973   
40 South Alcaniz St.   (850)202 4416   office
Pensacola(850)202 4440   fax
FL 32502  (850)291 0667   mobile
phayesAT-SIGNihmc.us   http://www.ihmc.us/users/phayes






Re: Low Quality Data (was before Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices)

2010-10-22 Thread Kingsley Idehen

On 10/22/10 10:47 AM, Juan Sequeda wrote:

Martin and all,

Can somebody point me to papers or maybe give their definition of low 
quality data when it comes to LOD. What is the criteria for data to be 
considered low quality.


My Subjective Data Quality Factors:

1. Unambiguous Names -- resolvable URI-based names
2. Data Representation Format Dexterity -- HTTP + content negotiation,
which loosens the coupling between model semantics and data representation
3. Platform Agnostic Data Access -- HTTP delivers this well
4. Change Sensitivity -- speaks for itself, hopefully
5. Provenance -- data about the data (metadata) that helps establish
Who, What, When, Where, and ~ Why re. curation
6. Mesh Navigability -- inference context enables this ...

This is why I say: look at Data like a cube of sugar, especially when
trying to fashion Linked Data oriented business models. 1-6 nullify many
of the concerns about data-driven business models:

1. Wholesale Imports (crawls) that reconstitute data in a new data space
-- #1 allows you to brand your data; when combined with licensing it
also allows you to track conformance (remember, Web Architecture makes the
Web sticky via HTTP logs amongst other things, so entropy is your
friend, ultimately)

2. Attribution -- ditto

3. Data Consumer Identity -- WebID will put an end to API Keys (major
relics), so QoS based on quality factors #2-6 is absolutely plausible.




Kingsley


Thanks

Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com http://www.juansequeda.com


On Fri, Oct 22, 2010 at 9:01 AM, Martin Hepp 
martin.h...@ebusiness-unibw.org 
mailto:martin.h...@ebusiness-unibw.org wrote:


The Web of documents is an open system built on people
agreeing on standards
and best practices.
Open system means in this context that everybody can publish
content and
that there are no restrictions on the quality of the content.
This is in my opinion one of the central facts that made the
Web successful.

+100


The same is true for the Web of Data. There obviously cannot
be any
restrictions on what people can/should publish (including,
different
opinions on a topic, but also including pure SPAM). As on the
classic Web,
it is a job of the information/data consumer to figure out
which data it
wants to believe and use (definition of information quality =
usefulness of
information, which is a subjective thing).
+100


The fact that there is obviously a lot of low quality data on the
current Web should not encourage us to publish masses of
low-quality data and then celebrate ourselves for having achieved
a lot. The current Web tolerates buggy markup, broken links, and
questionable content of all types. But I hope everybody agrees
that the Web is successful because of this tolerance, not because
of the buggy content itself. Quite to the contrary, the Web has
been broadly adopted because of the lots of commonly agreed
high-quality contents.

If you continue to live the linked data landfill style it will
fall back on you, reputation-wise, funding-wise, and career-wise.
Some rules hold in ecosystems of all kinds and sizes.

Best

Martin





--

Regards,

Kingsley Idehen 
President  CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen







Re: Concordance, Reconciliation, and shared identifiers

2010-10-22 Thread Kingsley Idehen

On 10/22/10 11:47 AM, Leigh Dodds wrote:

Hi,

The announcement of that the Guardian has begun cataloguing other
identifiers (e.g. ISBN, Musicbrainz) within its API [1] is a nice
illustration that the importance of cross-linking between datasets is
starting to become more generally accepted. Setting aside the debate
about what constitutes linked data, I think its important that this
community tracks these various initiatives to help explore the
trade-offs between different approaches, as well as to build bridges
with the wider developer community.

A great project would be for someone to produce a Linked Data wrapper
for the Guardian API, that allows linking *in* to their data, based on
ISBNs and MusicBrainz ids. Its on my TODO list, but then so is a lot
of other stuff ;)


We've had sponger meta cartridges [1] for the Guardian API since its
early incarnations.

Anyone that uses URIBurner [2] ends up with a look-up pass through the
Guardian's API (amongst a boatload of others) en route to the final
URIBurner-generated Linked Data graph. URIBurner then pings PTSW, which
ultimately leads to data in the LOD Cloud Cache we maintain.

Sindice also does URIBurner lookups, and URIBurner also looks up Sindice
(sameas.org etc.).

End result: a dynamic Web of Linked Data. We stopped counting its size in
2007 :-)

Of course, others too should make wrappers for these APIs so that their
perspectives are expressed in the burgeoning Web of Linked Data.




If we look back a few months we can see signs of the importance of
cross-linking appearing in other projects. Google Refine (nee Freebase
Gridworks) has the notion of a reconcilication service that is used
to build and set links [2]. Yahoo meanwhile have their concordance
service [3, 4] which is basically a sameAs.org service for building
cross-links between geo data.

Again, it would be interesting to build bridges between different
communities by showing how one can achieve the same effects with
Linked Data, as well as integrating Linked Data into those services by
providing gateways services, e.g. implementing the same API but backed
by RDF. This is what I did for the Gridworks, but the same could be
extended to other services.


On our part, we've been doing so since Linked Data inception, and will 
continue to do so :-)


Links:

1. http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger -- 
Sponger

2. http://uriburner.com -- Virtuoso Sponger Service .


Cheers,

L.

[1]. http://www.guardian.co.uk/open-platform/blog/linked-data-open-platform
[2]. 
http://www.ldodds.com/blog/2010/08/gridworks-reconciliation-api-implementation/
[3]. 
http://blog.programmableweb.com/2010/04/05/yahoos-new-geo-concordance-a-geographic-rosetta-stone/
[4]. 
http://developer.yahoo.com/geo/geoplanet/guide/api-reference.html#api-concordance




--

Regards,

Kingsley Idehen 
President  CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen








Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices

2010-10-22 Thread Kingsley Idehen

On 10/21/10 11:56 PM, Martin Hepp wrote:

Hi all:

I think that Enrico really made two very important points:

1. The LOD bubbles diagram has very high visibility inside and outside 
of the community (up to the point that broad audiences believe the 
diagram would define relevance or quality).


True re. visibility.

Subjective quality bearer, I think not :-)



2. Its creators have a special responsibility (in particular as 
scientists) to maintain the diagram in a way that enhances insight and 
understanding, rather than conveying false facts and confusing people.


Methinks the creators executed on a marketing plan. Personally, I enjoy the
fact that people otherwise tagged as geeks have ended up demonstrating
potent marketing prowess, really.




So Kingsley's argument that anybody could provide a better diagram 
does not really hold. 

Uh?

I said: more diagrams, each addressing a specific realm of interest (and
bias). Here are some examples:

1. http://linkedopencommerce.com -- you know about this one clearly
2. http://www.mkbergman.com/wp-content/themes/ai3/images/2009Posts/090212_lodd_cloud.jpg
3. http://umbel.org/images/081010_lod_constellation.png -- UMBEL (TBox
oriented)
4. http://www.mquter.qut.edu.au/bio/bio2rdf.jpg -- Bio2RDF
5. http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/ClickableVirtSpongerCloud/sponger-cloud.png
-- Old Dynamic Linked Data Cloud via Sponger (URIBurner in current LOD
cloud).


It will harm the community as a whole, sooner or later, if the diagram 
misses the point, simply based on the popularity of this diagram.


How does a single diagram define a community? IMHO attacking those that 
have made contributions that have become popular says more about 
problems in our community.


We really have to make up our minds what we want here. Everyone is
entitled to their own biases (context lenses); that's the fundamental
beauty of the Web, especially the Linked Data variant that's rapidly
taking shape.

Until the Web stops us from projecting our individual biases, I stand by
my position, i.e., let a million LOD cloud variants rain :-)

I strongly encourage you to make an alternative pictorial that addresses
the areas that concern you. Of course, if there is language that may concern
you re. the report from Chris, then that's a different matter re. potentially
deeming the LOD cloud pictorial canonical.




And to be frank, despite other design decisions, it is really
ridiculous that Chris justifies the inclusion of Denny's numbers
dataset as valid Linked Data, because that dataset is, by design and
known to everybody in the core community, not data but noise.

Do you really believe that most PowerPoint viewers actually drill down
this deep into the pictorial?


When people ask me about the cloud, i.e., what it signifies etc., my
answer goes like this: Linked Data exists on the Web, and it is on a
discernible exponential curve (meaning critical mass, to VCs and suits).
I don't make any statements about subjective quality. If people ask:
how is this going to affect business models? I show them the Linked Open
Commerce cloud collection, and they get it, pronto!

Pronto! implying there's money to be made, but biz models remain fuzzy
since Linked Data QoS factors haven't really crystallized, due to the basic
Linked Data concept remaining mercurial to comprehend. Thus, you get
your typical PowerPoint effect: seed planted, hockey-stick potential is
sorta there, now let me go figure out how to make this
less-fuzzy-web-opportunity my own money-making reality etc.





This is the linked data landfill mindset that I have kept on 
complaining about. You make it very easy for others to discard the 
idea of linked data as a whole.


Come on!

Seriously, where would Linked Data be without the LOD cloud pictorial 
re. mindshare acquisition?


Let's just make more realm / bias specific pictorials and associated
data set analysis reports so that newly uncovered Linked Data dimensions
go viral like the LOD cloud. Thus, I can only agree with you once you've
taken corrective action :-)







Best

Martin






--

Regards,

Kingsley Idehen 
President  CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen








Please allow JS access to Ontologies and LOD

2010-10-22 Thread Nathan

Hi All,

Currently nearly all of the web of linked data is blocked from access by 
client-side scripts (JavaScript), due to CORS [1] being implemented in 
the major browsers: cross-origin requests are refused unless the server 
sends the right header.


Whilst this is important for all data, there are many of you reading 
this who have it in your power to expose huge chunks of the RDF on the 
web to JS clients. If you manage any of the common ontologies or 
anything in the LOD cloud diagram, please do take a few minutes out of 
your day to expose the single HTTP header needed.


Long story short, to allow JS clients to access our open data we need 
to add one small HTTP response header which will allow HEAD/GET and POST 
requests - the header is:

  Access-Control-Allow-Origin *

This is both XMLHttpRequest (W3C) and XDomainRequest (Microsoft) 
compatible and supported by all the major browser vendors.


Instructions for common servers follow:

If you're on Apache then you can send this header by simply adding the 
following line to a .htaccess file in the dir you want to expose 
(probably site-root):

  Header add Access-Control-Allow-Origin *

For NGINX:
  add_header Access-Control-Allow-Origin *;
see: http://wiki.nginx.org/NginxHttpHeadersModule

For IIS see:
  http://technet.microsoft.com/en-us/library/cc753133(WS.10).aspx

In PHP you add the following line before any output has been sent from 
the server with:

  header("Access-Control-Allow-Origin", "*");

For anything else you'll need to check the relevant docs I'm afraid.
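
For a plain Python WSGI application, for example, a minimal sketch of the
same idea looks roughly like this (adapt to whatever framework you use):

  # Minimal WSGI app that adds the CORS header to every response.
  def application(environ, start_response):
      body = b"<http://example.org/thing> a <http://example.org/Thing> .\n"
      headers = [
          ("Content-Type", "text/turtle"),
          ("Access-Control-Allow-Origin", "*"),  # the one header JS clients need
      ]
      start_response("200 OK", headers)
      return [body]

  if __name__ == "__main__":
      # Quick local test: run this file, then curl -I http://localhost:8000/
      from wsgiref.simple_server import make_server
      make_server("", 8000, application).serve_forever()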

Best & TIA,

Nathan

[1] http://dev.w3.org/2006/waf/access-control/



Re: [foaf-protocols] Please allow JS access to Ontologies and LOD

2010-10-22 Thread Melvin Carvalho
On 23 October 2010 01:04, Nathan nat...@webr3.org wrote:
 Hi All,

 Currently nearly all the web of linked data is blocked from access via
 client side scripts (javascript) due to CORS [1] being implemented in
 the major browsers.

 Whilst this is important for all data, there are many of you reading
 this who have it in your power to expose huge chunks of the RDF on the
 web to JS clients, if you manage any of the common ontologies or
 anything in the LOD cloud diagram, please do take a few minutes from
 your day to expose the single http header needed.

 Long story short, to allow js clients to access our open data we need
 to add one small HTTP Response header which will allow HEAD/GET and POST
 requests - the header is:
   Access-Control-Allow-Origin *

 This is both XMLHttpRequest (W3C) and XDomainRequest (Microsoft)
 compatible and supported by all the major browser vendors.

 Instructions for common servers follow:

 If you're on Apache then you can send this header by simply adding the
 following line to a .htaccess file in the dir you want to expose
 (probably site-root):
   Header add Access-Control-Allow-Origin *

 For NGINX:
   add_header Access-Control-Allow-Origin *;
 see: http://wiki.nginx.org/NginxHttpHeadersModule

 For IIS see:
   http://technet.microsoft.com/en-us/library/cc753133(WS.10).aspx

 In PHP you add the following line before any output has been sent from
 the server with:
    header("Access-Control-Allow-Origin", "*");

 For anything else you'll need to check the relevant docs I'm afraid.

+1

Thanks for the heads up.  I added:

Header add Access-Control-Allow-Origin *

to my .htaccess and everything worked fine.  Easy!  :)


 Best  TIA,

 Nathan

 [1] http://dev.w3.org/2006/waf/access-control/
 ___
 foaf-protocols mailing list
 foaf-protoc...@lists.foaf-project.org
 http://lists.foaf-project.org/mailman/listinfo/foaf-protocols




[Fwd: XRD 1.0 currently up for OASIS Standard vote]

2010-10-22 Thread Nathan
FYI, you should probably be aware of this... don't underestimate it either; 
just take a look at the To: list...


 Original Message 
Subject: XRD 1.0 currently up for OASIS Standard vote
Date: Fri, 22 Oct 2010 16:25:53 -0700
From: Will Norris w...@willnorris.com
Reply-To: activity-stre...@googlegroups.com
To: activity-stre...@googlegroups.com, gene...@lists.openid.net, 
bo...@lists.openid.net, sp...@lists.openid.net, 
oa...@googlegroups.com, 	oexcha...@googlegroups.com, 
portableconta...@googlegroups.com, 	salmon-proto...@googlegroups.com, 
diso-proj...@googlegroups.com, 	webfin...@googlegroups.com


(apologies up front for those that get multiple copies of this due to the
wide cross post)

For those that haven't been following the current status of XRD, I wanted to
ping you and let you know that it is currently being voted on for
consideration as an OASIS Standard.

As a quick recap, XRD is a simple format for describing resources, with one
immediate use being discovery of social web services.  It is a direct
evolution of XRDS (used for discovery in OpenID 2.0) and XRDS-Simple (used
in OAuth Discovery, Portable Contacts, et al).  XRD is currently in use as
the descriptor format powering WebFinger.  You can read the full spec at:
http://docs.oasis-open.org/xri/xrd/v1.0/xrd-1.0.html

I'm writing to these communities to encourage anyone whose company is an
OASIS member to find out who their OASIS representative is and encourage them
to vote. The ballot is open through October 31, so we only have just over a
week left.

Ballot:
http://www.oasis-open.org/apps/org/workgroup/voting/ballot.php?id=1955
(OASIS login required)

Thanks,
Will Norris

--
You received this message because you are subscribed to the Google 
Groups Activity Streams group.

To post to this group, send email to activity-stre...@googlegroups.com.
To unsubscribe from this group, send email to 
activity-streams+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/activity-streams?hl=en.






Re: Please allow JS access to Ontologies and LOD

2010-10-22 Thread Ian Davis
Hi Nathan,

I implemented this header on http://productdb.org/ (since I had the
code open). Can someone confirm that it does what's expected (i.e.
allows off-domain requesting of data from productdb.org)?

One important thing to note. The PHP snippet you gave was slightly
wrong. The correct form is:

header("Access-Control-Allow-Origin: *");

Cheers,

Ian


On Sat, Oct 23, 2010 at 12:04 AM, Nathan nat...@webr3.org wrote:
 Hi All,

 Currently nearly all the web of linked data is blocked from access via
 client side scripts (javascript) due to CORS [1] being implemented in the
 major browsers.

 Whilst this is important for all data, there are many of you reading this
 who have it in your power to expose huge chunks of the RDF on the web to JS
 clients, if you manage any of the common ontologies or anything in the LOD
 cloud diagram, please do take a few minutes from your day to expose the
 single http header needed.

 Long story short, to allow js clients to access our open data we need to
 add one small HTTP Response header which will allow HEAD/GET and POST
 requests - the header is:
  Access-Control-Allow-Origin *

 This is both XMLHttpRequest (W3C) and XDomainRequest (Microsoft) compatible
 and supported by all the major browser vendors.

 Instructions for common servers follow:

 If you're on Apache then you can send this header by simply adding the
 following line to a .htaccess file in the dir you want to expose (probably
 site-root):
  Header add Access-Control-Allow-Origin *

 For NGINX:
  add_header Access-Control-Allow-Origin *;
 see: http://wiki.nginx.org/NginxHttpHeadersModule

 For IIS see:
  http://technet.microsoft.com/en-us/library/cc753133(WS.10).aspx

 In PHP you add the following line before any output has been sent from the
 server with:
  header("Access-Control-Allow-Origin", "*");

 For anything else you'll need to check the relevant docs I'm afraid.

 Best  TIA,

 Nathan

 [1] http://dev.w3.org/2006/waf/access-control/





Re: Please allow JS access to Ontologies and LOD

2010-10-22 Thread Nathan

Hi Ian,

Thanks, I can confirm the change has been successful :)

However, one small note is that the conneg URIs such as 
http://productdb.org/gtin/00319980033520 do not expose the header, and thus 
can't be used.


In order to test yourself, simply do a curl -I request on the resource, 
for instance:


 curl -I http://productdb.org/gtin/00319980033520.rdf
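
Or, the same check as a small Python sketch (same URL as above; it just
prints the value of the header, which should be '*'):

  from urllib.request import Request, urlopen

  url = "http://productdb.org/gtin/00319980033520.rdf"
  resp = urlopen(Request(url, method="HEAD"))
  print(resp.headers.get("Access-Control-Allow-Origin"))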

Also, I've just uploaded a small script which lets you enter a URI of an 
RDF/XML document; it'll try to pull it, parse it and display it as 
Turtle for you - which is a good test of both CORS and the script ;)

  http://webr3.org/apps/play/api/test

FYI, Dan has also made the change so the FOAF vocab is now exposed to JS.

Best and thanks again,

Nathan

Ian Davis wrote:

Hi Nathan,

I implemented this header on http://productdb.org/ (since I had the
code open). Can someone comfirm that it does what's expected (i.e.
allows off-domain requesting of data from productdb.org)

One important thing to note. The PHP snippet you gave was slightly
wrong. The correct form is:

header("Access-Control-Allow-Origin: *");

Cheers,

Ian


On Sat, Oct 23, 2010 at 12:04 AM, Nathan nat...@webr3.org wrote:

Hi All,

Currently nearly all the web of linked data is blocked from access via
client side scripts (javascript) due to CORS [1] being implemented in the
major browsers.

Whilst this is important for all data, there are many of you reading this
who have it in your power to expose huge chunks of the RDF on the web to JS
clients, if you manage any of the common ontologies or anything in the LOD
cloud diagram, please do take a few minutes from your day to expose the
single http header needed.

Long story short, to allow js clients to access our open data we need to
add one small HTTP Response header which will allow HEAD/GET and POST
requests - the header is:
 Access-Control-Allow-Origin *

This is both XMLHttpRequest (W3C) and XDomainRequest (Microsoft) compatible
and supported by all the major browser vendors.

Instructions for common servers follow:

If you're on Apache then you can send this header by simply adding the
following line to a .htaccess file in the dir you want to expose (probably
site-root):
 Header add Access-Control-Allow-Origin *

For NGINX:
 add_header Access-Control-Allow-Origin *;
see: http://wiki.nginx.org/NginxHttpHeadersModule

For IIS see:
 http://technet.microsoft.com/en-us/library/cc753133(WS.10).aspx

In PHP you add the following line before any output has been sent from the
server with:
  header("Access-Control-Allow-Origin", "*");

For anything else you'll need to check the relevant docs I'm afraid.

Best  TIA,

Nathan

[1] http://dev.w3.org/2006/waf/access-control/