Re: [Wikidata-l] DBpedia usage in the bbc

Michael Smethurst Wed, 04 Jul 2012 06:41:20 -0700

On 04/07/2012 10:48, "Denny Vrandečić" <denny.vrande...@wikimedia.de> wrote:

> Hello Michael,
> 
> thank you for your input, this is extremely valuable.
> 
> In general I expect that Wikidata will serve your needs better than an
> extraction from Wikipedia could. First, yes, we will have more stable
> identifiers. Second, it should be better at identifying items of
> interest. Some of the reasons why several meanings are conflated into
> one article or spread over several articles in Wikipedia is that it
> simply makes sense for a text encyclopedia. I don't see a reason for
> Wikidata doing the same.
> 
> I do not expect Wikidata to solve all problems. In some glorious
> future, Wikidata will have a community. This community will decide on
> criteria for inclusion, both with regards to the coverage of items and
> with regards to what they are saying about them. The community will
> decide on the kind of sources they accept. Etc.
> 
> (Actually, "decide" is too nice a word for the process I expect will unfold...)
> 
> We will keep the problems you mentioned in mind, and I fully think
> that we will improve on every single one of them.

Look forward to seeing it unfold :-)
> 
> 2012/7/3 Michael Smethurst <michael.smethu...@bbc.co.uk>:
> 
>> So I think we'd be interested in wikidata for 2 (maybe 3) reasons:
>> 1. as a source of data for domains where there's no established (open)
>> authority (eg the equivalent of musicbrainz for films)
>> 2. as a better, more stable source of identifiers to triangulate to other
>> data sources
> 
> Yes, I expect that both use cases will be covered by Wikidata.
> 
>> ?3?. Possibly as a place to contribute some of our data (eg we're
>> donating our classical music data to musicbrainz; there may be data we have
>> that would be useful to wikidata)
> 
> It will be up to the community to accept data donations -- the
> development team does not speak for the community.

Yes, that goes for musicbrainz too. We can offer data but it's up to the
community whether or not they accept it

> Personally I would
> be thrilled to see such donations happen. See also:
> 
> <http://meta.wikimedia.org/wiki/Wikidata/FAQ#I_have_a_lot_of_data_to_contribute._How_can_I_do_that.3F>
> 
>> Have glanced quickly at the proposed wikidata uri scheme
>> (http://meta.wikimedia.org/wiki/Wikidata/Notes/URI_scheme#Proposal_for_Wikidata) and
>> <snip>
>> http://{site}.wikidata.org/item/{Title} is a semi-persistent convenience URI
>> for the item about the article Title on the selected site
>> Semi-persistent refers to the fact that Wikipedia titles can change over
>> time, although this happens rarely
>> </snip>
>> Not sure on the definition of infrequently but I know it's caused us
>> problems.
> 
> Fully agree. But they make for nice looking URIs.

Aesthetic concerns about uris tend to make me shiver :-)

> The canonical URI
> though is the ID-based one, and these are stable. The pretty ones are
> for convenience only. I will take a look at the note to see if this
> needs to be made more explicit.

Think it is explicit. It's just that there are so many flavours of URI knocking
about it feels a bit confusing. The separation of the human readable and the
machine readable feels like it's following the dbpedia design pattern and
conflating the NIR > IR step with the content negotiation which feels (to
me) like a mistake.

Have talked about this in the past on the LOD list so to save typing:
http://lists.w3.org/Archives/Public/public-lod/2012Mar/0337.html

Not sure putting /data in a URI is ever a good idea. Shouldn't whether you
want data or not be decided by your Accept headers? Same for ?format=json
etc.

For reference we use hash URIs for things but only reference those in RDF
and never link to them. One information resource URI gets exposed in links /
the browser bar and does content negotiation for format (and eventually
language), and the response comes with a Content-Location header of the IR URI
dot the_format.
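
To make that pattern a bit more concrete, here is a minimal sketch in Python. It is
an illustration only, not our actual implementation: the FORMATS table, the function
names and the example.org URIs are all assumptions made up for this example. It
strips the fragment from a hash URI to get the IR URI, picks a format from the
Accept header, and reports the format-specific URI in Content-Location.

```python
from urllib.parse import urldefrag

# Hypothetical media type -> URI suffix table; not taken from any real system.
FORMATS = {
    "text/html": "html",
    "application/rdf+xml": "rdf",
    "application/json": "json",
}


def ir_uri_for(thing_uri):
    """Drop the fragment of a hash URI to get the information resource URI."""
    return urldefrag(thing_uri).url


def parse_accept(header):
    """Return acceptable media types from an Accept header, best q-value first."""
    prefs = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            prefs.append((-q, position, media_type))
    return [media_type for _, _, media_type in sorted(prefs)]


def negotiate(ir_uri, accept_header, default="text/html"):
    """Choose a representation for one IR URI and build the response headers.

    Wildcards such as */* are not expanded in this sketch; they fall
    through to the default format.
    """
    chosen = default
    for media_type in parse_accept(accept_header):
        if media_type in FORMATS:
            chosen = media_type
            break
    return {
        "Content-Type": chosen,
        "Content-Location": ir_uri + "." + FORMATS[chosen],
        "Vary": "Accept",
    }
```

So a request for the one IR URI with an Accept header preferring RDF/XML would get
RDF back with a Content-Location ending in .rdf, and a browser would get HTML from
the same URI, with no /data or ?format= variants needed in links.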



> 
>> Wondering if the id in http://wikidata.org/id/Q{id} is the wikipedia row ID
>> (as used by dbpedialite)? Also wondering why there's a different set of URIs
>> for machine-readable access rather than just using content negotiation?
> 
> No it is not. There is no such thing as the "wikipedia row ID", what
> you mean is the "page ID on the English Wikipedia".

Ah, ok. Think someone once said that was the id of the underlying database
row of the page record. Looking at dbpedialite, it seems it only supports
en.wikipedia.

> As there are
> plenty of items that have articles only in Wikipedia other than
> English, a reliance on the English Page ID would be problematic. We
> introduce new IDs for Wikidata, but we will provide mappings to page
> IDs in the different Wikipedia language editions.

Cool. Those mappings would be very useful for us. We're using Wikiminer (
https://secure.wikimedia.org/wikipedia/meta/wiki/WikiMiner) for entity
extraction on archive media which also returns the page ID so some systems
only know that ID. Be good to be able to query wikidata by it
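
Roughly the kind of lookup I have in mind, sketched in Python. This assumes nothing
about how Wikidata will actually expose the mappings: the table, the function names
and every identifier in it are made-up placeholders.

```python
# All identifiers below are made-up placeholders, not real Wikidata or Wikipedia IDs.
PAGE_ID_TO_ITEM = {
    ("en.wikipedia", 1001): "Q-EXAMPLE-1",
    ("de.wikipedia", 2002): "Q-EXAMPLE-1",  # same item, different language edition
    ("fr.wikipedia", 3003): "Q-EXAMPLE-2",  # item with no English article
}


def item_for_page(site, page_id):
    """Return the item ID mapped to a page ID on one site, or None if unknown."""
    return PAGE_ID_TO_ITEM.get((site, page_id))


def pages_for_item(item_id):
    """Return every (site, page_id) pair that maps to the given item ID."""
    return sorted(key for key, value in PAGE_ID_TO_ITEM.items() if value == item_id)
```

Keying on the (site, page ID) pair rather than the page ID alone matters because
page IDs are only meaningful within one language edition.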
> 
> Thank you again for your input, and I hope the answers help.

Yes, thanks
michael
> 
> Cheers,
> Denny
> 
> _______________________________________________
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l


