Re: [Wikidata] frequency of qualifier predicates

2018-07-16 Thread Vladimir Alexiev
The list of props (4.9k) is returned quickly enough. Unfortunately it
includes all props, since each one has a wikibase:qualifier "just in case":

select * {
  ?p wikibase:qualifier ?pq
}

It is a pity that this one times out, since the filter merely needs to
look for 1 statement instance, 4.9k times:

select * {
  ?p wikibase:qualifier ?pq
  filter exists {?x ?pq ?y}
} limit 100

What query did you try?
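
Following the idea in the quoted message below of crafting one query per predicate, the per-predicate counting could be sketched like this. It is only a minimal sketch: it assumes the ?pq IRIs have already been fetched with the first query, the helper name count_query is made up for illustration, and nothing in this thread confirms that each generated query finishes within the endpoint's time limit.

```python
def count_query(pq_iri: str) -> str:
    """Build a SPARQL query that counts uses of one qualifier predicate."""
    # Angle brackets let the full IRI be used without declaring a prefix.
    return f"SELECT (COUNT(*) AS ?n) {{ ?x <{pq_iri}> ?y }}"


# Assumed sample of ?pq values returned by the first query above.
qualifier_predicates = [
    "http://www.wikidata.org/prop/qualifier/P580",  # start time
    "http://www.wikidata.org/prop/qualifier/P582",  # end time
]

# One small query per predicate, to be sent to the endpoint one at a time.
queries = [count_query(pq) for pq in qualifier_predicates]
for q in queries:
    print(q)
```

Running many small queries trades one long scan for 4.9k short ones, so each can fail or be retried independently.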

On Sat, Jul 14, 2018 at 2:40 AM, Peter F. Patel-Schneider wrote:
> I'm trying to get a good estimate of how often which qualifier predicate is 
> used.
>
>
> The obvious query times out, as expected, so I was trying to find a list of
> predicates that are used as qualifiers so that I can craft a query for each of
> them.  There is
> https://www.wikidata.org/wiki/Wikidata:List_of_properties/Wikidata_qualifier
> but that can't be trusted as it doesn't include start time (P580) or end time
> (P582) which I expect to be the most common qualifier predicates.
>
>
> So, I'm stumped.   Any suggestions?
>
>
> peter
>
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata



-- 
Vladimir Alexiev, PhD, PMP
Lead, Data and Ontology Management
Ontotext Corp, www.ontotext.com
Email: vladimir.alex...@ontotext.com, skype:valexiev1
Mobile: +359 888 568 132, SMS: 359888568...@sms.mtel.net
Calendar: 
https://www.google.com/calendar/embed?src=vladimir.alex...@ontotext.com
Publications: http://vladimiralexiev.github.io/pubs/

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] How can I find out all food & cooking related articles

2018-06-30 Thread Vladimir Alexiev
> Dbpedia, Vladimir, is it possible to query if a wikipedia article is in a
> particular category(s) or not?

Sure. If I remember correctly, this is dct:subject. skos:broader makes the
cat hierarchy, but to filter out irrelevant cats, use our data.
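
The dct:subject and skos:broader predicates named above could be combined roughly as follows. This is a hedged sketch, not a tested DBpedia query: the helper name category_members_query and the category Food_and_drink are assumptions chosen for illustration, and the unbounded skos:broader* walk is exactly what pulls in irrelevant categories, which is why a filtered category set is recommended.

```python
def category_members_query(category: str, limit: int = 100) -> str:
    """Build a SPARQL query for articles in a category or its subcategories."""
    return (
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "PREFIX dbc: <http://dbpedia.org/resource/Category:>\n"
        "SELECT DISTINCT ?article {\n"
        # dct:subject links an article to a category; skos:broader* then
        # climbs zero or more steps up the category hierarchy.
        f"  ?article dct:subject/skos:broader* dbc:{category}\n"
        f"}} LIMIT {limit}"
    )


# Hypothetical category name, used only to show the generated text.
print(category_members_query("Food_and_drink", limit=10))
```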


Re: [Wikidata] How can I find out all food & cooking related articles

2018-06-22 Thread Vladimir Alexiev
Your best bet is to use the Wikipedia Categories. They are not available on
Wikidata, but are available on DBpedia.
We did a lot of work on this in the Europeana Food and Drink project.
We got a "FD classification" based on a filtered set of Wikipedia
categories:
http://efd.ontotext.com/data/#sec-3
You could find a bit more details (including presentations) in
http://vladimiralexiev.github.io/pubs/

Good luck!


Re: [Wikidata] RDF Dictionary / Thesaurus

2017-07-09 Thread Vladimir Alexiev
There is an rdf representation of wiktionary. Search for "linguistic linked
data" and LEMON

On Jul 9, 2017 11:31, "Amirouche" wrote:

>
>
> Le 09/07/2017 à 08:53, Timothy Holborn a écrit :
>
>> Hi Peter,
>>
>> Awesome.  Yes.  this is the sort of thing i was looking to leverage.  I
>> couldn't find the RDF output for wordnet.  FWIW: i find this useful
>> http://osds.openlinksw.com/
>>
>> Still v.interested to understand how we might further enhance what exists
>> in Wiki style; mind, such a project is too much for me to take-on alone.
>>
>
> I think that's the purpose of the 'wiktionary in wikidata' project. I
> started working on importing data from wiktionaries but stopped for some
> reasons.. Otherwise said, it's not readily available.
>
>
>> thank you.  i'll put the reference to immediate use ;)
>>
>> Tim.
>>
>> On Sun, 9 Jul 2017 at 16:46 Peter Haase <p...@metaphacts.com> wrote:
>>
>> Hi Timothy,
>>
>> have you looked at WordNet and its RDF version?
>> http://wordnet.princeton.edu
>> http://wordnet-rdf.princeton.edu
>>
>> Here is your example “identity”:
>> http://wordnet-rdf.princeton.edu/wn31/identity-n
>>
>> Cheers,
>> Peter
>>
>> On 9. Jul 2017, at 06:18, Timothy Holborn <timothy.holb...@gmail.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>> I was working on the term 'identity' with respect to internet
>>> stuff; and thereafter started looking for an RDF source for an
>>> english thesaurus or dictionary; and couldn't find one. I found
>>> https://en.wiktionary.org/wiki/Wiktionary:Main_Page but it didn't
>>> seem to have well-formed RDF output; as to act as an ontological
>>> source (rather than simply the use of RDF for SEO).
>>>
>>> thereafter started writing; this is where i got up to,
>>>
>>> Project Purpose
>>> To generate an RDF compliant dictionary and thesaurus for the
>>> purpose of ontological reuse on the web.
>>>
>>> PROBLEM
>>> We use language to develop web-pages that have inferred human
>>> considered meaning. Yet, the definition of these terms are not
>>> necessarily machine readable.
>>>
>>> For Example:  "identity".
>>>
>>> When working on 'digital identity' this is often considered to
>>> have the meaning of how people log-in to their personal accounts
>>> or means in which to interact with their personal data; or that
>>> of others.   HOWEVER, identity can also mean 'sameness'; which
>>> can also be useful for organisations such as website operators to
>>> say 'these people have one of my website identities' that is to
>>> say, they're all consumers.
>>>
>>> http://www.dictionary.com/browse/identity
>>>
>>> This can be further clarified by looking at the different
>>> meanings provided to the same word via a thesaurus:
>>> http://www.thesaurus.com/browse/identity
>>>
>>> I thereafter looked for a way in which a statement of exactness
>>> could be made via RDF; but couldn't find an appropriate RDF
>>> dictionary resource.
>>>
>>> SOLUTION
>>> Build an online dictionary and thesaurus that is
>>> machine-readable.  It makes sense that this may best be done with
>>> wiki technology.
>>>
>>> FEATURES
>>> - The project would firstly focus on the lexicography of the
>>> english language and related dialects. This is expected to
>>> include works in adding latin predicates.
>>> - The project would produce a comprehensive thesaurus, including
>>> unique identifiers for different uses of the same term
>>> (supporting a comprehension of the differentiation in the use of
>>> that term).
>>> - The project would produce a platform that provided RDF output
>>> in a number of serialisations.
>>> - Would provide the means for people to add / edit content on the
>>> site.
>>>
>>> PRODUCTION METHOD
>>> It is hoped the site can be rapidly populated using scripts to
>>> ingest existing information from freely available sources; and to
>>> populate the system with information in an RDF compliant format;
>>> that may be altered, edited, updated in a ‘wiki’ like fashion.
>>>
>>> USES
>>> For the communication of specific concepts in a manner that may
>>> be further clarified by both human and machine observers; as to
>>> ensure parties are communicating and/or developing works upon a
>>> basis of common understanding of the meaning provided to the
>>> language used.
>>>
>>> I had concerns that the WikiData site seemed to be better
>>> orientated towards the concept of schema.org/thing rather than a
>>> 'language' or other form
>>> of predicate. Please let me know your thoughts? Perhaps i've
>>> missed something entirely and this exists already?  Perhaps
>>> people have been thinking about it elsewhere?  perhaps barriers
>>> exist, that i'm not aware of...
>>>
>>> Timothy Holborn.

Re: [Wikidata] [wikicite-discuss] Re: Wikipedia Task Lists for Editathons using Wikidata

2017-03-22 Thread Vladimir Alexiev
>> Listeria makes auto-updating lists on pages, based on a SPARQL query. 
>> The question is whether you have a common characteristic to catch all your 
>> items, since eg Category is not it (no such prop on WD).
> I'm confused about the example that the category is not a property of 
> Wikidata. Is it not a query-able property in SPARQL to generate this type of 
> output? 

Right: there’s no property Category.  See these discussions
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Archive/30#category
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Archive/30#useful_for 

> After creating the Wikidata item for Visual artists of the African diaspora,
> I started adding that category to artist Wikidata items, as well as a Commons
> category if they had media.
> So if I run a SPARQL query using category it won't generate results?

Did you add it as “item’s main category”? That’s incorrect, since that’s the
inverse of “category’s main item”,
and a category is supposed to have a single main item (e.g. page France vs
category France).

> I'm confused because the other suggestion was to tag items as of interest to 
> Black Lunch Table. The category seems to be functioning in the same way, not 
> very different. 

How did you tag “of interest to”?

> One other question: is the task list on Listeria usable on Wikipedia pages, 
> or does it need to live in Wikidata's area?

You can put it on any wiki page, eg a Wikidata discussion or project page.




Re: [Wikidata] Wikipedia Task Lists for Editathons using Wikidata

2017-02-08 Thread Vladimir Alexiev
Hi Sandra/Spinster!

I support your request in the Wishlist Survey 
https://meta.wikimedia.org/wiki/2016_Community_Wishlist_Survey/Categories/Programs_and_events#Article_tracking_tool_for_Wikiprojects.2C_edit-a-thons_and_other_campaigns.2C_based_on_Wikidata.
 

However, it's 87th in the list and no ticket assigned yet, so I don't know 
how long it'd take to implement.

I proposed a simple property Category and/or “Useful for / interesting to”
to allow tracking of items by projects, but it was rejected:
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Archive/30#category.
Should we reopen this discussion?


Re: [Wikidata] [wikicite-discuss] Re: Wikipedia Task Lists for Editathons using Wikidata

2017-02-08 Thread Vladimir Alexiev
Listeria makes auto-updating lists on pages, based on a SPARQL query. 
The question is whether you have a common characteristic to catch all your 
items, since eg Category is not it (no such prop on WD).


Re: [Wikidata] [wikicite-discuss] Re: Wikipedia Task Lists for Editathons using Wikidata

2017-02-08 Thread Vladimir Alexiev
> Is there another property that could be substituted instead on Wikidata?

Unfortunately no.

> the categories seemed so logical. 

It IS logical to use an existing category for this purpose.

But the WD community has rejected my proposal 
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Archive/30#category 
(and the previous one “useful for / interesting to”) because:
- categories are a bit messy, since many people use them for various 
purposes
- synchronizing cat-item assignments between WD and Wikipedias would be a 
major hassle.

They even rejected having such a property that is NOT synced with Wikipedias,
for project purposes like
yours (Black Lunch Table) and mine (Europeana Food and Drink, European
Holocaust Research Infrastructure).

WD does have a bunch of over-specific categories, e.g. “people 
born/died/buried here”. See
https://tools.wmflabs.org/sqid/#/browse?activepage=1&type=properties&lang=en&propertylabelfilter=category
 

Does anyone think we should reopen the discussion for creating props 
Category and/or InterestingTo?



[Wikidata] Using WD categories to mark items

2017-02-07 Thread Vladimir Alexiev
> Is there another property that could be substituted instead on Wikidata?

Unfortunately no.

> the categories seemed so logical.

It IS logical to use an existing category for this purpose.

But the WD community has rejected my proposal
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Archive/30#category
(and the previous one “useful for / interesting to”) because:
- categories are a bit messy, since many people use them for various purposes
- synchronizing cat-item assignments between WD and Wikipedias would be a major
hassle.

They even rejected having such a property that is NOT synced with Wikipedias,
for project purposes like yours (Black Lunch Table) and mine (Europeana Food
and Drink, European Holocaust Research Infrastructure).

WD does have a bunch of over-specific categories, e.g. “people
born/died/buried here”. See
https://tools.wmflabs.org/sqid/#/browse?activepage=1&type=properties&lang=en&propertylabelfilter=category

Does anyone think we should reopen this discussion?



Re: [Wikidata] whitepaper on the Belgian museums' Wikidata project

2015-11-18 Thread Vladimir Alexiev
Let’s retweet it! https://twitter.com/valexiev1/status/666994200307752960

Manuel Palomo Duarte> Nice work, I'll reference it in my lectures.

If they're in English, where are they?




Re: [Wikidata] Naming Conventions for URIs

2015-09-04 Thread Vladimir Alexiev
Paul Houle> The production for a QName  cannot begin with a number

I think you're reading some very old specs.

David Booth> this particular issue has 
> been fixed in SPARQL 1.1 and Turtle 1.1, last I checked not all tools 
> had been upgraded to those specs

I think all the important tools were upgraded a long time ago.
http://vocab.getty.edu has used numeric local names for 2 years now, and I
haven't heard any complaints.
Dotted local names used to be a problem 3-4 years ago (since dot is "end of
pattern" in Turtle and SPARQL), but I think the tools have fixed that as well.

You still can't use slash & hash in local names though (afaik CURIEs can do 
that), so instead we use dash. Eg
   aat_source:18469-subject-43956
is aat_source:18469 as applied to "subject" (concept) aat:43956
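
The dash convention above could be composed and taken apart along these lines. This is only an illustrative sketch: the function names are hypothetical and not part of any Getty tooling, and it assumes the three parts themselves never contain a dash.

```python
from typing import Tuple


def source_subject_local_name(source_id: int, subject_id: int) -> str:
    """Join a source ID and a subject (concept) ID using dashes."""
    # Dashes stand in for the slash or hash that local names cannot carry.
    return f"{source_id}-subject-{subject_id}"


def split_local_name(local_name: str) -> Tuple[int, str, int]:
    """Split a dash-joined local name back into its three parts."""
    source_id, kind, target_id = local_name.split("-")
    return int(source_id), kind, int(target_id)


name = source_subject_local_name(18469, 43956)
print("aat_source:" + name)
print(split_local_name(name))
```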

Austin William Wright > @base 
> doesn't really make any sense to have the trailing "I" on the @base,
> since that's going to get wiped out during the URI Reference resolution 
> process,

I think this is false.

Cheers!




Re: [Wikidata] Descriptions

2015-08-20 Thread Vladimir Alexiev
> The case is made often that descriptions as they exist are evil. They are 
> atrocious 
> Why do we not get rid of all that rubbish. [and replace with]
> Automated descriptions … can easily be improved upon in two ways ..

I agree in general, except for items that don’t have much data, e.g. a person’s
life years, or that have too much data that can’t be selected easily, e.g. 10
occupations of which only 1 is really notable.
For people, I mostly copy the description from Getty ULAN: that’s very good,
even if the life years are unknown (thus set too wide, or missing).

So my point is, there should also be an algorithm to decide whether to replace 
the manual description.

Why people invest time in writing “rubbish”: because there’s no worse 
description than a missing description. 
Most everything should have an EN description, to allow a user to understand 
what that is, esp in an auto-complete list.
Even a very bad description usually allows that.
 




Re: [Wikidata] how to get WD item -> enwiki sitelinks?

2015-08-07 Thread Vladimir Alexiev
> DBpedia dumps are based on Wikidata dumps from March

Here is the drift from DBP from Mar to current WD:
  5882410 WDid-DBP.ttl
  6255943 WDid-WD.ttl
   763951 differences  12.7%
   195209 removed lines
   568742 added lines
If anyone cares, I can put up the diff file somewhere.

"Removals" include
- redirects, like !Bang! -> Funking_Conservatory
- renaming, e.g. $O$ -> $O$_(Die_Antwoord_album)
  Or "It"_–_The_Album -> "It"_the_Album
- renaming by removing qualifier, e.g. Centrify_(software) -> Centrify
- changing Q number, e.g. "Babbacombe"_Lee Q4540274 -> Q1401456

Looking at the file confirms what Dimitris implied: changes are the result of 
normal editorial actions.
Both files include Template: and Category: pages.

But 12.7% changes of page titles or Q numbers in 3-4 months is a lot!
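
Counts like the "removed lines" and "added lines" above could be reproduced with a set comparison along these lines. This is a rough sketch under assumptions: the "<Q number> <page title>" line layout is a guess at the file format, Q_X and Q_Y are placeholder IDs, and a set-based comparison ignores duplicate lines, so it may not match a line-by-line diff exactly.

```python
def line_drift(old_lines, new_lines):
    """Return (removed, added) line counts between two dumps."""
    old, new = set(old_lines), set(new_lines)
    # Lines only in the old file count as removed, lines only in the new as added.
    return len(old - new), len(new - old)


# Toy data: a changed Q number and a rename taken from the examples above,
# plus one unchanged line.
old_dump = ['Q4540274 "Babbacombe"_Lee', "Q_X Centrify_(software)", "Q_Y Unchanged_page"]
new_dump = ['Q1401456 "Babbacombe"_Lee', "Q_X Centrify", "Q_Y Unchanged_page"]

removed, added = line_drift(old_dump, new_dump)
print(removed, added)

# The reported totals are consistent: differences = removed + added.
assert 195209 + 568742 == 763951
```

Each rename or Q number change shows up twice, once as a removed line and once as an added line.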





Re: [Wikidata] how to get WD item -> enwiki sitelinks?

2015-08-07 Thread Vladimir Alexiev
> DBpedia dumps are based on Wikidata dumps from March
> If you care to give it a try you can try running the extraction framework 
> with "wikidata" and use only the "WikidataSameAsExtractor" extractor.

We’ll do that, and I'll diff Magnus' output to check for drift between WD and 
DBP.
There shouldn't be any since all lang links are sourced from WD, but just in 
case...

On the other hand, there is drift between WD items and WP articles.
Some of it "legitimate" (e.g. I added item for Europeana Food and Drink project 
but I wouldn't dare write an article).
Afaik, the Duplicity tool helps people work through this drift.




Re: [Wikidata] how to get WD item -> enwiki sitelinks?

2015-08-07 Thread Vladimir Alexiev
> http://tools.wmflabs.org/wikidata-todo/static/item2enwiki.20150805.gz
> http://dumps.wikimedia.org/other/wikidata/

Thanks everyone for being so helpful!!




[Wikidata] how to get WD item -> enwiki sitelinks?

2015-08-05 Thread Vladimir Alexiev
Hi folks!

We need a full mapping of WD item -> enwiki sitelinks.

1. We extracted from DBpedia 2015-04 all statements of the form
  
 
And the count is 5882410 

2. Checked with WDQ:
https://wdq.wmflabs.org/api?q=link[enwiki]&noitems=1
"items":6263098
6.08% are missing from DBpedia. That's a lot.

How to get them from Wikidata?

3. WDQ doesn't seem to return sitelinks.
https://wdq.wmflabs.org/api?q=link[enwiki]&props=enwiki returns just item
numbers

4. The SPARQL endpoint doesn't seem to have them:
http://wdqs-beta.wmflabs.org/ 

prefix schema: <http://schema.org/>
select * {?x schema:about ?y}

returns nothing.
https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#WDQS_data_differences
says "5. Depending on the instance of the service, multi-language labels and
sitelinks may or may not be supported."
I think this service doesn't have sitelinks: is there one that has them?
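
The 6.08% figure in point 2 follows from the two counts in points 1 and 2. As a small worked check (the variable names are only illustrative, and it assumes the percentage is taken relative to the WDQ count):

```python
dbpedia_count = 5882410  # statements extracted from DBpedia 2015-04 (point 1)
wdq_count = 6263098      # items with an enwiki link according to WDQ (point 2)

# Items that WDQ reports but the DBpedia extract lacks.
missing = wdq_count - dbpedia_count
missing_pct = 100 * missing / wdq_count

print(missing, f"{missing_pct:.2f}%")
```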


