[Wikidata] Re: [Wikitech-l] Re: [BREAKING CHANGE ANNOUNCEMENT] Wikidata Query Service graph split available in production; scholarly entity queries require migration by March 2025

2024-10-02 Thread Denny Vrandečić
The most convincing demonstration of "we are ready" for the Virtuoso Open
Source edition would be what QLever has been doing for a while: providing a
public endpoint with the data loaded and kept up to date using the public
edit stream. That would be an undeniably strong argument for "just use
this!"


On Thu, Sep 26, 2024 at 4:32 PM Samuel Klein  wrote:

>
> An updated benchmark eval (or self-evals) for the top db candidates seems
> called for :)  we would all love to see it.
>
> Ideally with a canonical hardware or vm spec that all can use...
>
> I don't know if the QLever self-eval from the spring is the right place to
> start, but I believe these
>  are the
> ~300 queries they used for a Wikidata benchmark.
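>
> A minimal sketch of what such a self-eval harness could look like, assuming
> a plain-text file with one SPARQL query per line and a configurable
> endpoint URL (both placeholders):
>
> # Minimal sketch: time a list of SPARQL queries against an endpoint.
> # The file name and endpoint are placeholders; point them at the setup
> # under test.
> import time
> import requests
>
> ENDPOINT = "https://query.wikidata.org/sparql"  # or a QLever/Virtuoso instance
> QUERY_FILE = "benchmark_queries.txt"            # one query per line
>
> with open(QUERY_FILE, encoding="utf-8") as f:
>     queries = [q.strip() for q in f if q.strip()]
>
> for i, query in enumerate(queries, 1):
>     start = time.monotonic()
>     resp = requests.get(
>         ENDPOINT,
>         params={"query": query},
>         headers={"Accept": "application/sparql-results+json"},
>         timeout=300,
>     )
>     elapsed = time.monotonic() - start
>     print(f"query {i}: HTTP {resp.status_code} in {elapsed:.2f}s")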
>
> 🌍🌏🌎🌑
> ___
> Wikidata mailing list -- wikidata@lists.wikimedia.org
> Public archives at
> https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/YJOUPRZPOIS6QMAPUJVYM65IQOM3DKIP/
> To unsubscribe send an email to wikidata-le...@lists.wikimedia.org
>
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/E4POSDWVJ27GSOFTRWDKS3HAQOG6FO3Y/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: [Wikitech-l] Re: [BREAKING CHANGE ANNOUNCEMENT] Wikidata Query Service graph split available in production; scholarly entity queries require migration by March 2025

2024-09-24 Thread Denny Vrandečić
If my memory serves me well, the Open Source version of Virtuoso doesn't
have certain scalability features that would be necessary to run a graph as
large and dynamic as Wikidata's. Is this information out-of-date?

Cheers,
Denny

On Tue, Sep 24, 2024 at 10:35 PM Kingsley Idehen via Wikidata <
wikidata@lists.wikimedia.org> wrote:

> Hi Everyone,
> On 9/6/24 11:46 AM, Samuel Klein wrote:
>
> On Fri, Sep 6, 2024 at 4:52 AM Luca Martinelli [Sannita@WMF] <
> sann...@wikimedia.org> wrote:
>
>> no "magic solution" exists, each comes with its load of problems and
>> costs
>
>
> Given the reload speed, approach to more continuous updating, and recent
> performance benchmarks from the page you referenced, QLever seems pretty
> magical.  [it was less so when the initial evaluation of backend
> alternatives came out]  It's also cheap enough to run at home that some
> people are scratching their own itch now when they have queries that time
> out on WDQS, as Peter highlights.
>
> Iterating on that benchmark until no one has any concerns with
> its applicability to our use case seems like a short-term high-return
> investment.    SJ
>
>
> Is there a place where SPARQL URLs for the various benchmarks are
> collated? SPARQL makes transparency dead easy via SPARQL URLs.
>
> I am also very interested in what "numerous hurdles" is supposed to imply
> regarding Virtuoso when its installation boils down to:
>
> 1. Run an installer (you don't even need to build its Open Source Edition
> binary).
>
> 2. Start server
>
> 3. Start interacting with SPARQL via the instance endpoint.
>
> --
> Regards,
>
> Kingsley Idehen   
> Founder & CEO
> OpenLink Software
> Home Page: http://www.openlinksw.com
> Community Support: https://community.openlinksw.com
> Weblogs (Blogs):
> Company Blog: https://medium.com/openlink-software-blog
> Virtuoso Blog: https://medium.com/virtuoso-blog
> Data Access Drivers Blog: 
> https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
>
> Personal Weblogs (Blogs):
> Medium Blog: https://medium.com/@kidehen
> Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
>   http://kidehen.blogspot.com
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn: http://www.linkedin.com/in/kidehen
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
> : 
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
> ___
> Wikidata mailing list -- wikidata@lists.wikimedia.org
> Public archives at
> https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/2OLO2COIRW5XAVRQRX7VI4ODKTM72X3K/
> To unsubscribe send an email to wikidata-le...@lists.wikimedia.org
>
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/OW6ERDNO6DENMUPAQLTAJKKB3PSFZ7WG/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: web reference

2023-07-28 Thread Denny Vrandečić
After reading both threads, my understanding is that Dr Lemaire does not
want to have the second last name attached to her identity.

The problem, as I read it, is not that she wants everything deleted; it is
fine if she does. But she certainly does not want the other last name, in
any combination, to appear with her name.

I acted accordingly and removed the other last name. As can be seen in Dr
Lemaire's papers, she has consistently used the name "Marie-Claude Lemaire"
and there is no reason not to respect that request.

What I didn't look into is the situation regarding the award, and whether she
is a nuclear physicist or a particle physicist. Since there is no other person
in Wikidata with the name "Marie-Claude Lemaire", describing her simply as a
physicist is sufficient too.

Please everyone, remember to be kind, in particular when a person in
visible distress is reaching out to the community. I am not going to list
the instances of unkindness in our response to Dr Lemaire's plea, but I
think they are easy to find. It can easily happen that a person who is not
part of our community misuses terminology, or does not clearly express
herself. It is our job to be extra kind in these cases, and to try our best to
help.

I hope that the page now is in a state Dr Lemaire is content with.

I want to extend an apology to her, although I am not sure she will read this
message.

Best regards,
Denny Vrandečić



On Fri, Jul 28, 2023 at 1:41 PM Vi to  wrote:

> This mailing list is public, so I suggest everybody keep in mind that
> public mailing lists shouldn't be used to deal with sensitive matters.
>
> Vito
>
> On Fri, 28 Jul 2023 at 22:09, Marie-Claude Lemaire <
> mclemai...@free.fr> wrote:
>
>> No, I do not want it to be restored. I fought all my career to be only
>> Marie-Claude Lemaire. I have now been retired for 18 years, and it is very
>> surprising that after 18 years of retirement Marie-Claude Mallet-Lemaire
>> reappears. I am followed by a physician for my psychological problems. I do
>> not want to be obliged to increase the number of pills I have to swallow. I
>> have been hospitalized several times for this stress.
>>
>> > Le 28 juil. 2023 à 21:31, Yaroslav Blanter  a écrit :
>> >
>> > If this lady wants her name updated on Google, she would be much better
>> > off if the page is kept, with her preferred name.
>> >
>> > That way Google will eventually note the page has been updated, and
>> > update their records.
>> >
>> > If the page is deleted, Google will never get that trigger, and the
>> name
>> > she doesn't like presumably will *never* be updated over on Google.
>> >
>> > Therefore Yaroslav I suggest you restore the page.
>> >
>> > Best regards,
>> >
>> >James
>> >
>> > If this is indeed the case, this could be a good reason to restore the
>> page at least temporarily. Why don't we wait a couple of days and see
>> what happens? (Again, any administrator may restore the page, but this
>> will likely have consequences - not from my side).
>> >
>> > Best
>> > Yaroslav
>> > ___
>> > Wikidata mailing list -- wikidata@lists.wikimedia.org
>> > Public archives at
>> https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/ZHGTJINE2KPGNIRDKXWGGKXLAYKECPL3/
>> > To unsubscribe send an email to wikidata-le...@lists.wikimedia.org
>> ___
>> Wikidata mailing list -- wikidata@lists.wikimedia.org
>> Public archives at
>> https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/HXNR5UK57JGATXA2FIBLBI3YM5RWOCS3/
>> To unsubscribe send an email to wikidata-le...@lists.wikimedia.org
>>
> ___
> Wikidata mailing list -- wikidata@lists.wikimedia.org
> Public archives at
> https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/VOYZEZT2SEBQPQM7CITSKBOGKZOSRL2Y/
> To unsubscribe send an email to wikidata-le...@lists.wikimedia.org
>
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/EB5LUFXEDWPZ3PGJUA3ZMEJXD2KZX4C6/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: A Wikidata App?

2022-02-03 Thread Denny Vrandečić
I think it would be great to have a mobile app for Wikidata.

On Thu, Feb 3, 2022 at 10:59 AM geislemx 
wrote:

> Hey all,
>
> I hope this mail finds you well in these trying times.
> Over the last month I invested some time and put a little project
> together for my own purposes. Long story short, it is a small
> Wikidata/Wikibase app for Android.
>
> Currently it has capabilities similar to the Termbox of the web frontend -
> adding a new Item, searching for an Item, editing an Item, etc.
> The app is built in a manner that should make it relatively easy to adapt
> for any instance of Wikibase.
> Everything is written in Kotlin[1], and the data layer is forged in a way
> that facilitates Kotlin Multiplatform[2], so that, after turning some
> screws, it can support a given platform. This means writing an app for iOS
> with similar capabilities, or a desktop app, would be much easier. I do not
> have stable access to a Mac myself, otherwise I would have made an iOS app
> right away as well.
>
> However, the user journey is still bumpy, since parts are missing, like
> loading bars and so on. Also, the app currently only allows users to
> proceed without login. Multi-language support of the UI is missing as well,
> and the design needs a lot more love (like a proper color system for
> theming). Summa summarum, please consider it an early alpha or a very,
> very late prototype.
>
> I made a little screencast, which you can find in the endnotes[3]. If
> you wish to try it out yourself, you can go ahead[4][5]. But please be
> aware that even though the packages are tightly tested, I currently have
> only one real device to test it on, and emulators do not always tell the
> truth, so expect app crashes (even though there should be none). Also, the
> app is currently hooked up to the test instance of Wikidata[6], so anything
> you do will not be propagated to the main instance of Wikidata, and you
> will not be able to retrieve data from there.
>
> So why am I writing? Well, I would like to know if the community has
> interest in such a thing - an app for Wikidata/Wikibase. Anything else
> can wait until this question is answered, since it will take time to
> bring it into a publishable state. Please consider that this would give
> the opportunity to obtain access to the sensors of a mobile device, like
> geolocation, camera, etc., and make them usable by Wikibase/Wikidata.
> If the community sees value in this little project, I would also like to
> ask for support, and whether there are people who are willing to embark on
> this with me.
> I hope you have a pleasant rest of the day.
>
> Cheers
>
>
> Matthias
>
>
> Endnotes:
> [1]: https://kotlinlang.org/
> [2]: https://kotlinlang.org/docs/multiplatform.html
> [3]: https://box.hu-berlin.de/d/66b0055734c3485b8d22/
> [4]: https://box.hu-berlin.de/f/45c45774cff14c5fad55/?dl=1
> [5]: https://www.javatpoint.com/how-to-install-apk-on-android
> [6]: https://test.wikidata.org/
> ___
> Wikidata mailing list -- wikidata@lists.wikimedia.org
> To unsubscribe send an email to wikidata-le...@lists.wikimedia.org
>
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Licensing discussion for Abstract Wikipedia and Wikifunctions

2021-12-06 Thread Denny Vrandečić
Hi all,

We are currently looking for input on the question of what licensing
structure we should apply to Abstract Wikipedia and Wikifunctions.

After some initial discussion, the following two questions in particular
remain open and would benefit from your input or vote:

1) Should Abstract Content for Abstract Wikipedia be published under CC 0
or CC BY-SA (or is either fine)?

2) Should code in Wikifunctions be published under Apache or the GPL (or is
either fine)?

Your input would be very much welcome!

https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Licensing_discussion

Thank you,
Denny
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: History of some original Wikidata design decisions?

2021-07-22 Thread Denny Vrandečić
Hi Thad,

Thanks for asking the questions, and thanks Tobi for the pointers. Man,
what a lengthy post it was.

I understand that the post answered most of your questions. I think that it
is entirely possible to layer a prototype semantics over Wikidata, just as
the DL semantics have been layered over it. I don't remember if such work
has been done before.

Regarding ISO 5964, I think I probably have looked through it at some
point, but I don't remember it anymore. SKOS has certainly been a stronger
influence, and obviously OWL.

I hope that helps with the historical deep dive :) Lydia and I really
should write that book!

Cheers,
Denny



On Sat, Jul 10, 2021 at 3:00 PM Thad Guidry  wrote:

> *Tobi - *That blog post 3 is very helpful.  It shows that Denny and I
> think alike and agree on everything. :-)  His dislike for strong
> classification is part of my basis for allowing weak relations much more,
> and for using them.  But how to allow them?  I think the only way,
> currently, is through properties based on the Data Model.
> There are many ways, and SKOS is one way of allowing the expression of weak
> relations, and we already have some good support with existing properties
> like P4390 mapping relation type
> and a host of others.
>
> Denny and I also fear the same things, like not having a flexible enough
> system to describe our complex world that doesn't always fit into strict
> rules.  Which is kinda why I've always liked
> https://www.w3.org/TR/skos-primer/#secassociative
> because of its non-transitivity, which allows much flexibility and, as he
> and I would say... avoids "Barbara". :-)
> Which is pretty much summarized in
> https://www.w3.org/TR/skos-primer/#secadvanced
>
> Sorry for all the SKOS links, but semantic relations help to describe
> human knowledge.  How a system represents or portrays semantic relations is
> where choices are made or have been made.  *And I think the right choices
> were definitely made.*
> Overlaying SKOS and the Wikidata properties that sprinkle it into the data
> model is useful, but I've always been kind of reluctant to do
> that...probably for the same reasons Denny might give?  Choices between
> allowing "semantic accuracy" versus "semantic flexibility".  But I think
> systems like SKOS provide both.  Perhaps it could be argued that OWL
> provides much less. :-)  Still all KOSs provide great use when they fit
> well.  How they can fit over Wikidata, as I said, is probably only through
> properties at this late stage of design and that's fine with me!
>
> Still, my main focus is and always will be trying to add human knowledge
> about concept relations into Wikidata to help machines, to help us.  (the
> "edges" that humans quickly can deduce in seconds, but still to this day
> can sometimes take machines days or weeks to figure out).
>
> My usage and help to Abstract Wikipedia and Wikidata later on will
> primarily be around the mapping of relations ... where a lot of the
> possibilities have already been described years and years ago at the very
> bottom of this long page:
> *inter-KOS mapping relationships  <-- *very last row, 3rd column
> https://www.w3.org/TR/skos-primer/#seccorrespondencesISO
>
>
> *Denny - * were you part of, or lightly influenced by, ISO 5964 through
> the German ISO member DIN, or not? That would also be good to know.
>
> Thad
> https://www.linkedin.com/in/thadguidry/
> https://calendly.com/thadguidry/
>
>
> On Sat, Jul 10, 2021 at 3:17 PM Tobi Gritschacher <
> tobias.gritschac...@wikimedia.de> wrote:
>
>> Hi,
>>
>> It would be nice to have a place to look with a link to a page in the
>>> Community portal that says "History of Wikidata's design and early
>>> collected meetings, notes, design documents, recordings"
>>>
>>
>> Might not answer your concrete question, but here are some (very) early
>> blog posts by Denny. They are still a nice read. :)
>>
>> 1/3
>> https://blog.wikimedia.de/2013/02/22/restricting-the-world/
>>
>> 2/3
>> https://newwwblog.wikimedia.de/2013/06/04/on-truths-and-lies/
>>
>> 3/3
>> https://blog.wikimedia.de/2013/09/12/a-categorical-imperative/
>>
>> Cheers, Tobi
>> ___
>> Wikidata mailing list -- wikidata@lists.wikimedia.org
>> To unsubscribe send an email to wikidata-le...@lists.wikimedia.org
>>
> ___
> Wikidata mailing list -- wikidata@lists.wikimedia.org
> To unsubscribe send an email to wikidata-le...@lists.wikimedia.org
>
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Focus languages for improvements to the lexicographic extension of Wikidata and Abstract Wikipedia

2021-03-03 Thread Denny Vrandečić
The on-wiki version of this is here:
https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Focus_languages

Hello all,

The Wikidata team at Wikimedia Deutschland will be working on improvements
to the lexicographic data part of Wikidata during this year. The Abstract
Wikipedia team at the Wikimedia Foundation will be working on the
generation of natural language text for baseline Wikipedia articles in the
next few years, and on functions in Wikifunctions to work with
lexicographic data. For these cases, it would be beneficial to focus on a
small specific set of languages at first. Participating communities will
hopefully find that this project leads to long-term growth in Wikipedia and
Wiktionary in and about their language.

Lydia and Denny would like to choose the same focus languages for both
teams, as it is beneficial for both projects to have this aligned.

We will be working closely together with the focus communities over the
next few years. This means that features will land first in these languages
and we will have particularly active feedback channels. We are looking for
communities that are open to trying out new things.

The decision of which languages should be the focus languages should be
made together with the wider communities. In particular, we would like to
make the decision with a promising self-selecting community. This worked
very well for Wikidata, where the focus projects were self-selected.

We will use English as a demonstration language and two or three other
languages as focus languages. English is chosen as it is easy to
demonstrate to a wide audience and is a working language for both
development teams.

For the focus languages, we want to work with an active and enthusiastic
community or seed of a community over the next few years on these projects.

In order to be fully transparent, we have compiled a number of other
detailed criteria

we would like to use to guide us in our decision, but this assumes that
there are communities to choose from. None of these criteria are set in
stone, and we are happy to discuss them, remove some if they are not good
ideas, or add others if we missed something. Regard this as a strawdog
proposal. For example, Mahir Morshed
 came up with a complementary
set of criteria on Phabricator
, which we will consider
in the selection as well. We will have Q&A office hours for discussion, and
are open to comments via wiki

or email.

We are thinking of a two-pronged approach:

   - first, to call for communities to propose themselves to work with us;

   - second, to look at the data and see which languages would be good
     candidates.


We don’t want to set too strict a process. We would like the second prong
of the approach to go on throughout the whole process to help us come to a
good understanding of the options.

For the first prong, we would like the candidate seed groups to describe
and nominate themselves on wiki, following a short form
.
Nominations should be submitted by April 7, and the decision will be made
by April 14 by the teams taking your comments into account. If we notice
that self-nominations are not happening, we will try to engage with
language communities directly.

It is possible that the two teams will choose different candidates,
although we will try to avoid that.

We are looking forward to hearing about what you think of this proposal.
Please comment on the talk page on wiki

.

Lydia and Denny
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Wikidata reaches Q100000000

2020-10-06 Thread Denny Vrandečić
A short blogpost by Lydia and me on the diff blog:

https://diff.wikimedia.org/2020/10/06/wikidata-reaches-q1/

Congratulations to the community, congratulations to the project!
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Publishing lexical masks for Wikidata lexicographic data

2020-06-22 Thread Denny Vrandečić
We have released lexical masks as ShEx files before, schemata for
lexicographic forms that can be used to validate whether the data is
complete.

We saw that it was quite challenging to turn these ShEx files into forms
for entering the data, such as Lucas Werkmeister’s Lexeme Forms. So we
adapted our approach slightly: we now publish JSON files that keep the
structures in a format that is easier to parse and understand, and also
provide a script that translates these JSON files into ShEx Entity Schemas.

Furthermore, we published more masks for more languages and parts of speech
than before.

Full documentation can be found on wiki:
https://www.wikidata.org/wiki/Wikidata:Lexical_Masks#Paper

Background can be found in the paper:
https://www.aclweb.org/anthology/2020.lrec-1.372/

Thanks Bruno, Saran, and Daniel for your great work!
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] [Wikidata-tech] New technical writer for Wikibase/Wikidata

2020-05-19 Thread Denny Vrandečić
Welcome to Wikidata! Thanks for taking on such an important task for
outreach!

On Tue, May 19, 2020 at 3:02 AM Dan Shick  wrote:

> Hi all!
>
> I’m Dan Shick ( https://w.wiki/RDs ), the new technical writer at
> Wikimedia Deutschland. My goals are to discover, improve, unify and
> round out documentation for the Wikibase & Wikidata development team;
> my specific duties are defined by my team leadership and the
> leadership of both products.
>
> I see a lot of documentation out there, and it needs organizing so
> that people of every audience can find the information they’re looking
> for. Audiences include volunteers & the community, employees of
> Wikimedia Deutschland and independent users of the products, and I see
> plenty of overlap between those groups. Perhaps most importantly, if
> the documentation someone needs doesn’t exist, I want to see it get
> written.
>
> My first task is to collect and improve the Wikibase post-install
> documentation. I have a lot of resources already on the table, but of
> course I welcome pointers to and feedback on any and all existing
> documentation.
>
> You'll find this text on my wiki page as well; if you want to say hi
> or have any questions or comments, feel free to shoot me an email or
> speak up on my talk page.
>
> Wiki: https://meta.wikimedia.org/wiki/User:Dan_Shick_(WMDE)
> Phabricator: https://phabricator.wikimedia.org/p/danshick-wmde/
>
> --
>
> Dan Shick
> Technical Writer
>
> Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
> Phone: +49 (0)30 219 158 26-0 (reception)
> https://wikimedia.de
>
> Stay up to date with news and stories about Wikimedia, Wikipedia and
> free knowledge by subscribing to our (German) newsletter.
>
> We envision a world where all human beings can freely share in the sum
> of all knowledge. Help us achieve that vision! Donate at
> https://spenden.wikimedia.de .
>
> Wikimedia Deutschland – Gesellschaft zur Förderung Freien Wissens e.
> V. Eingetragen im Vereinsregister des Amtsgerichts
> Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig
> anerkannt durch das Finanzamt für Körperschaften I Berlin,
> Steuernummer 27/029/42207.
>
> ___
> Wikidata-tech mailing list
> wikidata-t...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Proposal for a new Wikimedia project: Wikilambda

2020-05-05 Thread Denny Vrandečić
Hello all,

after talking about it a few times here, the official proposal for creating
the multilingual Wikipedia is now on Meta.

https://meta.wikimedia.org/wiki/Wikilambda

The idea is to create abstract, language-independent content in Wikidata,
and then translate it into natural language using functions. These functions
will be defined and maintained in a new Wikimedia project, which I have
preliminarily called Wikilambda.

Wikilambda will be a new Wikimedia project that allows creating, maintaining,
cataloging, and evaluating functions about all kinds of things. You can find
a lot of further details in the link above. If you have any questions, I am
happy to answer them.

The official project proposal process basically says, make the proposal
here, and then go and tell everyone, and at some point, the Board might
look at this and say, yes good idea.

So I would love to collect many of your voices and support signatures, so
that I can go to the Board and tell them look at this :) So please sign
here:

https://meta.wikimedia.org/wiki/Talk:Wikilambda

Thank you,
Denny
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Partial RDF dumps

2020-05-01 Thread Denny Vrandečić
Kingsley,

thanks for suggesting that feature. Since you already have that feature,
could you let us know how often the UI option for these output formats is
used? That could help with prioritising.

My uninformed hunch would be that there isn't much demand for selecting the
format via the UI, and that it is more relevant to have automated calls be
able to do that format selection, which the endpoint already provides.
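
For automated calls, the format selection boils down to the Accept header; a
minimal sketch in Python (the CONSTRUCT query itself is only a placeholder):

# Minimal sketch: request Turtle (or RDF/XML) from the endpoint by content
# negotiation. The CONSTRUCT query is just a placeholder.
import requests

query = "CONSTRUCT { ?item ?p ?o } WHERE { ?item ?p ?o } LIMIT 10"

resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query},
    headers={"Accept": "text/turtle"},  # or "application/rdf+xml"
    timeout=60,
)
print(resp.text[:500])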

Thanks,
Denny


On Fri, May 1, 2020 at 9:59 AM Kingsley Idehen 
wrote:

> On 5/1/20 11:53 AM, Isaac Johnson wrote:
>
> If the challenge is downloading large files, you can also get local access
> to all of the dumps (wikidata, wikipedia, and more) through the PAWS
>  (Wikimedia-hosted Jupyter
> notebooks) and Toolforge
>  (more
> general-purpose Wikimedia hosting environment). From Toolforge, you could
> run the Wikidata toolkit (Java) that Denny mentions. I'm personally more
> familiar with Python, so my suggestion is to use Python code to filter down
> the dumps to what you desire. Below is an example Python notebook that will
> do this on PAWS, though the PAWS environment is not set up for these longer
> running jobs and will probably die before the process is complete, so I'd
> highly recommend converting it into a script that can run on Toolforge (see
> https://wikitech.wikimedia.org/wiki/Help:Toolforge/Dumps).
>
> PAWS example:
> https://paws-public.wmflabs.org/paws-public/User:Isaac_(WMF)/Simplified_Wikidata_Dumps.ipynb
>
> Best,
> Isaac
>
> That isn't my challenge.
>
> I wanted to know why the WDQ UI doesn't provide an option for CONSTRUCT
> and DESCRIBE query solutions using a variety of document types.
>
> See: https://wikidata.demo.openlinksw.com/sparql to see what I mean.
> Ditto any DBpedia endpoint.
>
>
> Kingsley
>
>
> On Thu, Apr 30, 2020 at 1:33 AM raffaele messuti 
> wrote:
>
>> On 27/04/2020 18:02, Kingsley Idehen wrote:
>> >> [1] https://w.wiki/PBi 
>> >>
>> > Do these CONSTRUCT queries return any of the following document
>> content-types?
>> >
>> > RDF-Turtle, RDF-XML, JSON-LD ?
>>
>> you can use content negotiation on the sparql endpoint
>>
>> ~ query='CONSTRUCT { ... }'
>> ~ curl -H "Accept: application/rdf+xml" -G https://query.wikidata.org/sparql \
>>     --data-urlencode query="$query"
>> ~ curl -H "Accept: text/turtle" -G https://query.wikidata.org/sparql \
>>     --data-urlencode query="$query"
>>
>>
>>
>> --
>> raffa...@docuver.se
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
>
> --
> Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation
>
> ___
> Wikidata mailing 
> listWikidata@lists.wikimedia.orghttps://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
> --
> Regards,
>
> Kingsley Idehen   
> Founder & CEO
> OpenLink Software
> Home Page: http://www.openlinksw.com
> Community Support: https://community.openlinksw.com
> Weblogs (Blogs):
> Company Blog: https://medium.com/openlink-software-blog
> Virtuoso Blog: https://medium.com/virtuoso-blog
> Data Access Drivers Blog: 
> https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
>
> Personal Weblogs (Blogs):
> Medium Blog: https://medium.com/@kidehen
> Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
>   http://kidehen.blogspot.com
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn: http://www.linkedin.com/in/kidehen
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
> : 
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Partial RDF dumps

2020-04-29 Thread Denny Vrandečić
CONSTRUCT would be best, but I am not sure that there's any system that
allows you to do that.

What I would do is get the truthy dump in N-Triples, and filter it down to
the lines with the respective properties. The Wikidata Toolkit allows you to
do that and more.

https://www.mediawiki.org/wiki/Wikidata_Toolkit
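
Even without the toolkit, that filtering is a single pass over the dump; a
minimal sketch, assuming a local copy of the truthy N-Triples dump
(latest-truthy.nt.gz):

# Minimal sketch: keep only "instance of" (P31) and "subclass of" (P279)
# triples from a local copy of the truthy N-Triples dump.
import gzip

DUMP = "latest-truthy.nt.gz"  # adjust to wherever you downloaded the dump
KEEP = (
    "<http://www.wikidata.org/prop/direct/P31>",
    "<http://www.wikidata.org/prop/direct/P279>",
)

with gzip.open(DUMP, "rt", encoding="utf-8") as src, \
        open("taxonomy.nt", "w", encoding="utf-8") as dst:
    for line in src:
        # N-Triples is "subject predicate object ." -- check the predicate.
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[1] in KEEP:
            dst.write(line)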

On Mon, Apr 27, 2020 at 2:35 AM Ece Toprak  wrote:

> Hi,
>
> I am currently working on a NER project at school and would like to know
> if there is a way to generate RDF dumps that only contain "instance of" or
> "subclass of" relations.
>  I have found these dumps:
> RDF Exports from Wikidata
> 
> Here, under "simplified and derived dumps" taxonomy and instances dumps
> are very useful for me but unfortunately very old.
> It would be great if I could generate up to date dumps.
>
> Thank You,
> Alkım Ece Toprak
> Bogazici University
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Weekly Summary #411

2020-04-29 Thread Denny Vrandečić
If you're interested in all newly created schemas, you can actually follow a
feed for that.

https://www.wikidata.org/wiki/Special:RecentChanges?hidebots=1&hideminor=1&hidepageedits=1&hidecategorization=1&hidelog=1&namespace=640&limit=50&days=14&damaging__likelybad_color=c4&damaging__verylikelybad_color=c5&urlversion=2

There's a link to the atom feed on the left hand toolbar.
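
If you'd rather poll it from a script than subscribe to the feed, the same
filter works through the API; a minimal sketch, assuming namespace 640 (the
EntitySchema namespace used in the URL above):

# Minimal sketch: list newly created EntitySchemas via the MediaWiki API.
import requests

params = {
    "action": "query",
    "list": "recentchanges",
    "rcnamespace": 640,  # EntitySchema namespace
    "rctype": "new",     # only page creations
    "rclimit": 50,
    "format": "json",
}
resp = requests.get("https://www.wikidata.org/w/api.php", params=params,
                    timeout=30)
for change in resp.json()["query"]["recentchanges"]:
    print(change["timestamp"], change["title"])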

Cheers,
Denny

On Tue, Apr 14, 2020 at 6:58 AM Léa Lacroix 
wrote:

> Hello Mike,
>
> The category is not especially new, one can fill it every week on
> https://www.wikidata.org/wiki/Wikidata:Status_updates/Next
> But it doesn't appear every week because I remove it when no one adds
> Schema examples.
> Basically, the idea is to add some interesting Schemas - not all the newly
> created ones, that would be too much, but any Schemas on Wikidata that you
> find especially well-described or interesting for the community to know
> about.
>
> To know more about EntitySchemas: Wikiproject Schemas
> 
>
> I hope that answers your question :)
> Cheers,
> Léa
>
> On Tue, 14 Apr 2020 at 15:50, Mike Bergman  wrote:
>
>> Hi,
>>
>> Are not the 'Schema examples' a new category of reporting in this
>> summary? Is there a wiki reference that describes this category more?
>>
>> Thanks!
>>
>> Best, Mike
>> On 4/14/2020 8:23 AM, Léa Lacroix wrote:
>>
>> *Here's your quick overview of what has been happening around Wikidata
>> over the last week.*
>>
>> Discussions
>>
>>    - CheckUser nominations: Sotiale, Jasper Deng, Romaine
>>
>> Events
>>
>>    - Ongoing: Wikidata Lab XXII on the Wikidata Wikiproject COVID-19,
>>      April 14, 1pm UTC, remote, in English. Event page
>>    - Upcoming: live SPARQL queries in French by Vigneron, Tuesday April 14
>>      at 20:00 CEST (UTC+2)
>>    - Upcoming: Online Wikidata editathon in Swedish #9, April 19
>>    - Past: Wikidata and Wikibase office hour, April 7, on Telegram. Notes
>>      of the discussion
>>    - Past: Wikidata topic at the virtual biohackathon. Mid-term updates and
>>      final presentations (Wikidata report starts at 1:17:40)
>>
>> Press, articles, blog posts, videos
>>
>>    - *A protocol for adding knowledge to Wikidata, a case report* (new
>>      preprint about virus info (strains, genes, proteins), ShEx, and SPARQL)
>>    - *Multilingual enrichment of disease biomedical ontologies* ("We look at
>>      the coverage of two biomedical ontologies focusing on diseases with
>>      respect to Wikidata for 9 European languages")
>>    - *Wikidata and the bibliography of life in the time of coronavirus*
>>    - Video: Editing Wikidata and creating a property proposal: YouTube,
>>      Facebook, Periscope
>>
>> Tool of the week
>>
>>    - COVID19 Dashboard is a Wikidata-powered one-stop
>>      information/visualization service for COVID19-related topics such as
>>      COVID19's outbreak map, deaths, symptoms, taxonomy, and publications.
>>
>> Other Noteworthy Stuff
>>
>>    - A database breakage, also affecting connected sister projects such as
>>      Wikipedia, on April 6, 11pm UTC. A fix has been deployed and no data
>>      has been lost. However, issues related to sitelinks and bots creating
>>      duplicates can still occur.
>>    - You're welcome to give feedback on ideas of improvements for the
>>      Query Service interface

Re: [Wikidata] [Wikidata-tech] New Community Communications Manager for Wikidata/Wikibase

2020-04-21 Thread Denny Vrandečić
Mohammed,

welcome! I am very happy to see you join in this important role.

Thank you,
Denny



On Tue, Apr 21, 2020 at 9:11 AM Léa Lacroix 
wrote:

> Welcome onboard Mohammed!
> I'm glad that you're here and to have your support in order to address the
> requests from the community and to communicate about the software we are
> building at Wikimedia Germany :)
>
> On Tue, 21 Apr 2020 at 18:01, Mohammed Sadat Abdulai <
> mohammed.sadat_...@wikimedia.de> wrote:
>
>> Hi everyone,
>>
>> I hope you’re all having a great day!
>>
>> I’m super excited to announce that I’ll be joining the software
>> department at Wikimedia Germany to help advance engagement between the
>> software development team and the communities using and contributing to
>> Wikidata/Wikibase.
>>
>> Together with Léa, Sam and Lydia, I will be liaising with the different
>> user groups within the Wikibase community to provide information about
>> software changes and promote a smooth and productive collaboration between
>> stakeholders. You can share bug reports with us at <
>> https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team>
>>
>> Please leave a note on my talkpage <
>> https://www.wikidata.org/wiki/User_talk:Mohammed_Sadat_(WMDE)> or write
>> to me directly anytime you encounter issues with the Wikibase software so
>> that I can bring them to the development team:
>>
>>
>> * mohammed.sadat_...@wikimedia.de
>>
>> * Telegram (@masssly) and IRC (mabdulai)
>>
>> Best Regards,
>>
>> --
>>
>> Mohammed Sadat Abdulai
>>
>> *Community Communications Manager for Wikidata/Wikibase*
>>
>> Wikimedia Deutschland e.V.
>> Tempelhofer Ufer 23-24
>> 10963 Berlin
>> www.wikimedia.de
>> ___
>> Wikidata-tech mailing list
>> wikidata-t...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
>>
>
>
> --
> Léa Lacroix
> Project Manager Community Communication for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt
> für Körperschaften I Berlin, Steuernummer 27/029/42207.
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Proposal towards a multilingual Wikipedia and a new Wikipedia project

2020-04-20 Thread Denny Vrandečić
Hey Luca,

thank you so much for taking the time to read the proposal in detail, and
managing to get through it.

Yes, I agree, you are right. This proposal, just like Wikidata, is good at
exposing and identifying biases, but not that good at actually fixing them.
I also think that this is OK - the fixing should probably not be something a
tool does; it is up to us as the communities to get it done.

In fact, as you can find in the Wikipedia@20 essay

https://wikipedia20.pubpub.org/pub/vyf7ksah

one of the considerations for Abstract Wikipedia is indeed to make it more
explicit which biases are intentional in a language edition, and which ones
are not, in the hope that we can then tackle certain biases in some language
editions in a more targeted way.

Thank you for your kind words, and I am also very excited to get this thing
moving! :)

Stay safe,
Denny



On Fri, Apr 17, 2020 at 7:56 AM Luca Martinelli 
wrote:

> Hey Denny,
>
> I've finally managed to reach the conclusion of your paper. It's truly
> a lot to digest, especially for people like me who do not know much of
> these things, but it was a really thorough and interesting read.
>
> One thing that got me thinking is the part about biases - please note
> that it is not a critique, merely a *very* confused, and possibly
> fairly terrible, thought about it :)
>
> So, biases exist, they inform most of the discussion about delicate
> matters, and we're all "POV healthy carriers". In a way, this is fine
> to a certain limit. Many people, though, "refuse" (for a lack of
> better words) to acknowledge that in relation to themselves, and this
> is a huge problem to overcome when we need to find a way to establish
> a consensus about an as-much-as-NPOV-possible text about $subject.
>
> My fear is that all our attempts, from Wikidata to your proposal, are
> extremely good at noticing where bias is or might be, where we should
> point our attention to, but still aren't enough to fight back the
> refusal to acknowledge a bias. We still need a way to find people
> willing to tackle this, or at least giving them enough motivation to
> solve this in our current working framework.
>
> On the bright side, I can see so many applications of your project
> that I can't wait for it to happen. :)
>
> L.
>
>
>
>
>
> Il giorno mar 14 apr 2020 alle ore 03:10 Denny Vrandečić
>  ha scritto:
> >
> > I sent a long email to Wikimedia-l and also made the same post to Meta.
> I published a new paper recently with a proposal for a multilingual
> Wikipedia and more, and, unsurprisingly, Wikidata plays a central role in
> that proposal. I am trying to have the discussion not to be too fragmented,
> so I hope it will happen on Meta or on Wikimedia-l, but I also wanted to
> give a ping here.
> >
> > Stay safe!
> > Denny
> >
> > [1]
> https://lists.wikimedia.org/pipermail/wikimedia-l/2020-April/094621.html
> > [2]
> https://meta.wikimedia.org/wiki/Wikimedia_Forum#Proposal_towards_a_multilingual_Wikipedia_and_a_new_Wikipedia_project
> > [3] https://arxiv.org/abs/2004.04733
> >
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
>
> --
> Luca "Sannita" Martinelli
> http://it.wikipedia.org/wiki/Utente:Sannita
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Proposal towards a multilingual Wikipedia and a new Wikipedia project

2020-04-13 Thread Denny Vrandečić
I sent a long email to Wikimedia-l and also made the same post to Meta. I
published a new paper recently with a proposal for a multilingual Wikipedia
and more, and, unsurprisingly, Wikidata plays a central role in that
proposal. I am trying to have the discussion not to be too fragmented, so I
hope it will happen on Meta or on Wikimedia-l, but I also wanted to give a
ping here.

Stay safe!
Denny

[1] https://lists.wikimedia.org/pipermail/wikimedia-l/2020-April/094621.html
[2]
https://meta.wikimedia.org/wiki/Wikimedia_Forum#Proposal_towards_a_multilingual_Wikipedia_and_a_new_Wikipedia_project
[3] https://arxiv.org/abs/2004.04733
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata-powered COVID19 Dashboard

2020-04-12 Thread Denny Vrandečić
It is allowed (and in fact encouraged) to embed Wikidata query results in
your site. That was one of the original use cases.

Also, I wouldn't worry tremendously about the load on the Query server from
that as this is particularly well cachable and cached. To the best of my
knowledge, embedded queries are not the ones causing issues to WDQS.

On Sun, Apr 12, 2020 at 5:59 AM David McDonell  wrote:

> Perhaps consider joining/posting/linking this terrific new tool within
> this global collaboration community:
>
> https://covid-19.cognitive.city/cognitive/welcome
>
>
> On Sun, Apr 12, 2020 at 5:24 AM Jan Ainali  wrote:
>
>> Why would it not be allowed?
>> If it is not, WDQS is at least complicit, since under "Link" it has the
>> option "Embed result" that gives you the iframe tag to use.
>>
>> Jan Ainali
>>
>> On Sun, 12 Apr 2020 at 10:28, Stryn  wrote:
>>
>>> Not sure if it is allowed to frame a Wikidata query on your site; at least
>>> I don't like sites that have an external site inside a frame.
>>> Also, having many queries there makes loading of your site slow. My
>>> phone was unable to load the site, it was crashing.
>>>
>>> *Stryn*
>>>
>>> *Wikimedia StewardAdmin and checkuser on the Finnish Wikipedia** |
>>> Admin on Wikidata*
>>> * | Admin on Meta-Wiki*
>>>
>>>
>>> On Sun, 12 Apr 2020 at 08:54, Markus Bärlocher <
>>> markus.baerloc...@lau-net.de> wrote:
>>>
 Hi, looks very nice!
 Big amount of data, nice layout! Thanks :-)

 But:
 The absolute number of persons is not very useful.
 Please use the relative number per 100'000 people.
 Or add at least the relative number.

 Thanks, Markus


 On 12.04.2020 at 07:29, Fariz Darari wrote:
 > Hello all,
 >
 > COVID19 Dashboard (https://sites.google.com/view/covid19-dashboard/),
 a
 > one-stop information/visualization service for COVID19-related topics,
 > is out now!
 >
 > The dashboard data is pulled from Wikidata, and displays COVID19's:
 > - Factbox
 > - Map
 > - Cases
 > - Deaths
 > - Victims
 > - Symptoms
 > - Possible Treatments
 > - Health Specialties
 > - Taxonomy
 > - Images
 > - Publications
 >
 > Take a look: https://sites.google.com/view/covid19-dashboard/
 >
 > Feedback is welcome, thanks!
 >
 > Regards,
 > Fariz

 ___
 Wikidata mailing list
 Wikidata@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata

>>> ___
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
> --
> David McDonell Co-founder & CEO ICONICLOUD, Inc. "Illuminating the cloud"
> M: 703-864-1203 EM: da...@iconicloud.com URL: http://iconicloud.com
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Qualifiers/sources for YSO authority links

2020-03-13 Thread Denny Vrandečić
Yes, that sounds good to me.

Either create an item for that (preferred) or link to the URL directly.
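
For the QuickStatements batch itself, the reference can go on the same line
as the statement; a minimal sketch generating v1-style rows (the exact
quoting rules are best double-checked against the QuickStatements help, and
the mapping data below is made up):

# Minimal sketch: emit QuickStatements v1 (tab-separated) lines that add a
# YSO ID (P2347) together with a "reference URL" (S854) source.
mappings = [
    # (Wikidata item, YSO ID, URL documenting the mapping batch) -- made up
    ("Q42", "12345", "https://example.org/yso-wikidata-mapping.md"),
]

for qid, yso_id, source_url in mappings:
    print("\t".join([qid, "P2347", f'"{yso_id}"', "S854", f'"{source_url}"']))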

On Fri, Mar 13, 2020, 04:06 Osma Suominen  wrote:

> Thanks Denny.
>
> Do you have a practical suggestion how to do this? There's no obvious
> source URL to refer to currently. What Joachim did was to set up a small
> document on GitHub and refer to that in the statements. Should I do
> something similar here?
>
> -Osma
>
> Denny Vrandečić wrote on 12.3.2020 at 20.35:
> > When we were uploading the links to Freebase, we also added references
> > for these. And since you've gone through all this work (thank you for
> > that!) verifying the links, I think it would be fair to add a respective
> > reference.
> >
> >
> >
> > On Tue, Mar 10, 2020 at 9:27 AM Osma Suominen  > <mailto:osma.suomi...@helsinki.fi>> wrote:
> >
> > Hi,
> >
> > I'm about to import around 7,000 P2347 mappings (YSO ID authority
> > links)
> > between Wikidata items and YSO (General Finnish Ontology) concepts to
> > Wikidata using QuickStatements2. I'm following the excellent example
> of
> > Joachim Neubert's work at ZBW, documented e.g. here:
> >
> http://zbw.eu/labs/en/blog/wikidata-as-authority-linking-hub-connecting-repec-and-gnd-researcher-identifiers
> >
> > The mappings were collected from several sources:
> > 1. Mappings between KOKO (related to YSO) and Wikidata curated by the
> > Finnish Broadcasting Company Yle (kindly given to us, but not
> publicly
> > available AFAIK)
> > 2. Indirect mappings derived from Wikidata-LCSH and YSO-LCSH mappings
> > 3. Algorithmic matching suggestions for frequently used YSO concepts
> >
> > In all these cases, the mappings have been verified by vocabulary
> > managers here at the National Library of Finland, so we're not just
> > blindly copying the information from the above sources.
> >
> > I'm wondering about whether to add source/qualifier statements to the
> > mapping statements I'm about to add. I see that in most cases,
> > authority
> > links don't have any source information. For this batch, I could
> > potentially document several bits of provenance information:
> >
> > 1. Where the (suggested) statement originally came from (e.g. Yle
> > and/or
> > indirect LCSH mapping)
> > 2. That we have verified it here at NLF
> >
> > I see that Joachim used source statements like this for his imported
> > links:
> >
> > title (P1476):
> > Derived from ZBW's RAS-GND authors mapping (English)
> >
> > reference URL (P854):
> >
> https://github.com/zbw/repec-ras/blob/master/doc/RAS-GND-author-id-mapping.md
> >
> > Is this still best practice or should I use something else? Or just
> > import the raw links without any qualifiers or sources?
> >
> > Thanks in advance,
> > Osma
> >
> > --
> > Osma Suominen
> > D.Sc. (Tech), Information Systems Specialist
> > National Library of Finland
> > P.O. Box 15 (Unioninkatu 36)
> > 00014 HELSINGIN YLIOPISTO
> > Tel. +358 50 3199529
> > osma.suomi...@helsinki.fi <mailto:osma.suomi...@helsinki.fi>
> > http://www.nationallibrary.fi
> >
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
> >
> >
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
> >
>
>
> --
> Osma Suominen
> D.Sc. (Tech), Information Systems Specialist
> National Library of Finland
> P.O. Box 26 (Kaikukatu 4)
> 00014 HELSINGIN YLIOPISTO
> Tel. +358 50 3199529
> osma.suomi...@helsinki.fi
> http://www.nationallibrary.fi
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Qualifiers/sources for YSO authority links

2020-03-12 Thread Denny Vrandečić
When we were uploading the links to Freebase, we also added references for
these. And since you've gone through all this work (thank you for that!)
verifying the links, I think it would be fair to add a respective reference.



On Tue, Mar 10, 2020 at 9:27 AM Osma Suominen 
wrote:

> Hi,
>
> I'm about to import around 7,000 P2347 mappings (YSO ID authority links)
> between Wikidata items and YSO (General Finnish Ontology) concepts to
> Wikidata using QuickStatements2. I'm following the excellent example of
> Joachim Neubert's work at ZBW, documented e.g. here:
>
> http://zbw.eu/labs/en/blog/wikidata-as-authority-linking-hub-connecting-repec-and-gnd-researcher-identifiers
>
> The mappings were collected from several sources:
> 1. Mappings between KOKO (related to YSO) and Wikidata curated by the
> Finnish Broadcasting Company Yle (kindly given to us, but not publicly
> available AFAIK)
> 2. Indirect mappings derived from Wikidata-LCSH and YSO-LCSH mappings
> 3. Algorithmic matching suggestions for frequently used YSO concepts
>
> In all these cases, the mappings have been verified by vocabulary
> managers here at the National Library of Finland, so we're not just
> blindly copying the information from the above sources.
>
> I'm wondering about whether to add source/qualifier statements to the
> mapping statements I'm about to add. I see that in most cases, authority
> links don't have any source information. For this batch, I could
> potentially document several bits of provenance information:
>
> 1. Where the (suggested) statement originally came from (e.g. Yle and/or
> indirect LCSH mapping)
> 2. That we have verified it here at NLF
>
> I see that Joachim used source statements like this for his imported links:
>
> title (P1476):
> Derived from ZBW's RAS-GND authors mapping (English)
>
> reference URL (P854):
>
> https://github.com/zbw/repec-ras/blob/master/doc/RAS-GND-author-id-mapping.md
>
> Is this still best practice or should I use something else? Or just
> import the raw links without any qualifiers or sources?
>
> Thanks in advance,
> Osma
>
> --
> Osma Suominen
> D.Sc. (Tech), Information Systems Specialist
> National Library of Finland
> P.O. Box 15 (Unioninkatu 36)
> 00014 HELSINGIN YLIOPISTO
> Tel. +358 50 3199529
> osma.suomi...@helsinki.fi
> http://www.nationallibrary.fi
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Reduced loading times for Wikidata/Wikimedia Commons

2020-02-11 Thread Denny Vrandečić
That's awesome, thanks!

On Tue, Feb 11, 2020 at 3:46 AM Léa Lacroix 
wrote:

> Hey,
>
> On Tue, 11 Feb 2020 at 12:30, Nicolas VIGNERON 
> wrote:
>
>> Great !
>>
>> Two small perfectionist questions:
>> - is it finished or can we maybe go further?
>>
> It is finished for now, but we should continue monitoring it so new
> modules don't accumulate again.
> In order to keep a low loading time, we should especially be careful about
> not adding too many gadgets to the projects (prefer individual scripts or
> external tools), and make sure that the existing or new gadgets are not
> requesting too many resources.
>
> - can it be done/replicated on other Wikimedia projects?
>>
> WMF has been doing it for other projects as well, see this blog post
> .
>
>
>>
>> Cheers, ~nicolas
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
>
> --
> Léa Lacroix
> Project Manager Community Communication for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt
> für Körperschaften I Berlin, Steuernummer 27/029/42207.
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Status of Wikidata Query Service

2020-02-11 Thread Denny Vrandečić
Oh, wow, I just tried that out too, and indeed, it used to be possible to
link to the L-number very quickly if I remember correctly, but now this is
not the case anymore.

Weirdly enough, the SPARQL endpoint got updated with the new Lexeme very
quickly. So I think these two things are not related.
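
As a quick way to check whether a new Lexeme has already reached the SPARQL
endpoint, a minimal sketch (the ASK pattern just tests that any triple about
the Lexeme exists, using L245454 from Finn's example):

# Minimal sketch: check whether a Lexeme is already visible on WDQS.
import requests

def lexeme_on_wdqs(l_id: str) -> bool:
    query = f"ASK {{ wd:{l_id} ?p ?o }}"  # true once any triple is present
    resp = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": query, "format": "json"},
        timeout=30,
    )
    return resp.json()["boolean"]

print(lexeme_on_wdqs("L245454"))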

On Tue, Feb 11, 2020 at 7:45 AM  wrote:

>
> I am sorry to bring more problems to the table, but the indexing of
> lexemes in the "ordinary" Elasticsearch-based search is now also often
> slow. The Q-items are also indexed slowly, but there you can at least
> type the Q-number into the edit field and it will look up the item. For
> L-numbers, I have not found a way to type them in, and one would have to
> wait for some minutes before L-items are indexed.
>
> An example use case is the entry of "fordømme" and "dømme" where one
> links to the other by P5238, see
> https://www.wikidata.org/wiki/Lexeme:L245454. As is apparent from the
> edit histories, I waited over 10 minutes for the indexing before I could
> link the two lexemes.
>
>
> best regards
> Finn Årup Nielsen
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata Query Service update lag

2019-11-18 Thread Denny Vrandečić
I don't know if there is actually someone who would be capable and have the
time to do so, I just would hope there are such people - but it probably
makes sense to check if there are actually volunteers before doing work to
enable them :)

On Fri, Nov 15, 2019 at 5:17 AM Guillaume Lederrey 
wrote:

> On Fri, Nov 15, 2019 at 12:49 AM Denny Vrandečić 
> wrote:
>
>> Just wondering, is there a way to let volunteers look into the issue? (I
>> guess no because it would give potentially access to the query stream, but
>> maybe the answer is more optimistic)
>>
>
> There are ways, none of them easy. There are precedents for volunteers
> having access to our production environment. I'm not really sure what the
> process looks like. There is at least some NDA to sign and some vetting
> process. As you pointed out, this would give access to sensitive
> information, and to the ability to do great damage (power, responsibility
> and those kind of things).
>
> More realistically, we could provide more information for analysis. Heap
> dumps do contain private information, but thread dumps are pretty safe, so
> we could publish those. We would need to automate this on our side, but
> that might be an option. Of course, having access to limited information
> and no way to experiment on changes seriously limits the ability to
> investigate.
>
> I'll check with the team if that's something we are ready to invest in.
>
>
>> On Thu, Nov 14, 2019 at 2:39 PM Thad Guidry  wrote:
>>
>>> In the enterprise, most folks use either Java Mission Control, or just
>>> Java VisualVM profiler.  Seeing sleeping Threads is often good to start
>>> with, and just taking a snapshot or even Heap Dump when things are really
>>> grinding slow would be useful, you can later share those snapshots/heap
>>> dump with the community or Java profiling experts to analyze later.
>>>
>>> https://visualvm.github.io/index.html
>>>
>>> Thad
>>> https://www.linkedin.com/in/thadguidry/
>>>
>>>
>>> On Thu, Nov 14, 2019 at 1:46 PM Guillaume Lederrey <
>>> gleder...@wikimedia.org> wrote:
>>>
>>>> Hello!
>>>>
>>>> Thanks for the suggestions!
>>>>
>>>> On Thu, Nov 14, 2019 at 5:02 PM Thad Guidry 
>>>> wrote:
>>>>
>>>>> Is the Write Retention Queue adequate?
>>>>> Is the branching factor for the lexicon indices too large, resulting
>>>>> in a non-linear slowdown in the write rate over time?
>>>>> Did you look into Small Slot Optimization?
>>>>> Are the Write Cache Buffers adequate?
>>>>> Is there a lot of Heap pressure?
>>>>> Does the MemoryManager have the maximum amount of RAM it can handle?
>>>>> 4TB?
>>>>> Is the RWStore handling the recycling well?
>>>>> Is the SAIL Buffer Capacity adequate?
>>>>> Are you not using exact range counts where you could be using fast
>>>>> range counts?
>>>>>
>>>>>
>>>> Start at the hardware side first, however.
>>>>> Is the disk activity for writes really low... while CPU is very high?
>>>>> If so, you have identified a bottleneck; discover WHY that is the case by
>>>>> looking into any of the above.
>>>>>
>>>>
>>>> Sounds like good questions, but outside of my area of expertise. I've
>>>> created https://phabricator.wikimedia.org/T238362 to track it, and
>>>> I'll see if someone can have a look. I know that we did multiple passes at
>>>> tuning Blazegraph properties, with limited success so far.
>>>>
>>>>
>>>>> and a 100+ other things that should be looked at that all affect WRITE
>>>>> performance during UPDATES.
>>>>>
>>>>> https://wiki.blazegraph.com/wiki/index.php/IOOptimization
>>>>> https://wiki.blazegraph.com/wiki/index.php/PerformanceOptimization
>>>>>
>>>>> I would also suggest you start monitoring some of the internals of
>>>>> Blazegraph (JAVA) while in production with tools such as XRebel or
>>>>> AppDynamics.
>>>>>
>>>>
>>>> Both XRebel and AppDynamics are proprietary, so no way that we'll
>>>> deploy them in our environment. We are tracking a few JMX based metrics,
>>>> but so far, we don't really know what to look for.
>>>>
>>>> Thanks!
>>>>
>>>>   Guill

Re: [Wikidata] Wikidata Query Service update lag

2019-11-14 Thread Denny Vrandečić
Just wondering, is there a way to let volunteers look into the issue? (I
guess not, because it would potentially give access to the query stream, but
maybe the answer is more optimistic)

On Thu, Nov 14, 2019 at 2:39 PM Thad Guidry  wrote:

> In the enterprise, most folks use either Java Mission Control, or just
> Java VisualVM profiler.  Seeing sleeping Threads is often good to start
> with, and just taking a snapshot or even Heap Dump when things are really
> grinding slow would be useful, you can later share those snapshots/heap
> dump with the community or Java profiling experts to analyze later.
>
> https://visualvm.github.io/index.html
>
> Thad
> https://www.linkedin.com/in/thadguidry/
>
>
> On Thu, Nov 14, 2019 at 1:46 PM Guillaume Lederrey <
> gleder...@wikimedia.org> wrote:
>
>> Hello!
>>
>> Thanks for the suggestions!
>>
>> On Thu, Nov 14, 2019 at 5:02 PM Thad Guidry  wrote:
>>
>>> Is the Write Retention Queue adequate?
>>> Is the branching factor for the lexicon indices too large, resulting in
>>> a non-linear slowdown in the write rate over time?
>>> Did you look into Small Slot Optimization?
>>> Are the Write Cache Buffers adequate?
>>> Is there a lot of Heap pressure?
>>> Does the MemoryManager have the maximum amount of RAM it can handle?  4TB?
>>> Is the RWStore handling the recycling well?
>>> Is the SAIL Buffer Capacity adequate?
>>> Are you not using exact range counts where you could be using fast range
>>> counts?
>>>
>>>
>> Start at the hardware side first, however.
>>> Is the disk activity for writes really low... while CPU is very high?  If
>>> so, you have identified a bottleneck; discover WHY that is the case by
>>> looking into any of the above.
>>>
>>
>> Sounds like good questions, but outside of my area of expertise. I've
>> created https://phabricator.wikimedia.org/T238362 to track it, and I'll
>> see if someone can have a look. I know that we did multiple passes at
>> tuning Blazegraph properties, with limited success so far.
>>
>>
>>> and a 100+ other things that should be looked at that all affect WRITE
>>> performance during UPDATES.
>>>
>>> https://wiki.blazegraph.com/wiki/index.php/IOOptimization
>>> https://wiki.blazegraph.com/wiki/index.php/PerformanceOptimization
>>>
>>> I would also suggest you start monitoring some of the internals of
>>> Blazegraph (JAVA) while in production with tools such as XRebel or
>>> AppDynamics.
>>>
>>
>> Both XRebel and AppDynamics are proprietary, so no way that we'll deploy
>> them in our environment. We are tracking a few JMX based metrics, but so
>> far, we don't really know what to look for.
>>
>> Thanks!
>>
>>   Guillaume
>>
>> Thad
>>> https://www.linkedin.com/in/thadguidry/
>>>
>>>
>>> On Thu, Nov 14, 2019 at 7:31 AM Guillaume Lederrey <
>>> gleder...@wikimedia.org> wrote:
>>>
 Thanks for the feedback!

 On Thu, Nov 14, 2019 at 11:11 AM  wrote:

>
> Besides waiting for the new updater, it may be useful to tell us, what
> we as users can do too. It is unclear to me what the problem is. For
> instance, at one point I was worried that the many parallel requests to
> the SPARQL endpoint that we make in Scholia are a problem. As far as I
> understand it is not a problem at all. Another issue could be the way
> that we use Magnus Manske's Quickstatements and approve bots for high
> frequency editing. Perhaps a better overview and constraints on
> large-scale editing could be discussed?
>

 To be (again) completely honest, we don't entirely understand the issue
 either. There are clearly multiple related issues. In high level terms, we
 have at least:

 * Some part of the update process on Blazegraph is CPU bound and single
 threaded. Even with low query load, if we have a high edit rate, Blazegraph
 can't keep up, and saturates a single CPU (with plenty of available
 resources on other CPUs). This is a hard issue to fix, requiring either
 splitting the processing over multiple CPUs or sharding the data over
 multiple servers. Neither of which Blazegraph supports (at least not in our
 current configuration).
 * There is a race for resources between edits and queries: a high query
 load will impact the update rate. This could to some extent be mitigated by
 reducing the query load: if no one is using the service, it works great!
 Obviously that's not much of a solution.

 What you can do (short term):

 * Keep bot usage well behaved (don't do parallel queries, provide a
 meaningful user agent, smooth the load over time if possible, ...). As far
 as I can see, most usage is already well behaved.
 * Optimize your queries: better queries will use less resources, which
 should help. Time to completion is a good approximation of the resources
 used. I don't really have any more specific advice, SPARQL is not my area
 of expertise.

 What you can do (longer term):

 
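
As a small illustration of the "well behaved" usage Guillaume describes above,
here is a sketch of a query client that identifies itself, runs one query at a
time, and pauses between requests (the agent string and contact address are
just examples):

import time
import requests

# Sketch of well-behaved WDQS usage: meaningful User-Agent, no parallel
# queries, and a pause between requests to smooth the load over time.
ENDPOINT = "https://query.wikidata.org/sparql"
HEADERS = {
    "Accept": "application/sparql-results+json",
    "User-Agent": "ExampleScholiaHelper/0.1 (mailto:someone@example.org)",
}

def run_query(query, pause=1.0):
    resp = requests.get(ENDPOINT, params={"query": query},
                        headers=HEADERS, timeout=120)
    resp.raise_for_status()
    time.sleep(pause)  # back off between requests instead of firing in parallel
    return resp.json()["results"]["bindings"]

if __name__ == "__main__":
    print(run_query("SELECT ?cat WHERE { ?cat wdt:P31 wd:Q146 } LIMIT 5"))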

[Wikidata] Comparison of Wikidata, DBpedia, and Freebase (draft and invitation)

2019-09-30 Thread Denny Vrandečić
Hi all,

as promised, now that I am back from my trip, here's my draft of the
comparison of Wikidata, DBpedia, and Freebase.

It is a draft, it is obviously potentially biased given my background,
etc., but I hope that we can work on it together to get it into a good
shape.

Markus, amusingly I took pretty much the same example that you went for,
the parent predicate. So yes, I was also surprised by the results, and
would love to have Sebastian or Kingsley look into it and see if I
conducted it fairly.

SJ, Andra, thanks for offering to take a look. I am sure you all can
contribute your own unique background and make suggestions on how to
improve things and whether the results ring true.

Marco, I totally agree with what you said - the project has stalled, and
there is plenty of opportunity to harvest more data from Freebase and bring
it to Wikidata, and this should be reignited. Sebastian, I also agree with
you, and the numbers back this up; the same is true for the extraction
results from DBpedia.

Sebastian, Kingsley, I tried to describe how I understand DBpedia, and all
steps should be reproducible. As it seems that the two of you also have to
discuss one or the other thing about DBpedia's identity, I am relieved that
my confusion is not entirely unjustified. So I tried to use both the last
stable DBpedia release as well as a new-style DBpedia fusion dataset for
the comparison. But I might have gotten the whole procedure wrong. I am
happy to be corrected.

On Sat, Sep 28, 2019 at 12:28 AM  wrote:

> > Meanwhile, Google crawls all the references and extracts facts from there.
> > We don't have that available, but there is Linked Open Data.

Potentially, not a bad idea, but we don't do that.

Everyone, this is the first time I share a Colab notebook, and I have no
idea if I did it right. So any feedback of the form "oh you didn't switch
on that bit over here" or "yes, this works, thank you" is very welcome,
because I have no clue what I am doing :) Also, I never did this kind of
analysis so transparently, which is kinda both totally cool and rather
scary, because now you can all see how dumb I am :)

So everyone is invited to send Pull Requests (I guess that's how this
works?), and I would love for us to create a result together that we agree
on. I see the result of this exercise to be potentially twofold:

1) a publication we can point people to who ask about the differences
between Wikidata, DBpedia, and Freebase

2) to reignite or start projects and processes to reduce these differences

So, here is the link to my Colab notebook:

https://github.com/vrandezo/colabs/blob/master/Comparing_coverage_and_accuracy_of_DBpedia%2C_Freebase%2C_and_Wikidata_for_the_parent_predicate.ipynb
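
(For a taste of one of the coverage numbers without running the whole notebook:
the sketch below simply counts Wikidata's father (P22) and mother (P25)
statements via WDQS. This is not the notebook's method, which works from dumps,
and a full count like this may well hit the public endpoint's timeout.)

import requests

# Count father (P22) and mother (P25) statements in Wikidata -- one of the
# coverage figures discussed in the comparison. Sketch only; may time out.
QUERY = """
SELECT ?prop (COUNT(*) AS ?statements) WHERE {
  VALUES ?prop { wdt:P22 wdt:P25 }
  ?child ?prop ?parent .
}
GROUP BY ?prop
"""
resp = requests.get("https://query.wikidata.org/sparql",
                    params={"query": QUERY},
                    headers={"Accept": "application/sparql-results+json",
                             "User-Agent": "parent-coverage-sketch/0.1 (example)"},
                    timeout=300)
resp.raise_for_status()
for b in resp.json()["results"]["bindings"]:
    print(b["prop"]["value"], b["statements"]["value"])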

Ideally, the third goal could be to get to a deeper understanding of how
these three projects relate to each other - in my point of view, Freebase
is dead and outdated, Wikidata is the core knowledge base that anyone can
edit, and DBpedia is the core project to weave value-adding workflows on
top of Wikidata or other datasets from the linked open data cloud together.
But that's just a proposal.

Cheers,
Denny



On Sat, Sep 28, 2019 at 12:28 AM  wrote:

> Hi Gerard,
>
> I was not trying to judge here. I was just saying that it wasn't much data
> in the end.
> For me Freebase was basically cherry-picked.
>
> Meanwhile, the data we extract is more pertinent to the goal of having
> Wikidata cover the info boxes. We still have ~ 500 million statements left.
> But none of it is used yet. Hopefully we can change that.
>
> Meanwhile, Google crawls all the references and extracts facts from there.
> We don't have that available, but there is Linked Open Data.
>
> --
> Sebastian
>
> On September 27, 2019 5:26:43 PM GMT+02:00, Gerard Meijssen <
> gerard.meijs...@gmail.com> wrote:
>>
>> Hoi,
>> I totally reject the assertion that it was so bad. I have always had the opinion
>> that the main issue was an atrocious user interface. Add to this the people
>> that have Wikipedia notions about quality. They have and had a detrimental
>> effect on both the quantity and quality of Wikidata.
>>
>> When you add the functionality that is being built by the data wranglers
>> at DBpedia, it becomes easy/easier to compare the data from Wikipedias with
>> Wikidata (and why not Freebase), add what has consensus and curate the
>> differences. This will enable a true sense of data quality and allow us to
>> provide a much improved service.
>> Thanks,
>>   GerardM
>>
>> On Fri, 27 Sep 2019 at 15:54, Marco Fossati 
>> wrote:
>>
>>> Hey Sebastian,
>>>
>>> On 9/20/19 10:22 AM, Sebastian Hellmann wrote:
>>> > Not much of Freebase did end up in Wikidata.
>>>
>>> Dropping here some pointers to shed light on the migration of Freebase
>>> to Wikidata, since I was partially involved in the process:
>>> 1. WikiProject [1];
>>> 2. the paper behind [2];
>>> 3. datasets to be migrated [3].
>>>
>>> I can confirm that the migration has stalled: as of today, *528
>>> thousands* Free

Re: [Wikidata] Personal news: a new role

2019-09-20 Thread Denny Vrandečić
Thanks everyone for this warm welcome (back)!



On Fri, Sep 20, 2019, 10:38 Denny Vrandečić  wrote:

> Off to my Todo list :)
>
> On Thu, Sep 19, 2019 at 10:46 AM Andy Mabbett 
> wrote:
>
>> On Thu, 19 Sep 2019 at 17:56, Denny Vrandečić 
>> wrote:
>>
>> > I am moving to a new role in Google Research, akin to a Wikimedian in
>> > Residence
>>
>> That's marvelous; congratulations.
>>
>> Please bear in mind this project:
>>
>>https://commons.wikimedia.org/wiki/Commons:Voice_intro_project
>>
>> and the Googlers who have Wikipedia articles about them (or, indeed,
>> Wikidata items).
>>
>> --
>> Andy Mabbett
>> @pigsonthewing
>> http://pigsonthewing.org.uk
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Denny Vrandečić
I would love your input! I will send the link here, and any contribution
will be welcome :)

Thank you!

On Fri, Sep 20, 2019 at 11:05 AM Samuel Klein  wrote:

> I'm also interested in this comparison and intersection, and glad to share
> perspective + help.  Warmly, SJ
>
> On Fri, Sep 20, 2019 at 1:32 PM Denny Vrandečić 
> wrote:
>
>> Yes, you're touching exactly on the problems I had during the evaluation
>> - I couldn't even figure out what DBpedia is. Thanks, your help will be
>> very much appreciated.
>>
>> OK, I will send a link the week after the next, and then we can start
>> working on it :) I am very much looking forward to it.
>>
>> On Fri, Sep 20, 2019 at 10:11 AM Sebastian Hellmann <
>> hellm...@informatik.uni-leipzig.de> wrote:
>>
>>> Na, I am quite open, albeit impulsive. The information given was quite
>>> good and some of my concerns regarding the involvement of Google were also
>>> lifted or relativized. Mainly due to the fact that there seems to be a
>>> sense of awareness.
>>>
>>> I am just studying  economic principles, which are very powerful. I also
>>> have the feeling that free and open stuff just got a lot more commercial
>>> and I am still struggling with myself whether this is good or not. Also
>>> whether DBpedia should become frenemies with BigTech. Or funny things like
>>> many funding agencies try to push for national sustainability options, but
>>> most of the time, they suggest to use the GitHub Platform. Wikibase could
>>> be an option here.
>>>
>>> I have to apologize for the Knowledge Graph Talk thing. I was a bit
>>> grumpy, because I thought I wasted a lot of time on the Talk page that
>>> could have been invested in making the article better (WP:BE_BOLD style),
>>> but now I think, it might have been my own mistake. So apologies for
>>> lashing out there.
>>>
>>> (see comments below)
>>> On 20.09.19 17:53, Denny Vrandečić wrote:
>>>
>>> Sebastian,
>>>
>>> "I don't want to facilitate conspiracy theories, but ..."
>>> "[I am] interested in what is the truth behind the truth"
>>>
>>> I am sorry, I truly am, but this *is* the language I know from
>>> conspiracy theorists. And given that, I cannot imagine that there is
>>> anything I can say that could convince you otherwise. Therefore there is no
>>> real point for me in engaging with this conversation on these terms, as I
>>> cannot see how it would turn constructive.
>>>
>>> The answers to many of your questions are public and on the record.
>>> Others tried to point you to them (thanks), but you dismiss them as not
>>> fitting your narrative.
>>>
>>> So here's a suggestion, which I think might be much more constructive
>>> and forward-looking:
>>>
>>> I have been working on a comparison of DBpedia, Wikidata, and Freebase
>>> (and since you've read my thesis, you know that's a thing I know a bit
>>> about). Simple evaluation, coverage, correctness, nothing dramatically
>>> fancy. But I am torn about publishing it, because, d'oh, people may (with
>>> good reasons) dismiss it as being biased. And truth be told - the simple
>>> fact that I don't know DBpedia as well as I know Wikidata and Freebase
>>> might indeed have led to errors, mistakes, and stuff I missed in the
>>> evaluation. But you know what would help?
>>>
>>> You.
>>>
>>> My suggestion is that I publish my current draft, and then you and I
>>> work together on it, publicly, in the open, until we reach a state we
>>> both consider correct enough for publication.
>>>
>>> What do you think?
>>>
>>> Sure, we are doing statistics at the moment as well. It is a bit hard to
>>> define what DBpedia is nowadays as we are rebranding the remixed datasets,
>>> now that we can pick up links and other data from the Databus. It might not
>>> even be a real dataset anymore, but glue between datasets focusing on the
>>> speed of integration and ease of quality improvement. Also still working on
>>> the concrete Sync Targets for GlobalFactSync (
>>> https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSyncRE)
>>> as well.
>>>
>>> One question I have is whether Wikidata is effective/efficient or where
>>> it is effective and where it could use improvement as a chance for
>>> collaboration.
>>>

Re: [Wikidata] Personal news: a new role

2019-09-20 Thread Denny Vrandečić
Off to my Todo list :)

On Thu, Sep 19, 2019 at 10:46 AM Andy Mabbett 
wrote:

> On Thu, 19 Sep 2019 at 17:56, Denny Vrandečić 
> wrote:
>
> > I am moving to a new role in Google Research, akin to a Wikimedian in
> > Residence
>
> That's marvelous; congratulations.
>
> Please bear in mind this project:
>
>https://commons.wikimedia.org/wiki/Commons:Voice_intro_project
>
> and the Googlers who have Wikipedia articles about them (or, indeed,
> Wikidata items).
>
> --
> Andy Mabbett
> @pigsonthewing
> http://pigsonthewing.org.uk
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Personal news: a new role

2019-09-20 Thread Denny Vrandečić
I wouldn't know if this is the first time, particularly because I am not
sure how to describe it. Google is contributing a lot to all kinds of
efforts, many of them open, as you know, so, in short, it depends on your
definition :)

My position is located in Google AI (i.e. Google Research), not the Open
Source office. But I plan to publish more code going forward, so I will
continue to work with them.


On Thu, Sep 19, 2019 at 10:35 AM Federico Leva (Nemo) 
wrote:

> Denny Vrandečić, 19/09/19 19:56:
> > I had used my 20% time to support such teams. The requests became more
> > frequent, and now I am moving to a new role in Google Research, akin to
> > a Wikimedian in Residence
>
> That's very interesting! Is it the first free culture project for which
> something of the like happens? From what you write, I understand it will
> be something separate from the Google Open Source office, right?
>
> Federico
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Denny Vrandečić
Yes, you're touching exactly on the problems I had during the evaluation -
I couldn't even figure out what DBpedia is. Thanks, your help will be
very much appreciated.

OK, I will send a link the week after the next, and then we can start
working on it :) I am very much looking forward to it.

On Fri, Sep 20, 2019 at 10:11 AM Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de> wrote:

> Na, I am quite open, albeit impulsive. The information given was quite
> good and some of my concerns regarding the involvement of Google were also
> lifted or relativized. Mainly due to the fact that there seems to be a
> sense of awareness.
>
> I am just studying  economic principles, which are very powerful. I also
> have the feeling that free and open stuff just got a lot more commercial
> and I am still struggling with myself whether this is good or not. Also
> whether DBpedia should become frenemies with BigTech. Or funny things like
> many funding agencies try to push for national sustainability options, but
> most of the time, they suggest to use the GitHub Platform. Wikibase could
> be an option here.
>
> I have to apologize for the Knowledge Graph Talk thing. I was a bit
> grumpy, because I thought I wasted a lot of time on the Talk page that
> could have been invested in making the article better (WP:BE_BOLD style),
> but now I think, it might have been my own mistake. So apologies for
> lashing out there.
>
> (see comments below)
> On 20.09.19 17:53, Denny Vrandečić wrote:
>
> Sebastian,
>
> "I don't want to facilitate conspiracy theories, but ..."
> "[I am] interested in what is the truth behind the truth"
>
> I am sorry, I truly am, but this *is* the language I know from conspiracy
> theorists. And given that, I cannot imagine that there is anything I can
> say that could convince you otherwise. Therefore there is no real point for
> me in engaging with this conversation on these terms, as I cannot see how it
> would turn constructive.
>
> The answers to many of your questions are public and on the record. Others
> tried to point you to them (thanks), but you dismiss them as not fitting
> your narrative.
>
> So here's a suggestion, which I think might be much more constructive and
> forward-looking:
>
> I have been working on a comparison of DBpedia, Wikidata, and Freebase
> (and since you've read my thesis, you know that's a thing I know a bit
> about). Simple evaluation, coverage, correctness, nothing dramatically
> fancy. But I am torn about publishing it, because, d'oh, people may (with
> good reasons) dismiss it as being biased. And truth be told - the simple
> fact that I don't know DBpedia as well as I know Wikidata and Freebase
> might indeed have led to errors, mistakes, and stuff I missed in the
> evaluation. But you know what would help?
>
> You.
>
> My suggestion is that I publish my current draft, and then you and I work
> together on it, publicly, in the open, until we reach a state we both
> consider correct enough for publication.
>
> What do you think?
>
> Sure, we are doing statistics at the moment as well. It is a bit hard to
> define what DBpedia is nowadays as we are rebranding the remixed datasets,
> now that we can pick up links and other data from the Databus. It might not
> even be a real dataset anymore, but glue between datasets focusing on the
> speed of integration and ease of quality improvement. Also still working on
> the concrete Sync Targets for GlobalFactSync (
> https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSyncRE)
> as well.
>
> One question I have is whether Wikidata is effective/efficient or where it
> is effective and where it could use improvement as a chance for
> collaboration.
>
> So yes any time.
>
> -- Sebastian
>
>
> Cheers,
> Denny
>
> P.S.: I am travelling the next week, so I may ask for patience
>
>
> On Fri, Sep 20, 2019 at 8:11 AM Thad Guidry  wrote:
>
>> Thank you for sharing your opinions, Sebastian.
>>
>> Cheers,
>> Thad
>> https://www.linkedin.com/in/thadguidry/
>>
>>
>> On Fri, Sep 20, 2019 at 9:43 AM Sebastian Hellmann <
>> hellm...@informatik.uni-leipzig.de> wrote:
>>
>>> Hi Thad,
>>> On 20.09.19 15:28, Thad Guidry wrote:
>>>
>>> With my tech evangelist hat on...
>>>
>>> Google's philanthropy is nearly boundless when it comes to the promotion
>>> of knowledge.  Why? Because indeed it's in their best interest; otherwise no
>>> one can prosper without knowledge.  They aggregate knowledge for the
>>> benefit of mankind, and then make a profit through advertising

Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Denny Vrandečić
Sebastian,

"I don't want to facilitate conspiracy theories, but ..."
"[I am] interested in what is the truth behind the truth"

I am sorry, I truly am, but this *is* the language I know from conspiracy
theorists. And given that, I cannot imagine that there is anything I can
say that could convince you otherwise. Therefore there is no real point for
me in engaging with this conversation on these terms, as I cannot see how it
would turn constructive.

The answers to many of your questions are public and on the record. Others
tried to point you to them (thanks), but you dismiss them as not fitting
your narrative.

So here's a suggestion, which I think might be much more constructive and
forward-looking:

I have been working on a comparison of DBpedia, Wikidata, and Freebase (and
since you've read my thesis, you know that's a thing I know a bit about).
Simple evaluation, coverage, correctness, nothing dramatically fancy. But I
am torn about publishing it, because, d'oh, people may (with good reasons)
dismiss it as being biased. And truth be told - the simple fact that I
don't know DBpedia as well as I know Wikidata and Freebase might indeed
have lead to errors, mistakes, and stuff I missed in the evaluation. But
you know what would help?

You.

My suggestion is that I publish my current draft, and then you and I work
together on it, publicly, in the open, until we reach a state we both
consider correct enough for publication.

What do you think?

Cheers,
Denny

P.S.: I am travelling the next week, so I may ask for patience


On Fri, Sep 20, 2019 at 8:11 AM Thad Guidry  wrote:

> Thank you for sharing your opinions, Sebastian.
>
> Cheers,
> Thad
> https://www.linkedin.com/in/thadguidry/
>
>
> On Fri, Sep 20, 2019 at 9:43 AM Sebastian Hellmann <
> hellm...@informatik.uni-leipzig.de> wrote:
>
>> Hi Thad,
>> On 20.09.19 15:28, Thad Guidry wrote:
>>
>> With my tech evangelist hat on...
>>
>> Google's philanthropy is nearly boundless when it comes to the promotion
>> of knowledge.  Why? Because indeed it's in their best interest; otherwise no
>> one can prosper without knowledge.  They aggregate knowledge for the
>> benefit of mankind, and then make a profit through advertising ... all
>> while making that knowledge extremely easy to be found for the world.
>>
>>
>> I am neither pro-Google nor anti-Google per se. Maybe skeptical and
>> interested in what is the truth behind the truth. Google is not a synonym for
>> philanthropy. Wikimedia is or at least I think they are doing many things
>> right. Google is a platform, so primarily they "aggregate knowledge for
>> their benefit" while creating enough incentives in form of accessibility
>> for users to add the user's knowledge to theirs. It is not about what
>> Google offers, but what it takes in return. 20% of employees' time is also
>> an investment in the skill of the employee, a Google asset called Human
>> Capital and also leads to me and Denny from Google discussing whether
>> https://en.wikipedia.org/wiki/Talk:Knowledge_Graph is content marketing
>> or knowledge (@Denny: no offense, legit arguments, but no agenda to resolve
>> the stalled discussion there). Except I don't have 20% time to straighten
>> the view into what I believe would be neutral, so pushing it becomes a
>> resource issue.
>>
>> I found the other replies much more realistic and the perspective is yet
>> unclear. Maybe Mozilla wasn't so much frenemy with Google and got removed
>> from the browser market for it. I am also thinking about Linked Open Data.
>> Decentralisation is quite weak, individually. I guess spreading all the
>> Wikibases around to super-nodes is helpful unless it prevents the formation
>> of a stronger lobby of philanthropists or competition to BigTech. Wikidata
>> created some pressure on DBpedia as well (also opportunities), but we are
>> fine since we can simply innovate. Others might not withstand. Microsoft
>> seems to favor OpenStreetMaps so I am just asking to which degree Open
>> Source and Open Data is being instrumentalised by BigTech.
>>
>> Hence my question, whether it is compromise or be removed. (Note that
>> states are also platforms, which measure value in GDP and make laws and
>> roads and take VAT on transactions. Sometimes, they even don't remove
>> opposition.)
>>
>> --
>> All the best,
>> Sebastian Hellmann
>>
>> Director of Knowledge Integration and Linked Data Technologies (KILT)
>> Competence Center
>> at the Institute for Applied Informatics (InfAI) at Leipzig University
>> Executive Director of the DBpedia Association
>> Projects: http://dbpedia.org, http://nlp2rdf.org,
>> http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
>> 
>> Homepage: http://aksw.org/SebastianHellmann
>> Research Group: http://aksw.org
>>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>

Re: [Wikidata] Dead or alive ? Probably dead

2019-09-19 Thread Denny Vrandečić
I think if we wanted to do this with a bot, we should go through the usual
bot approval process, and discuss this on wiki?

But in general, as said, adding "unknown value" to people who are almost
certainly dead sounds like a good idea (
https://www.wikidata.org/wiki/Q28 )
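
For the on-wiki discussion, it would help to first list candidates. A rough
sketch (assuming the criterion from this thread: born more than 122 years ago,
no date-of-death statement at all; in practice the query may need to be split
up to stay under the WDQS timeout, and typo'd birth dates checked by hand):

import requests

# Humans born before 1897 (122+ years ago as of 2019) with no P570 statement
# of any kind (p:P570 also covers unknown-value and no-value snaks).
QUERY = """
SELECT ?person ?birth WHERE {
  ?person wdt:P31 wd:Q5 ;
          wdt:P569 ?birth .
  FILTER(?birth < "1897-01-01T00:00:00Z"^^xsd:dateTime)
  FILTER NOT EXISTS { ?person p:P570 ?deathStatement . }
}
LIMIT 100
"""
resp = requests.get("https://query.wikidata.org/sparql",
                    params={"query": QUERY},
                    headers={"Accept": "application/sparql-results+json",
                             "User-Agent": "presumed-dead-sketch/0.1 (example)"},
                    timeout=300)
resp.raise_for_status()
for b in resp.json()["results"]["bindings"]:
    print(b["person"]["value"], b["birth"]["value"])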

On Thu, Sep 19, 2019 at 3:25 PM Olaf Simons 
wrote:

> On FactGrid we created two properties for this (maybe clever, maybe daft):
> P290 and P291 for estimates (or for knowledge) of an earliest and latest
> point in the life span. The necessity is here that we have loads of people
> with just a single data point like "studied in Jena in 1776" or "appeared
> on a list of voters in 1849". If that is all you know, you do actually know
> that the person is likely to have a birth date some 17 (or in the voters
> case at least 21) years before.
>
> If a person is only once mentioned as retired, that stretches the P290 date
> to some 60 years before and so on - you qualify the estimate accordingly.
>
> I have no idea whether this is a good move on our site since we are not
> really that advanced in running the more intriguing SPARQL searches.
>
> Olaf
>
>
>
>
> > Fabrizio Carrai  wrote on 19 September 2019
> > at 22:13:
> >
> >
> > So, the question is if it would be fine and ethical to set the "Date of
> > death" to "unknown" on the basis of an old date of birth.
> > And about the biography of living persons, I found this [1]
> >
> > Deceased persons, corporations, or groups of persons: Recently dead or
> > probably dead
> > Anyone born within the past 115 years (on or after 19 September 1904) is
> > covered by this policy unless a reliable source has confirmed their death.
> > Generally, this policy does not apply to material concerning people who are
> > confirmed dead by reliable sources. The only exception would be for people
> > who have recently died, in which case the policy can extend for an
> > indeterminate period beyond the date of death—six months, one year, two
> > years at the outside. Such extensions would apply particularly to
> > contentious or questionable material about the dead that has implications
> > for their living relatives and friends, such as in the case of a possible
> > suicide or a particularly gruesome crime. *Even absent confirmation of
> > death, for the purposes of this policy anyone born more than 115 years ago
> > is presumed dead* *unless* reliable sources confirm the person to have been
> > living within the past two years. If the date of birth is unknown, editors
> > should use reasonable judgement to infer—from dates of events noted in the
> > article—if it is plausible that the person was born within the last 115
> > years and is therefore covered by this policy.
> >
> > This would support setting "Date of death" to "unknown" on the basis of
> > the "Date of birth". It remains hard to catch typo errors, but we are
> > doing our best to verify the data of the several wikiprojects.
> >
> > Setting the property would be effective if done en masse by a bot or
> similar.
> >
> > By the way, I would extend the period to 122 years [2]
> >
> > FabC
> >
> > [1]
> >
> https://en.wikipedia.org/wiki/Wikipedia:Biographies_of_living_persons#Deceased_persons,_corporations,_or_groups_of_persons
> > [2] https://en.wikipedia.org/wiki/Oldest_people
> >
> > Il giorno gio 19 set 2019 alle ore 21:29 Andy Mabbett <
> > a...@pigsonthewing.org.uk> ha scritto:
> >
> > > On Sat, 7 Sep 2019 at 07:53, Fabrizio Carrai <
> fabrizio.car...@gmail.com>
> > > wrote:
> > >
> > > > I found athletes with the "Date of birth" but with NO "date of death".
> > > > So a query on the age show me athletes up to 149 years old.
> > > > Since the oldest know person was 122, what about to set "date of
> > > > death = unknown value" for all the persons resulting older such age ?
> > >
> > > Yes, but check that the date of birth isn't a typo (i.e. 1875 instead
> > > of 1975; or 1894 instead of 1984).
> > >
> > > Showing a living person as being dead would be a serious breach of the
> > > BLP policy.
> > >
> > > --
> > > Andy Mabbett
> > > @pigsonthewing
> > > http://pigsonthewing.org.uk
> > >
> > > ___
> > > Wikidata mailing list
> > > Wikidata@lists.wikimedia.org
> > > https://lists.wikimedia.org/mailman/listinfo/wikidata
> > >
> >
> >
> > --
> > *Fabrizio*
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
>
> Dr. Olaf Simons
> Forschungszentrum Gotha der Universität Erfurt
> Schloss Friedenstein, Pagenhaus
> 99867 Gotha
>
> Büro: +49-361-737-1722 <+49%20361%207371722>
> Mobil: +49-179-5196880 <+49%20179%205196880>
>
> Privat: Hauptmarkt 17b/ 99867 Gotha
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata

Re: [Wikidata] Dead or alive ? Probably dead

2019-09-19 Thread Denny Vrandečić
"unknown value" was made for exactly that use case - a person that has
died, but we don't know when.

I would just add that on the "date of death" property.
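
At the API level that is a single wbcreateclaim call with snaktype=somevalue.
A sketch against the sandbox item, purely to illustrate; a real run would use a
logged-in bot account, an edit summary, maxlag, and bot approval:

import requests

# Add an "unknown value" date of death (P570) to the Wikidata sandbox item
# Q4115189 via wbcreateclaim with snaktype=somevalue. Illustration only.
API = "https://www.wikidata.org/w/api.php"
session = requests.Session()
session.headers["User-Agent"] = "somevalue-sketch/0.1 (example)"

token = session.get(API, params={
    "action": "query", "meta": "tokens", "type": "csrf", "format": "json",
}).json()["query"]["tokens"]["csrftoken"]

resp = session.post(API, data={
    "action": "wbcreateclaim",
    "entity": "Q4115189",     # Wikidata sandbox item
    "property": "P570",       # date of death
    "snaktype": "somevalue",  # i.e. "unknown value"
    "token": token,
    "format": "json",
})
print(resp.json())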

On Sat, Sep 7, 2019 at 2:25 AM Thomas Douillard 
wrote:

> We already have a qualifier for this kind of thing, I think: P887
>  this is a bit of a corner
> case because it’s not the value that is computed here but the existence of
> a value, but I think it will do. We just need an item for this, something
> such as « most likely dead because born long before » should do the trick.
>
> On Sat, 7 Sep 2019 at 09:14, Federico Leva (Nemo)  wrote:
>
>> Fabrizio Carrai, 07/09/19 09:53:
>> > Since the oldest know person was 122, what about to set "date of death
>> =
>> > unknown value" for all the persons resulting older such age ?
>>
>> It seems to me a sensible thing to do. It's good you asked because it's
>> better to avoid the risk of conflicting mass changes.
>>
>> I wonder if we need a qualifier to allow identifying this as an
>> inferred piece of data: do people sometimes state "unknown value" when
>> someone is known to be dead, but we don't know when they died? I would
>> place a date of death with a precision of a decade or century in such a
>> case, but I've not checked what's the frequency of such qualifiers yet.
>>
>> Federico
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Personal news: a new role

2019-09-19 Thread Denny Vrandečić
Hello all,

Over the last few years, more and more research teams all around the world
have started to use Wikidata. Wikidata is becoming a fundamental resource
[1]. That is also true for research at Google. One advantage of using
Wikidata as a research resource is that it is available to everyone.
Results can be reproduced and validated externally. Yay!

I had used my 20% time to support such teams. The requests became more
frequent, and now I am moving to a new role in Google Research, akin to a
Wikimedian in Residence [2]: my role is to promote understanding of the
Wikimedia projects within Google, work with Googlers to share more
resources with the Wikimedia communities, and to facilitate the improvement
of Wikimedia content by the Wikimedia communities, all with a strong focus
on Wikidata.

One deeply satisfying thing for me is that the goals of my new role and the
goals of the communities are so well aligned: it is really about improving
the coverage and quality of the content, and about pushing the projects
closer towards letting everyone share in the sum of all knowledge.

Expect to see more from me again - there are already a number of fun ideas
in the pipeline, and I am looking forward to seeing them get out of the gate!
I am looking forward to hearing your ideas and suggestions, and to continue
contributing to the Wikimedia goals.

Cheers,
Denny

P.S.: Which also means, incidentally, that my 20% time is opening for new
shenanigans [3].

[1] https://www.semanticscholar.org/search?q=wikidata&sort=relevance
[2] https://meta.wikimedia.org/wiki/Wikimedian_in_residence
[3] https://wikipedia20.pubpub.org/pub/vyf7ksah
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] "Collaborating on the sum of all knowledge across languages"

2019-09-06 Thread Denny Vrandečić
Hi Gerard,

thank you for your comments. I agree with them - the generated text
shouldn't be stored in the local Wikipedias, but merely cached. I have
updated the text accordingly to make it explicit.

Thanks!
Cheers,
Denny


On Sat, Jul 6, 2019 at 11:09 PM Gerard Meijssen 
wrote:

> Hoi,
> I do not have a profile there and I am nowadays reluctant to add profiles.
>
> Having said that. The data that will be in an abstract Wikipedia allows
> for the generation of text in any language. What it does is enable the
> generation of texts on the fly to be used as a complement to any Wikipedia.
> Best is not to save generated texts but to cache them. We could even seek
> out the routines and data used to generate articles for, for instance, the
> Cebuano Wikipedia, remove the completely generated articles and keep the
> same level of service. In this way it is for Wikipedians to write text and
> have generated text as a basis to improve upon.
>
> Hidden in Reasonator there is the "concept cloud". This is just one
> example [1]. It shows the items that are common in the Wikipedias for an
> item. With an abstract Wikipedia we can compare and indicate what articles
> do not have links to concepts that are known to be good. There may be all
> kinds of reasons for that but it is easy to find the wrong reasons when you
> compare it with what others include. You will find the problematic, the
> erroneous and the bigoted links. When we do, we could override the delivery
> of that article with a choice to the "abstract" for comparison.
> Thanks,
>GerardM
>
>
>
> [1]
> https://tools.wmflabs.org/wikidata-todo/cloudy_concept.php?q=Q434706&lang=en
>
> On Sat, 6 Jul 2019 at 23:53, Denny Vrandečić  wrote:
>
>> Hi all!
>>
>> I really try not to spam the chat too much with pointers to my work on
>> the Abstract Wikipedia, but this one is probably also interesting for
>> Wikidata contributors. It is the draft for a chapter submitted to Koerner
>> and Reagle's Wikipedia@20 book, and talks about knowledge diversity
>> under the light of centralisation through projects such as Wikidata.
>>
>> Public commenting phase is open until July 19, and very welcome:
>> "Collaborating on the sum of all knowledge across languages"
>>
>> About the book: https://meta.wikimedia.org/wiki/Wikipedia@20
>> Link to chapter: https://wikipedia20.pubpub.org/pub/vyf7ksah
>>
>> Cheers,
>> Denny
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Virtuoso hosted Wikidata Instance

2019-08-13 Thread Denny Vrandečić
That is really cool! Thanks and congratulations! I will certainly play with
it.

Is it in some way synced or is it a static snapshot?
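
(One quick way to probe that, assuming the instance was loaded from the full
dumps, which carry a schema:dateModified timestamp per entity: compare the
timestamp both endpoints report for a frequently edited item.)

import requests

# Compare the last-modified timestamp both endpoints report for Q42; a static
# snapshot will lag well behind WDQS. Assumes schema:dateModified was loaded.
QUERY = """
PREFIX schema: <http://schema.org/>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?modified WHERE { wd:Q42 schema:dateModified ?modified }
"""
ENDPOINTS = {
    "WDQS": "https://query.wikidata.org/sparql",
    "Virtuoso demo": "http://wikidata.demo.openlinksw.com/sparql",
}
for name, url in ENDPOINTS.items():
    r = requests.get(url, params={"query": QUERY},
                     headers={"Accept": "application/sparql-results+json",
                              "User-Agent": "freshness-probe/0.1 (example)"},
                     timeout=120)
    r.raise_for_status()
    rows = r.json()["results"]["bindings"]
    print(name, rows[0]["modified"]["value"] if rows else "no result")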

On Tue, Aug 13, 2019 at 4:10 PM Kingsley Idehen 
wrote:

> Hi Everyone,
>
> A little FYI.
>
> We have loaded Wikidata into a Virtuoso instance accessible via SPARQL
> [1]. One benefit is helping to understand Wikidata using our Faceted
> Browsing Interface for Entity Relationship Types [2][3].
>
> Links:
>
> [1] http://wikidata.demo.openlinksw.com/sparql -- SPARQL endpoint
>
> [2] http://wikidata.demo.openlinksw.com/fct -- Faceted Browsing Interface
>
> [3] About New York
> 
>
> Enjoy!
>
> Feedback always welcome too :)
>
> --
> Regards,
>
> Kingsley Idehen   
> Founder & CEO
> OpenLink Software
> Home Page: http://www.openlinksw.com
> Community Support: https://community.openlinksw.com
> Weblogs (Blogs):
> Company Blog: https://medium.com/openlink-software-blog
> Virtuoso Blog: https://medium.com/virtuoso-blog
> Data Access Drivers Blog: 
> https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
>
> Personal Weblogs (Blogs):
> Medium Blog: https://medium.com/@kidehen
> Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
>   http://kidehen.blogspot.com
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn: http://www.linkedin.com/in/kidehen
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
> : 
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Ontology in XML

2019-08-12 Thread Denny Vrandečić
The simplest way to get these might be through the SPARQL endpoint (if you
know what data you are looking for) or from the full knowledge base:
https://www.wikidata.org/wiki/Wikidata:Data_access
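
For the "all the predicates" part specifically, a small sketch of the SPARQL
route: every property with its English label and datatype, which is usually
what "the ontology" boils down to in this context.

import requests

# List every Wikidata property with its English label and datatype via WDQS.
QUERY = """
SELECT ?property ?propertyLabel ?type WHERE {
  ?property a wikibase:Property ;
            wikibase:propertyType ?type .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""
resp = requests.get("https://query.wikidata.org/sparql",
                    params={"query": QUERY},
                    headers={"Accept": "application/sparql-results+json",
                             "User-Agent": "property-list-sketch/0.1 (example)"},
                    timeout=120)
resp.raise_for_status()
for b in resp.json()["results"]["bindings"]:
    print(b["property"]["value"], b["propertyLabel"]["value"], b["type"]["value"])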


On Mon, Aug 12, 2019 at 1:32 PM Manzoor Ali  wrote:

> Thanks. Actually I need the ontology which has all the predicates defined.
>
> On Mon, 12 Aug 2019 at 22:23, Denny Vrandečić  wrote:
>
>> Maybe you mean this file:
>>
>>
>> https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Wikibase/+/master/docs/ontology.owl
>>
>>
>> Cheers,
>> Denny
>>
>> On Sun, Aug 11, 2019 at 7:20 AM Sebastian Hellmann <
>> hellm...@informatik.uni-leipzig.de> wrote:
>>
>>> Hi Ali, all,
>>>
>>> we have this dataset:
>>> https://databus.dbpedia.org/dbpedia/wikidata/instance-types/2018.10.20
>>> and an ontology with some Wikidata links:
>>> https://databus.dbpedia.org/dbpedia/ontology/dbo-snapshots/2019.02.21T08.00.00Z
>>>
>>> The owl version is XML.
>>>
>>> It is true that there is no intention to make a Wikidata ontology.
>>> Nevertheless, we were wondering whether Wikidata couldn't just load
>>> DBpedia's model. We can bot import it, easily. I am sure this would help to
>>> query Wikidata.
>>>
>>> Cleaning up the P31's and P279's is quite tedious, if done individually.
>>>
>>> -- Sebastian
>>>
>>>
>>>
>>> On 10.08.19 19:18, Marijane White wrote:
>>>
>>> Perhaps someone can correct me if I am wrong, but I am under the
>>> impression that such a thing doesn’t exist and that Wikidata’s models are
>>> intentionally not documented as an ontology.  I gathered this understanding
>>> from Bob DuCharme’s blog post about extracting RDF models from Wikidata
>>> with SPARQL queries:
>>> http://www.bobdc.com/blog/extracting-rdf-data-models-fro/
>>>
>>>
>>>
>>>
>>>
>>> *Marijane White, M.S.L.I.S.*
>>>
>>> Data Librarian, Assistant Professor
>>>
>>> Oregon Health & Science University Library
>>>
>>>
>>>
>>> *Phone*: 503.494.3484
>>>
>>> *Email*: whi...@ohsu.edu
>>>
>>> *ORCiD*: https://orcid.org/-0001-5059-4132
>>>
>>>
>>>
>>>
>>>
>>> *From: *Wikidata 
>>>  on behalf of Manzoor Ali
>>>  
>>> *Reply-To: *Discussion list for the Wikidata project
>>>  
>>> *Date: *Saturday, August 10, 2019 at 2:38 AM
>>> *To: *"wikidata@lists.wikimedia.org" 
>>>  
>>> *Subject: *[Wikidata] Ontology in XML
>>>
>>>
>>>
>>>
>>> Hello Wikidata,
>>>
>>> Sorry in advance if I am using the wrong mailing list. I need the Wikidata
>>> ontology in XML form. Can you please tell me from which link I can download it?
>>> Thanks in advance.
>>>
>>>
>>>
>>> ___
>>> Wikidata mailing 
>>> listWikidata@lists.wikimedia.orghttps://lists.wikimedia.org/mailman/listinfo/wikidata
>>>
>>> --
>>> All the best,
>>> Sebastian Hellmann
>>>
>>> Director of Knowledge Integration and Linked Data Technologies (KILT)
>>> Competence Center
>>> at the Institute for Applied Informatics (InfAI) at Leipzig University
>>> Executive Director of the DBpedia Association
>>> Projects: http://dbpedia.org, http://nlp2rdf.org,
>>> http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
>>> <http://www.w3.org/community/ld4lt>
>>> Homepage: http://aksw.org/SebastianHellmann
>>> Research Group: http://aksw.org
>>> ___
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>>
>>
>
> --
>
> <https://about.me/manxoorali?promo=email_sig&utm_source=product&utm_medium=email_sig&utm_campaign=gmail_api&utm_content=thumb>
> Manzoor Ali
> about.me/manxoorali
> <https://about.me/manxoorali?promo=email_sig&utm_source=product&utm_medium=email_sig&utm_campaign=gmail_api&utm_content=thumb>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Ontology in XML

2019-08-12 Thread Denny Vrandečić
Maybe you mean this file:

https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Wikibase/+/master/docs/ontology.owl


Cheers,
Denny

On Sun, Aug 11, 2019 at 7:20 AM Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de> wrote:

> Hi Ali, all,
>
> we have this dataset:
> https://databus.dbpedia.org/dbpedia/wikidata/instance-types/2018.10.20
> and an ontology with some Wikidata links:
> https://databus.dbpedia.org/dbpedia/ontology/dbo-snapshots/2019.02.21T08.00.00Z
>
> The owl version is XML.
>
> It is true that there is no intention to make a Wikidata ontology.
> Nevertheless, we were wondering whether Wikidata couldn't just load
> DBpedia's model. We can bot import it, easily. I am sure this would help to
> query Wikidata.
>
> Cleaning up the P31's and P279's is quite tedious, if done individually.
>
> -- Sebastian
>
>
>
> On 10.08.19 19:18, Marijane White wrote:
>
> Perhaps someone can correct me if I am wrong, but I am under the
> impression that such a thing doesn’t exist and that Wikidata’s models are
> intentionally not documented as an ontology.  I gathered this understanding
> from Bob DuCharme’s blog post about extracting RDF models from Wikidata
> with SPARQL queries:
> http://www.bobdc.com/blog/extracting-rdf-data-models-fro/
>
>
>
>
>
> *Marijane White, M.S.L.I.S.*
>
> Data Librarian, Assistant Professor
>
> Oregon Health & Science University Library
>
>
>
> *Phone*: 503.494.3484
>
> *Email*: whi...@ohsu.edu
>
> *ORCiD*: https://orcid.org/-0001-5059-4132
>
>
>
>
>
> *From: *Wikidata 
>  on behalf of Manzoor Ali
>  
> *Reply-To: *Discussion list for the Wikidata project
>  
> *Date: *Saturday, August 10, 2019 at 2:38 AM
> *To: *"wikidata@lists.wikimedia.org" 
>  
> *Subject: *[Wikidata] Ontology in XML
>
>
>
>
> Hello Wikidata,
>
> Sorry in advance if I am using the wrong mailing list. I need the Wikidata ontology in XML
> form. Can you please tell me from which link I can download it? Thanks
> in advance.
>
>
>
> ___
> Wikidata mailing 
> listWikidata@lists.wikimedia.orghttps://lists.wikimedia.org/mailman/listinfo/wikidata
>
> --
> All the best,
> Sebastian Hellmann
>
> Director of Knowledge Integration and Linked Data Technologies (KILT)
> Competence Center
> at the Institute for Applied Informatics (InfAI) at Leipzig University
> Executive Director of the DBpedia Association
> Projects: http://dbpedia.org, http://nlp2rdf.org,
> http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
> 
> Homepage: http://aksw.org/SebastianHellmann
> Research Group: http://aksw.org
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] "Wikidata item" link to be moved in the menu column on Wikimedia projects

2019-08-08 Thread Denny Vrandečić
Thank you for the message, Lea, this seems like a good step.
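
(On the "hard to find the Q number" point raised below: any wiki's API exposes
the connected item directly via page props, so no sidebar hunting is needed. A
small sketch:)

import requests

# Look up the connected Wikidata item (Q number) for an article through the
# wiki's own API page props instead of searching for the sidebar link.
def qid_for(title, wiki="en.wikipedia.org"):
    r = requests.get(f"https://{wiki}/w/api.php", params={
        "action": "query", "prop": "pageprops", "ppprop": "wikibase_item",
        "titles": title, "format": "json", "formatversion": "2",
    }, headers={"User-Agent": "qid-lookup-sketch/0.1 (example)"}, timeout=30)
    r.raise_for_status()
    page = r.json()["query"]["pages"][0]
    return page.get("pageprops", {}).get("wikibase_item")

print(qid_for("Szeged"))  # presumably the article behind Imre's quiz links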

On Thu, Aug 8, 2019 at 8:19 AM Gerard Meijssen 
wrote:

> Hoi,
> Easy, my user interface is English in all of them.
> Thanks,
>   GerardM
>
> On Thu, 8 Aug 2019 at 16:39, Imre Samu  wrote:
>
>> *> Suggestion:* display the Q number in the link, i.e. the user doesn't
>> have to click the link to see the Q number
>>
>> +1
>> And sometimes it is extremely hard to find the Wikidata link,
>>
>> in my case:
>> - if you don't know the scripts of different languages, and you want
>> to clean up the duplicated wikidata_ids
>>
>> Quiz -  try to find the Wikidata id ( Link )  as fast as you can:
>> - https://fa.wikipedia.org/wiki/%D8%B3%DA%AF%D8%AF
>> - https://zh.wikipedia.org/wiki/%E5%A1%9E%E6%A0%BC%E5%BE%B7
>> -
>> https://th.wikipedia.org/wiki/%E0%B9%81%E0%B8%8B%E0%B9%81%E0%B8%81%E0%B9%87%E0%B8%94
>>
>> Regards,
>> Imre
>>
>> Magnus Sälgö  wrote (on Thu, 8 Aug 2019, 13:49):
>>
>>> *Suggestion:* display the Q number in the link, i.e. the user doesn't have
>>> to click the link to see the Q number
>>>
>>> *Motivation:* more and more institutions are starting to use the Wikidata Q
>>> number, and as we today display VIAF numbers, LCCN, GND numbers etc. for
>>> authority data, I think we should make it easier to see that this Wikipedia
>>> article has a specific Q number.
>>>
>>> Regards
>>> Magnus Sälgö
>>> Stockholm, Sweden
>>> ___
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Generating EntitySchema from Wikidata Lexeme Forms data

2019-07-24 Thread Denny Vrandečić
Lucas, what would you consider the canonical representation of the language
knowledge - the one in the API or the one on the Wiki pages?
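
(To make the question concrete, a very rough sketch of the consuming side. Both
the endpoint path and the JSON field names below are assumptions on my part;
the authoritative description is the Templates API section Lucas links to.)

import requests

# Exploratory sketch only: list the Lexeme Forms templates as a starting point
# for mapping each one onto an EntitySchema. Endpoint path and field names are
# assumptions -- check the Templates API documentation for the real interface.
resp = requests.get("https://lexeme-forms.toolforge.org/api/v1/template/",
                    headers={"User-Agent": "entityschema-sketch/0.1 (example)"},
                    timeout=30)
resp.raise_for_status()
templates = resp.json()
for name, template in templates.items():  # assumed layout: dict of templates
    print(name, template.get("language_code"), len(template.get("forms", [])))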

On Wed, Jul 24, 2019 at 5:34 AM Lucas Werkmeister 
wrote:

> Just a side note – if anyone wants to use the templates, it’s probably
> better to use the tool’s templates API
> <https://www.wikidata.org/wiki/Wikidata:Wikidata_Lexeme_Forms#Templates_API>
> rather than the wiki page itself: transcribing the templates into
> structured form takes some time, there’s no need for someone else to do it
> again :)
>
> Cheers,
> Lucas
> On 24.07.19 02:03, Denny Vrandečić wrote:
>
> Hey,
>
> is anyone working on or has worked on generating EntitySchemas from the
> Wikidata Lexeme Forms data that Lucas is collecting?
>
> It seems that most of the necessary data should be there already for these.
>
> E.g. generating
>
> https://www.wikidata.org/wiki/EntitySchema:E34
>
> from
>
> https://www.wikidata.org/wiki/Wikidata:Wikidata_Lexeme_Forms/German
>
> (If Danish and German were the same language, which they are not,
> obviously, but this is to exemplify the idea).
>
> If not, does anyone want to work / cooperate on that?
>
> Cheers,
> Denny
>
>
> ___
> Wikidata mailing 
> listWikidata@lists.wikimedia.orghttps://lists.wikimedia.org/mailman/listinfo/wikidata
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Generating EntitySchema from Wikidata Lexeme Forms data

2019-07-23 Thread Denny Vrandečić
Hey,

is anyone working on or has worked on generating EntitySchemas from the
Wikidata Lexeme Forms data that Lucas is collecting?

It seems that most of the necessary data should be there already for these.

E.g. generating

https://www.wikidata.org/wiki/EntitySchema:E34

from

https://www.wikidata.org/wiki/Wikidata:Wikidata_Lexeme_Forms/German

(If Danish and German were the same language, which they are not,
obviously, but this is to exemplify the idea).

If not, does anyone want to work / cooperate on that?

Cheers,
Denny
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] "Collaborating on the sum of all knowledge across languages"

2019-07-06 Thread Denny Vrandečić
Hi all!

I really try not to spam the chat too much with pointers to my work on the
Abstract Wikipedia, but this one is probably also interesting for Wikidata
contributors. It is the draft for a chapter submitted to Koerner and
Reagle's Wikipedia@20 book, and talks about knowledge diversity under the
light of centralisation through projects such as Wikidata.

Public commenting phase is open until July 19, and very welcome:
"Collaborating on the sum of all knowledge across languages"

About the book: https://meta.wikimedia.org/wiki/Wikipedia@20
Link to chapter: https://wikipedia20.pubpub.org/pub/vyf7ksah

Cheers,
Denny
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] [Wikipedia-l] Fwd: [Wikimedia-l] Wikipedia in an abstract language

2019-01-15 Thread Denny Vrandečić
Cool, thanks! I read this a while ago, rereading again.

On Tue, Jan 15, 2019 at 3:28 AM Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de> wrote:

> Hi all,
>
> let me send you a paper from 2013, which might either help directly or at
> least give you some ideas...
>
> A lemon lexicon for DBpedia, Christina Unger, John McCrae, Sebastian
> Walter, Sara Winter, Philipp Cimiano, 2013, Proceedings of 1st
> International Workshop on NLP and DBpedia, co-located with the 12th
> International Semantic Web Conference (ISWC 2013), October 21-25, Sydney,
> Australia
>
> https://github.com/ag-sc/lemon.dbpedia
>
> https://pdfs.semanticscholar.org/638e/b4959db792c94411339439013eef536fb052.pdf
>
> Since the mappings from DBpedia to Wikidata properties are here:
> http://mappings.dbpedia.org/index.php?title=Special:AllPages&namespace=202
> e.g. http://mappings.dbpedia.org/index.php/OntologyProperty:BirthDate
>
> You could directly use the DBpedia-lemon lexicalisation for Wikidata.
>
> The mappings can be downloaded with
>
> git clone https://github.com/dbpedia/extraction-framework ; cd core ;
> ../run download-mappings
>
>
> All the best,
>
> Sebastian
>
>
>
>
> On 14.01.19 18:34, Denny Vrandečić wrote:
>
> Felipe,
>
> thanks for the kind words.
>
> There are a few research projects that use Wikidata to generate parts of
> Wikipedia articles - see for example https://arxiv.org/abs/1702.06235 which
> is almost as good as human results and beats templates by far, but only for
> the first sentence of biographies.
>
> Lucie Kaffee has also quite a body of research on that topic, and has
> worked very successfully and closely with some Wikipedia communities on
> these questions. Here's her bibliography:
> https://scholar.google.com/citations?user=xiuGTq0J&hl=de
>
> Another project of hers is currently under review for a grant:
> https://meta.wikimedia.org/wiki/Grants:Project/Scribe:_Supporting_Under-resourced_Wikipedia_Editors_in_Creating_New_Articles
> - I would suggest to take a look and if you are so inclined to express
> support. It is totally worth it!
>
> My opinion is that these projects are great for starters, and should be
> done (low-hanging fruits and all that), but won't get much further at least
> for a while, mostly because Wikidata rarely offers more than a skeleton of
> content. A decent Wikipedia article will include much, much more content
> than what is represented in Wikidata. And if you only use that for input,
> you're limiting yourself too much.
>
> Here's a different approach based on summarization over input sources:
> https://www.wired.com/story/using-artificial-intelligence-to-fix-wikipedias-gender-problem/
>  -
> this has a more promising approach for the short- to mid-term.
>
> I still maintain that the Abstract Wikipedia approach has certain
> advantages over both learned approaches, and is most aligned with Lucie's
> work. The machine learned approaches always fall short on the dimension of
> editability, due to the black-boxness of their solutions.
>
> Also, furthermore, I agree with Jeblad.
>
> The question remains: why is there not more discussion? Maybe because
> there is nothing substantial to discuss yet :) The two white papers are
> rather high level and the idea is not concrete enough yet, so that I
> wouldn't expect too much discussion yet going on on-wiki. That was similar
> to Wikidata - the number who discussed Wikidata at this level of maturity
> was tiny, it increased considerably once an actual design plan was
> suggested, but still remained small - and then exploded once the system was
> deployed. I would be surprised and delighted if we managed to avoid this
> pattern this time, but I can't do more than publicly present the idea,
> announce plans once they are there, and hope for a timely discussion :)
>
> Cheers,
> Denny
>
>
> On Mon, Jan 14, 2019 at 2:54 AM John Erling Blad  wrote:
>
>> An additional note; what Wikipedia urgently needs is a way to create
>> and reuse canned text (aka "templates"), and a way to adapt that text
>> to data from Wikidata. That is mostly just inflection rules, but in
>> some cases it involves grammar rules. To create larger pieces of text
>> is much harder, especially if the text is supposed to be readable.
>> Jumbling sentences together as is commonly done by various botscripts
>> does not work very well, or rather, it does not work at all.
>>
>> On Mon, Jan 14, 2019 at 11:44 AM John Erling Blad 
>> wrote:
>> >
>> > Using an abstract language as a basis for translations has been
>> > tried before, and is almost as hard as translating between two common
>> > languages.

Re: [Wikidata] [Wikipedia-l] Fwd: [Wikimedia-l] Wikipedia in an abstract language

2019-01-14 Thread Denny Vrandečić
Felipe,

thanks for the kind words.

There are a few research projects that use Wikidata to generate parts of
Wikipedia articles - see for example https://arxiv.org/abs/1702.06235 which
is almost as good as human results and beats templates by far, but only for
the first sentence of biographies.

Lucie Kaffee has also quite a body of research on that topic, and has
worked very successfully and tightly with some Wikipedia communities on
these questions. Here's her bibliography:
https://scholar.google.com/citations?user=xiuGTq0J&hl=de

Another project of hers is currently under review for a grant:
https://meta.wikimedia.org/wiki/Grants:Project/Scribe:_Supporting_Under-resourced_Wikipedia_Editors_in_Creating_New_Articles
- I would suggest to take a look and if you are so inclined to express
support. It is totally worth it!

My opinion is that these projects are great for starters, and should be
done (low-hanging fruits and all that), but won't get much further at least
for a while, mostly because Wikidata rarely offers more than a skeleton of
content. A decent Wikipedia article will include much, much more content
than what is represented in Wikidata. And if you only use that for input,
you're limiting yourself too much.

Here's a different approach based on summarization over input sources:
https://www.wired.com/story/using-artificial-intelligence-to-fix-wikipedias-gender-problem/
-
this has a more promising approach for the short- to mid-term.

I still maintain that the Abstract Wikipedia approach has certain
advantages over both learned approaches, and is most aligned with Lucie's
work. The machine learned approaches always fall short on the dimension of
editability, due to the black-boxness of their solutions.

Also, furthermore, I agree with Jeblad.

The question remains: why is there not more discussion? Maybe because there
is nothing substantial to discuss yet :) The two white papers are rather
high level and the idea is not concrete enough yet, so that I wouldn't
expect too much discussion yet going on on-wiki. That was similar to
Wikidata - the number who discussed Wikidata at this level of maturity was
tiny, it increased considerably once an actual design plan was suggested,
but still remained small - and then exploded once the system was deployed.
I would be surprised and delighted if we managed to avoid this pattern this
time, but I can't do more than publicly present the idea, announce plans
once they are there, and hope for a timely discussion :)

Cheers,
Denny


On Mon, Jan 14, 2019 at 2:54 AM John Erling Blad  wrote:

> An additional note; what Wikipedia urgently needs is a way to create
> and reuse canned text (aka "templates"), and a way to adapt that text
> to data from Wikidata. That is mostly just inflection rules, but in
> some cases it involves grammar rules. To create larger pieces of text
> is much harder, especially if the text is supposed to be readable.
> Jumbling sentences together as is commonly done by various botscripts
> does not work very well, or rather, it does not work at all.
>
> On Mon, Jan 14, 2019 at 11:44 AM John Erling Blad 
> wrote:
> >
> > Using an abstract language as a basis for translations has been
> > tried before, and is almost as hard as translating between two common
> > languages.
> >
> > There are two really hard problems, it is the implied references and
> > the cultural context. An artificial language can get rid of the
> > implied references, but it tend to create very weird and unnatural
> > expressions. If the cultural context is removed, then it can be
> > extremely hard to put it back in, and without any cultural context it
> > can be hard to explain anything.
> >
> > But yes, you can make an abstract language, but it won't give you any
> > high quality prose.
> >
> > On Mon, Jan 14, 2019 at 8:09 AM Felipe Schenone 
> wrote:
> > >
> > > This is quite an awesome idea. But thinking about it, wouldn't it be
> possible to use structured data in wikidata to generate articles? Can't we
> skip the need of learning an abstract language by using wikidata?
> > >
> > > Also, is there discussion about this idea anywhere in the Wikimedia
> wikis? I haven't found any...
> > >
> > > On Sat, Sep 29, 2018 at 3:44 PM Pine W  wrote:
> > >>
> > >> Forwarding because this (ambitious!) proposal may be of interest to
> people
> > >> on other lists. I'm not endorsing the proposal at this time, but I'm
> > >> curious about it.
> > >>
> > >> Pine
> > >> ( https://meta.wikimedia.org/wiki/User:Pine )
> > >>
> > >>
> > >> -- Forwarded message -
> > >> From: Den

Re: [Wikidata] [Wikimedia-l] Solve legal uncertainty of Wikidata

2018-05-18 Thread Denny Vrandečić
Thank you for your answer, Sebastian.

Publishing the Gutachten (the legal opinion) would be fantastic! That would be very helpful and
deeply appreciated.

Regarding the relicensing, I agree with you. You can just go and do that,
and given that you ask for attribution to DBpedia, and not to Wikipedia, I
would claim that's what you're doing. And I think that's fine.

Regarding attribution, commonly it is assumed that you have to respect it
transitively. That is one of the reasons a license that requires BY sucks
so hard for data: unlike with text, the attribution requirements grow very
quickly. It is the same as with modified images and collages: it is not
sufficient to attribute the last author, but all contributors have to be
attributed.

This is why I think that whoever wants to be part of a large federation of
data on the web should publish under CC0.

That is very different from licensing texts or images. But for data
anything else is just weird and will bite us in the long run more than we
might ever benefit.

So, just to say it again: if the Gutachten you mentioned could be made
available, that would be very very awesome!

Thank you, Denny



On Thu, May 17, 2018, 23:06 Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de> wrote:

> Hi Denny,
>
> On 18.05.2018 02:54, Denny Vrandečić wrote:
>
> Rob Speer wrote:
> > The result of this, by the way, is that commercial entities sell modified
> > versions of Wikidata with impunity. It undermines the terms of other
> > resources such as DBPedia, which also contains facts extracted from
> > Wikipedia and respects its Share-Alike terms. Why would anyone use
> DBPedia
> > and have to agree to share alike, when they can get similar data from
> > Wikidata which promises them it's CC-0?
>
> The comparison to DBpedia is interesting: the terms for DBpedia state
> "Attribution in this case means keep DBpedia URIs visible and active
> through at least one (preferably all) of @href, , or "Link:". If
> live links are impossible (e.g., when printed on paper), a textual
> blurb-based attribution is acceptable."
> http://wiki.dbpedia.org/terms-imprint
>
> So according to these terms, when someone displays data from DBpedia, it
> is entirely sufficient to attribute DBpedia.
>
> What that means is that DBpedia follows exactly the same theory as
> Wikidata: it is OK to extract data from Wikipedia and republish it as your
> own dataset under your own copyright without requiring attribution to the
> original source of the extraction.
>
> (A bit more problematic might be the fact that DBpedia also republishes
> whole paragraphs of text under these terms, but that's another story)
>
>
> My understanding is that all that Wikidata has extracted from Wikipedia is
> non-copyrightable in the first place and thus republishing it under a
> different license (or, as in the case of DBpedia for simple triples, with a
> different attribution) is legally sound.
>
>
> In the SmartDataWeb project https://www.smartdataweb.de/ we hired lawyers
> to write a legal review about the extraction situation. Facts can be
> extracted and republished under CC-0 without problem as is the case of
> infoboxes. Copying a whole database is a different matter because database
> rights hold. If you only extract ~ two sentences it falls under citation,
> which is also easy. If it is more than two sentences, then copyright applies.
>
> I can check whether it is ready and shareable. The legal review
> (Gutachten) is quite a big thing as it has some legal relevancy and can be
> cited in court.
>
> Hence we can switch to ODC-BY with facts as CC-0 and the text as
> share-alike. However the attribution mentioned in the imprint is still
> fine, since it is under database and not the content/facts.
> I am still uncertain about the attribution. If you remix and publish you
> need to cite the direct sources. But if somebody takes from you, does he
> only attribute to you or to everybody you used in a transitive way.
>
> Anyhow, we are sharpening the whole model towards technology, not
> data/content. So the databus will be a transparent layer and it is much
> easier to find the source like Wikipedia and Wikidata and do contributions
> there, which is actually one of the intentions of share-alike (getting work
> pushed back/upstream).
>
> All the best,
> Sebastian
>
>
> If there is disagreement with that, I would be interested which content
> exactly is considered to be under copyright and where license has not been
> followed on Wikidata.
>
> For completeness: the discussion is going on in parallel on the Wikidata
> project chat and in Phabricator:
>
> https://phabricator.wikimedia.org/T193728#4212728
>
> https://www.wikidata.org/wiki/Wikidata:Project

Re: [Wikidata] [Wikimedia-l] Solve legal uncertainty of Wikidata

2018-05-17 Thread Denny Vrandečić
Rob Speer wrote:
> The result of this, by the way, is that commercial entities sell modified
> versions of Wikidata with impunity. It undermines the terms of other
> resources such as DBPedia, which also contains facts extracted from
> Wikipedia and respects its Share-Alike terms. Why would anyone use DBPedia
> and have to agree to share alike, when they can get similar data from
> Wikidata which promises them it's CC-0?

The comparison to DBpedia is interesting: the terms for DBpedia state
"Attribution in this case means keep DBpedia URIs visible and active
through at least one (preferably all) of @href, <link>, or "Link:". If
live links are impossible (e.g., when printed on paper), a textual
blurb-based attribution is acceptable."
http://wiki.dbpedia.org/terms-imprint

So according to these terms, when someone displays data from DBpedia, it is
entirely sufficient to attribute DBpedia.

What that means is that DBpedia follows exactly the same theory as
Wikidata: it is OK to extract data from Wikipedia and republish it as your
own dataset under your own copyright without requiring attribution to the
original source of the extraction.

(A bit more problematic might be the fact that DBpedia also republishes
whole paragraphs of text under these terms, but that's another story)

My understanding is that all that Wikidata has extracted from Wikipedia is
non-copyrightable in the first place and thus republishing it under a
different license (or, as in the case of DBpedia for simple triples, with a
different attribution) is legally sound.

If there is disagreement with that, I would be interested which content
exactly is considered to be under copyright and where license has not been
followed on Wikidata.

For completeness: the discussion is going on in parallel on the Wikidata
project chat and in Phabricator:

https://phabricator.wikimedia.org/T193728#4212728
https://www.wikidata.org/wiki/Wikidata:Project_chat#Wikipedia_and_other_Wikimedia_projects


I would appreciate if we could keep the discussion in a single place.

Gnom1 on Phabricator has offered to actually answer legal questions, but we
need to come up with the questions that we want to ask. If it should be,
for example, as Rob Speer states on the bug, "has the copyright of
interwiki links been breached by having them be moved to Wikidata?", I'd be
quite happy with that question - if that's the disagreement, let us ask
Legal help and see if my understanding or yours is correct.

Does this sound like a reasonable question? Or which other question would
you like to ask instead?


On Thu, May 17, 2018 at 4:15 PM Rob Speer  wrote:

> > As always, copyright is predatory. As we can prove that copyright is the
> enemy of science and knowledge
>
> Well, this kind of gets to the heart of the issue, doesn't it.
>
> I support the Creative Commons license, including the share-alike term,
> which requires copyright in order to work, and I've contributed to multiple
> Wikimedia projects with the understanding that my work would be protected
> by CC-By-SA.
>
> Wikidata is engaged in a project-wide act of disobedience against CC-By-SA.
> I would say that GerardM has provided an excellent summary of the attitude
> toward Creative Commons that I've encountered on Wikidata: "it's holding us
> back", "it's the enemy", "you can't copyright knowledge", "you can't make
> us follow it", etc.
>
> The result of this, by the way, is that commercial entities sell modified
> versions of Wikidata with impunity. It undermines the terms of other
> resources such as DBPedia, which also contains facts extracted from
> Wikipedia and respects its Share-Alike terms. Why would anyone use DBPedia
> and have to agree to share alike, when they can get similar data from
> Wikidata which promises them it's CC-0?
>
> On Wed, 16 May 2018 at 21:43 Gerard Meijssen 
> wrote:
>
> > Hoi,
> > Thank you for the overly broad misrepresentation. As always, copyright is
> > predatory. As we can prove that copyright is the enemy of science and
> > knowledge we should not be upset that *copyright *is abused we should
> > welcome it as it proves the point. Also when we use texts from everywhere
> > and rephrase it in Wikipedia articles "we" are not lily white either.
> >
> > In "them old days" generally we felt that when people would use
> Wikipedia,
> > it would only serve our purpose; share the sum of all knowledge. I still
> > feel really good about that. And, it has been shown that what we do;
> > maintain / curate / update that data that it is not easily given to do as
> > well as "we" do it.
> >
> > When we are to be more precise with our copyright, there are a few things
> > we could do to make copyright more transparent. When data is to be
> uploaded
> > (Commons / Wikipedia or Wikidata) we should use a user that is OWNED and
> > operated by the copyright holder. The operation may be by proxy and as a
> > consequence there is no longer a question about copyright as the
> copyright
> > holder can do as we wants. 

Re: [Wikidata] Wikiata and the LOD cloud

2018-05-07 Thread Denny Vrandečić
Thanks!

On Mon, May 7, 2018 at 1:36 PM Lucas Werkmeister <
lucas.werkmeis...@wikimedia.de> wrote:

> Folks, I’m already in contact with John, there’s no need to contact him
> again :)
>
> Cheers, Lucas
>
> Am Mo., 7. Mai 2018 um 19:32 Uhr schrieb Denny Vrandečić <
> vrande...@gmail.com>:
>
>> Well, then, we have tried several times to get into that diagram, and it
>> never worked out.
>>
>> So, given the page you linked, it says:
>>
>> Contributing to the Diagram
>>
>> First, make sure that you publish data according to the Linked Data
>> principles <http://www.w3.org/DesignIssues/LinkedData.html>. We
>> interpret this as:
>>
>>- There must be *resolvable http:// (or https://) URIs*.
>>- They must resolve, with or without content negotiation, to *RDF
>>data* in one of the popular RDF formats (RDFa, RDF/XML, Turtle,
>>N-Triples).
>>- The dataset must contain *at least 1000 triples*. (Hence, your FOAF
>>file most likely does not qualify.)
>>- The dataset must be connected via *RDF links* to a dataset that is
>>already in the diagram. This means, either your dataset must use URIs from
>>the other dataset, or vice versa. We arbitrarily require at least 50 
>> links.
>>- Access of the *entire* dataset must be possible via *RDF crawling*,
>>via an *RDF dump*, or via a *SPARQL endpoint*.
>>
>> The process for adding datasets is still under development, please
>> contact John P. McCrae  to add a new dataset
>>
>> Wikidata fulfills all the conditions easily. So, here we go, I am adding
>> John to this thread - although I know he already knows about this request -
>> and I am asking officially to enter Wikidata into the LOD diagram.
>>
>> Let's keep it all open, and see where it goes from here.
>>
>> Cheers,
>> Denny
>>
>>
>> On Mon, May 7, 2018 at 4:15 AM Sebastian Hellmann <
>> hellm...@informatik.uni-leipzig.de> wrote:
>>
>>> Hi Denny, Maarten,
>>>
>>> you should read your own emails. In fact it is quite easy to join the
>>> LOD cloud diagram.
>>>
>>> The most important step is to follow the instructions on the page:
>>> http://lod-cloud.net under how to contribute and then add the metadata.
>>>
>>> Some years ago I made a Wordpress with enabled Linked Data:
>>> http://www.klappstuhlclub.de/wp/ Even this is included as I simply
>>> added the metadata entry.
>>>
>>> Do you really think John McCrae added a line in the code that says "if
>>> (dataset==wikidata) skip; " ?
>>>
>>> You just need to add it like everybody else in LOD, DBpedia also created
>>> its entry and updates it now and then. The same accounts for
>>> http://lov.okfn.org  Somebody from Wikidata needs to upload the
>>> Wikidata properties as OWL.  If nobody does it, it will not be in there.
>>>
>>> All the best,
>>>
>>> Sebastian
>>>
>>> On 04.05.2018 18:33, Maarten Dammers wrote:
>>>
>>> It almost feels like someone doesn’t want Wikidata in there? Maybe that
>>> website is maintained by DBpedia fans? Just thinking out loud here because
>>> DBpedia is very popular in the academic world and Wikidata a huge threat
>>> for that popularity.
>>>
>>> Maarten
>>>
>>> Op 4 mei 2018 om 17:20 heeft Denny Vrandečić  het
>>> volgende geschreven:
>>>
>>> I'm pretty sure that Wikidata is doing better than 90% of the current
>>> bubbles in the diagram.
>>>
>>> If they wanted to have Wikidata in the diagram it would have been there
>>> before it was too small to read it. :)
>>>
>>> On Tue, May 1, 2018 at 7:47 AM Peter F. Patel-Schneider <
>>> pfpschnei...@gmail.com> wrote:
>>>
>>>> Thanks for the corrections.
>>>>
>>>> So https://www.wikidata.org/entity/Q42 is *the* Wikidata IRI for
>>>> Douglas
>>>> Adams.  Retrieving from this IRI results in a 303 See Other to
>>>> https://www.wikidata.org/wiki/Special:EntityData/Q42, which (I guess)
>>>> is the
>>>> main IRI for representations of Douglas Adams and other pages with
>>>> information about him.
>>>>
>>>> From https://www.wikidata.org/wiki/Special:EntityData/Q42 content
>>>> negotiation can be used to get the JSON representation (the default),
>>>> other
>>>> representations including Turtle, 

Re: [Wikidata] Wikiata and the LOD cloud

2018-05-07 Thread Denny Vrandečić
Well, then, we have tried several times to get into that diagram, and it
never worked out.

So, given the page you linked, it says:

Contributing to the Diagram

First, make sure that you publish data according to the Linked Data
principles <http://www.w3.org/DesignIssues/LinkedData.html>. We interpret
this as:

   - There must be *resolvable http:// (or https://) URIs*.
   - They must resolve, with or without content negotiation, to *RDF data* in
   one of the popular RDF formats (RDFa, RDF/XML, Turtle, N-Triples).
   - The dataset must contain *at least 1000 triples*. (Hence, your FOAF
   file most likely does not qualify.)
   - The dataset must be connected via *RDF links* to a dataset that is
   already in the diagram. This means, either your dataset must use URIs from
   the other dataset, or vice versa. We arbitrarily require at least 50 links.
   - Access of the *entire* dataset must be possible via *RDF crawling*,
   via an *RDF dump*, or via a *SPARQL endpoint*.

The process for adding datasets is still under development, please contact John
P. McCrae  to add a new dataset

Wikidata fulfills all the conditions easily. So, here we go, I am adding
John to this thread - although I know he already knows about this request -
and I am asking officially to enter Wikidata into the LOD diagram.
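
For reference, these criteria can be checked mechanically against the live site. Here is a minimal sketch (assuming Python with the third-party requests and rdflib libraries; Q42 and the User-Agent string are just example choices, not part of any submission process) that resolves an entity IRI via content negotiation, counts the triples, and looks for RDF links pointing outside Wikidata:

# Sketch: check the LOD diagram criteria against a single Wikidata entity.
# Assumes the third-party libraries `requests` and `rdflib` are installed.
import requests
from rdflib import Graph, URIRef

ENTITY = "https://www.wikidata.org/entity/Q42"  # Douglas Adams, as an example

# Criterion: the IRI must resolve (via 303 redirects and content negotiation)
# to RDF data in a popular format.
resp = requests.get(
    ENTITY,
    headers={
        "Accept": "text/turtle",
        "User-Agent": "lod-criteria-check/0.1 (example script)",  # placeholder UA
    },
)
resp.raise_for_status()

g = Graph()
g.parse(data=resp.text, format="turtle")

# Criterion: at least 1000 triples (the full dataset clears this trivially;
# even a single well-described item usually does).
print("triples:", len(g))

# Criterion: RDF links to other datasets, i.e. object IRIs outside wikidata.org.
external = {
    str(o) for o in g.objects()
    if isinstance(o, URIRef) and "wikidata.org" not in str(o)
}
print("object IRIs outside wikidata.org:", len(external))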

Let's keep it all open, and see where it goes from here.

Cheers,
Denny


On Mon, May 7, 2018 at 4:15 AM Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de> wrote:

> Hi Denny, Maarten,
>
> you should read your own emails. In fact it is quite easy to join the LOD
> cloud diagram.
>
> The most important step is to follow the instructions on the page:
> http://lod-cloud.net under how to contribute and then add the metadata.
>
> Some years ago I made a Wordpress with enabled Linked Data:
> http://www.klappstuhlclub.de/wp/ Even this is included as I simply added
> the metadata entry.
>
> Do you really think John McCrae added a line in the code that says "if
> (dataset==wikidata) skip; " ?
>
> You just need to add it like everybody else in LOD, DBpedia also created
> its entry and updates it now and then. The same accounts for
> http://lov.okfn.org  Somebody from Wikidata needs to upload the Wikidata
> properties as OWL.  If nobody does it, it will not be in there.
>
> All the best,
>
> Sebastian
>
> On 04.05.2018 18:33, Maarten Dammers wrote:
>
> It almost feels like someone doesn’t want Wikidata in there? Maybe that
> website is maintained by DBpedia fans? Just thinking out loud here because
> DBpedia is very popular in the academic world and Wikidata a huge threat
> for that popularity.
>
> Maarten
>
> Op 4 mei 2018 om 17:20 heeft Denny Vrandečić  het
> volgende geschreven:
>
> I'm pretty sure that Wikidata is doing better than 90% of the current
> bubbles in the diagram.
>
> If they wanted to have Wikidata in the diagram it would have been there
> before it was too small to read it. :)
>
> On Tue, May 1, 2018 at 7:47 AM Peter F. Patel-Schneider <
> pfpschnei...@gmail.com> wrote:
>
>> Thanks for the corrections.
>>
>> So https://www.wikidata.org/entity/Q42 is *the* Wikidata IRI for Douglas
>> Adams.  Retrieving from this IRI results in a 303 See Other to
>> https://www.wikidata.org/wiki/Special:EntityData/Q42, which (I guess) is
>> the
>> main IRI for representations of Douglas Adams and other pages with
>> information about him.
>>
>> From https://www.wikidata.org/wiki/Special:EntityData/Q42 content
>> negotiation can be used to get the JSON representation (the default),
>> other
>> representations including Turtle, and human-readable information.  (Well
>> actually I'm not sure that this is really correct.  It appears that
>> instead
>> of directly using content negotiation, another 303 See Other is used to
>> provide an IRI for a document in the requested format.)
>>
>> https://www.wikidata.org/wiki/Special:EntityData/Q42.json and
>> https://www.wikidata.org/wiki/Special:EntityData/Q42.ttl are the useful
>> machine-readable documents containing the Wikidata information about
>> Douglas
>> Adams.  Content negotiation is not possible on these pages.
>>
>> https://www.wikidata.org/wiki/Q42 is the IRI that produces a
>> human-readable
>> version of the information about Douglas Adams.  Content negotiation is
>> not
>> possible on this page, but it does have link rel="alternate" to the
>> machine-readable pages.
>>
>> Strangely this page has a link rel="canonical" to itself.  Shouldn't that
>> link be to https://www.wikidata.org/entity/Q42?  There is a human-visible
>> link to this IRI, but there doesn't appear to be any machine-readable link.

Re: [Wikidata] Wikiata and the LOD cloud

2018-05-04 Thread Denny Vrandečić
I'm pretty sure that Wikidata is doing better than 90% of the current
bubbles in the diagram.

If they wanted to have Wikidata in the diagram it would have been there
before it was too small to read it. :)

On Tue, May 1, 2018 at 7:47 AM Peter F. Patel-Schneider <
pfpschnei...@gmail.com> wrote:

> Thanks for the corrections.
>
> So https://www.wikidata.org/entity/Q42 is *the* Wikidata IRI for Douglas
> Adams.  Retrieving from this IRI results in a 303 See Other to
> https://www.wikidata.org/wiki/Special:EntityData/Q42, which (I guess) is
> the
> main IRI for representations of Douglas Adams and other pages with
> information about him.
>
> From https://www.wikidata.org/wiki/Special:EntityData/Q42 content
> negotiation can be used to get the JSON representation (the default), other
> representations including Turtle, and human-readable information.  (Well
> actually I'm not sure that this is really correct.  It appears that instead
> of directly using content negotiation, another 303 See Other is used to
> provide an IRI for a document in the requested format.)
>
> https://www.wikidata.org/wiki/Special:EntityData/Q42.json and
> https://www.wikidata.org/wiki/Special:EntityData/Q42.ttl are the useful
> machine-readable documents containing the Wikidata information about
> Douglas
> Adams.  Content negotiation is not possible on these pages.
>
> https://www.wikidata.org/wiki/Q42 is the IRI that produces a
> human-readable
> version of the information about Douglas Adams.  Content negotiation is not
> possible on this page, but it does have link rel="alternate" to the
> machine-readable pages.
>
> Strangely this page has a link rel="canonical" to itself.  Shouldn't that
> link be to https://www.wikidata.org/entity/Q42?  There is a human-visible
> link to this IRI, but there doesn't appear to be any machine-readable link.
>
> RDF links to other IRIs for Douglas Adams are given in RDF pages by
> properties in the wdtn namespace.  Many, but not all, identifiers are
> handled this way.  (Strangely ISNI (P213) isn't even though it is linked on
> the human-readable page.)
>
> So it looks as if Wikidata can be considered as Linked Open Data but maybe
> some improvements can be made.
>
>
> peter
>
>
>
> On 05/01/2018 01:03 AM, Antoine Zimmermann wrote:
> > On 01/05/2018 03:25, Peter F. Patel-Schneider wrote:
> >> As far as I can tell real IRIs for Wikidata are https URIs.  The http
> IRIs
> >> redirect to https IRIs.
> >
> > That's right.
> >
> >>   As far as I can tell no content negotiation is
> >> done.
> >
> > No, you're mistaken. You tried the URL of a wikipage in your curl
> command.
> > Those are for human consumption, thus not available in turtle.
> >
> > The "real IRIs" of Wikidata entities are like this:
> > https://www.wikidata.org/entity/Q{NUMBER}
> >
> > However, they 303 redirect to
> > https://www.wikidata.org/wiki/Special:EntityData/Q{NUMBER}
> >
> > which is the identifier of a schema:Dataset. Then, if you HTTP GET these
> > URIs, you can content negotiate them to JSON
> > (https://www.wikidata.org/wiki/Special:EntityData/Q{NUMBER}.json) or to
> > turtle (https://www.wikidata.org/wiki/Special:EntityData/Q{NUMBER}.ttl).
> >
> >
> > Suprisingly, there is no connection between the entity IRIs and the
> wikipage
> > URLs. If one was given the IRI of an entity from Wikidata, and had no
> > further information about how Wikidata works, they would not be able to
> > retrieve HTML content about the entity.
> >
> >
> > BTW, I'm not sure the implementation of content negotiation in Wikidata
> is
> > correct because the server does not tell me the format of the resource to
> > which it redirects (as opposed to what DBpedia does, for instance).
> >
> >
> > --AZ
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
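
The redirect chain that Peter and Antoine describe above can be observed directly. A minimal sketch, assuming Python with the requests library; the hop URLs in the comment are the ones mentioned in the thread, and the exact final document URL may differ:

# Sketch: trace the 303 hops from the entity IRI to a format-specific document.
import requests

resp = requests.get(
    "https://www.wikidata.org/entity/Q42",
    headers={"Accept": "text/turtle",
             "User-Agent": "conneg-trace/0.1 (example script)"},  # placeholder UA
)

# requests follows redirects automatically; resp.history holds each hop,
# e.g. /entity/Q42 -> Special:EntityData/Q42 -> Special:EntityData/Q42.ttl
for hop in resp.history:
    print(hop.status_code, hop.url, "->", hop.headers.get("Location"))
print(resp.status_code, resp.url, resp.headers.get("Content-Type"))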


Re: [Wikidata] [Wikimedia-l] An answer to Lydia Pintscher regarding its considerations on Wikidata and CC-0

2017-11-30 Thread Denny Vrandečić
Scott,

The NC license clause is problematic in a number of jurisdictions. For
example, at least in Germany, as I remember from my law classes, it also
would definitively include not-for-profits, NGOs, and even say bloggers,
with or without ads on their sites. One must always be careful in the
choice of a license in order to avoid unintended consequences.

Just food for thought
Denny

On Thu, Nov 30, 2017, 20:51 John Erling Blad  wrote:

> My reference was to in-place discussions at WMDE, not the open meetings
> with Markus. Each week we had an open demo where Markus usually attended.
> As I remember the May-discussion, it was just a discussion in the office,
> there was a reference to an earlier meeting. It is although easy to mix up
> old memories, so what happen first and what happen next should not be taken
> to be facts. If Markus also says the same it is although a reasonable
> chance we have got it right.
>
> As to the questions about archives on open discussions with the community.
> This was in April-May 2012. There was no community, there were only
> concerned individuals. The community started to emerge in August with the
> first attempts to go public. On Wikidata_talk:Introduction there are some
> posts from 15. August 2012,[1] while first post on the subject page is from
> 30. October. The stuff from before October comes from a copy-paste from
> Meta.[3] Note that Denny writes "The data in Wikidata is published under a
> free license, allowing the reuse of the data in many different scenarios."
> but Whittylama changes this to "The data in Wikidata is published under [
> http://creativecommons.org/publicdomain/zero/1.0/ a free license],
> allowing
> the reuse of the data in many different scenarios.",[4] and at that point
> there was a community on an open site and had been for a week. When
> Whittylama did his post it was the 4504th post on the site, so it was
> hardly the first! The license was initially a CC-SA.[8] I'm not quite sure
> when it was changed to CC0 in the footer,[9] but it seems to have happen
> before 31 October 2012, at 19:09. First post on Q1 is from 29. October
> 2012,[5] this is one of several items updated this evening.
>
> It is quite enlightening to start at oldid=1 [6] and stepping forward. You
> will find that our present incarnation went live 25. October 2012. So much
> for the "birthday". To ask for archived community discussions before 25th
> October does not make sense, there was no site, and the only people
> involved were mostly devs posting at Meta. Note for example that the page
> Wikidata:Introduction is from Meta.[7]
>
> [1] https://www.wikidata.org/wiki/Wikidata_talk:Introduction
> [2]
> https://www.wikidata.org/w/index.php?title=Wikidata:Introduction&oldid=2677
> [3]
>
> https://www.wikidata.org/w/index.php?title=Wikidata_talk:Introduction&diff=133569705&oldid=128154617
> [4]
>
> https://www.wikidata.org/w/index.php?title=Wikidata:Introduction&diff=next&oldid=4504
> [5] https://www.wikidata.org/w/index.php?title=Q1&oldid=103
> [6] https://www.wikidata.org/w/index.php?oldid=1
> [7]
>
> https://meta.wikimedia.org/w/index.php?title=Wikidata/Introduction&oldid=4030743
> [8]
>
> https://web.archive.org/web/20121027015501/http://www.wikidata.org/wiki/Wikidata:Main_Page
> [9]
>
> https://web.archive.org/web/20121102074347/http://www.wikidata.org/wiki/Wikidata:Main_Page
>
> On Fri, Dec 1, 2017 at 1:18 AM, Markus Krötzsch <
> mar...@semantic-mediawiki.org> wrote:
>
> > Dear Mathieu,
> >
> > Your post demands my response since I was there when CC0 was first chosen
> > (i.e., in the April meeting). I won't discuss your other claims here --
> the
> > discussions on the Wikidata list are already doing this, and I agree with
> > Lydia that no shouting is necessary here.
> >
> > Nevertheless, I must at least testify to what John wrote in his earlier
> > message (quote included below this email for reference): it was not
> Denny's
> > decision to go for CC0, but the outcome of a discussion among several
> > people who had worked with open data for some time before Wikidata was
> > born. I have personally supported this choice and still do. I have never
> > received any money directly or indirectly from Google, though -- full
> > disclosure -- I got several T-shirts for supervising in Summer of Code
> > projects.
> >
> > At no time did Google or any other company take part in our discussions
> in
> > the zeroth hour of Wikidata. And why should they? From what I can see on
> > their web page, Google has no problem with all kinds of different license
> > terms in the data they display. Also, I can tell you that we would have
> > reacted in a very allergic way to such attempts, so if any company had
> > approached us, this would quite likely have backfired. But, believe it or
> > not, when we started it was all but clear that this would become a
> relevant
> > project at all, and no major company even cared to lobby us. It was still
> > mostly a few hackers getting togethe

Re: [Wikidata] Coordinate precision in Wikidata, RDF & query service

2017-08-31 Thread Denny Vrandečić
And thanks for the use cases. This helps a lot with thinking about this.

On Thu, Aug 31, 2017, 16:31 Denny Vrandečić  wrote:

> The reason why we save the actual value with more digits than the
> precision (and why we keep the precision as an explicit value at all) is
> because the value could be entered and displayed either as decimal digits
> or in minutes and seconds. So internally one would save 20' as 0.333…,
> but the precision is still just 2. This allows us to round-trip.
>
> I hope that makes any sense?
>
> Yes, that means that using the values for comparison without taking the
> precision into account will fail.
>
> I don't think comparison and other operators were ever specified for the
> datatypes. This has bitten us before, and I think it would be valuable to
> do. And that would resolve these issues, and some others.
>
> Would there be people interested in doing that? I sure would love to get
> it right.
>
> On Thu, Aug 31, 2017, 16:17 Stas Malyshev  wrote:
>
>> Hi!
>>
>> > I am not sure I understand the issue and what the suggestion is to solve
>> > it. If we decide to arbitrarily reduce the possible range for the
>>
>> Well, there are actually several issues right now.
>>
>> 1. Our RDF output produces coordinates with more digits than the specified
>> precision of the actual value.
>> 2. Our precision values as specified in wikibase:geoPrecision seem to
>> make little sense.
>> 3. We may represent the same coordinates for objects located in the same
>> place as different ones because of precision values being kinda chaotic.
>> 4. We may have different data from other databases because our
>> coordinate is over-precise.
>>
>> (1) is probably easiest to fix. (2) is a bit harder, and I am still not
>> sure how wikibase:geoPrecision is used, if at all.
>> (3) and (4) are less important, but it would be nice to improve, and
>> maybe they will be mostly fixed once (1) and (2) are fixed. But before
>> approaching the fix, I wanted to understand what are expectations from
>> precision and if there can or should be some limits there. Technically,
>> it doesn't matter too much - except that some formulae for distances do
>> not work well for high precisions because of limited accuracy of 64-bit
>> double, but there are ways around it. So technically we can keep 9
>> digits or however many we need, if we wanted to. I just wanted to see if
>> we should.
>>
>> --
>> Stas Malyshev
>> smalys...@wikimedia.org
>>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Coordinate precision in Wikidata, RDF & query service

2017-08-31 Thread Denny Vrandečić
The reason why we save the actual value with more digits than the precision
(and why we keep the precision as an explicit value at all) is because the
value could be entered and displayed either as decimal digits or in minutes
and seconds. So internally one would save 20' as 0.333…, but the
precision is still just 2. This allows us to round-trip.

I hope that makes any sense?

Yes, that means that using the values for comparison without taking the
precision into account will fail.

I don't think comparison and other operators were ever specified for the
datatypes. This has bitten us before, and I think it would be valuable to
do. And that would resolve these issues, and some others.

Would there be people interested in doing that? I sure would love to get it
right.
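
To make the round-tripping idea concrete, here is a small sketch (my own simplification in Python, not the actual Wikibase code; the names and the precision encoding are assumptions): the decimal value is stored at full resolution, and the separate precision records which display granularity was meant, so a value entered as 52° 20' comes back out as 52° 20' rather than as 52.3333°.

# Sketch: round-tripping a coordinate between DMS input and decimal storage.
# Illustration only; names and structure are not the actual Wikibase data model.
from dataclasses import dataclass

@dataclass
class CoordinateValue:
    degrees: float    # stored decimal value, kept at full resolution
    precision: float  # granularity in degrees, e.g. 1/60 for arcminutes

ARCMINUTE = 1 / 60

def from_dms(deg: int, minutes: int = 0, seconds: float = 0.0) -> CoordinateValue:
    """Parse degrees/minutes/seconds input; precision reflects the finest unit given."""
    value = deg + minutes / 60 + seconds / 3600
    precision = 1 / 3600 if seconds else ARCMINUTE if minutes else 1.0
    return CoordinateValue(value, precision)

def to_dms(c: CoordinateValue) -> str:
    """Render back to DMS, rounding to the stored precision."""
    if c.precision >= 1:
        return f"{round(c.degrees)}°"
    deg = int(c.degrees)
    rem = (c.degrees - deg) * 60
    if c.precision >= ARCMINUTE:
        return f"{deg}° {round(rem)}'"
    minutes = int(rem)
    return f"{deg}° {minutes}' {round((rem - minutes) * 60)}\""

c = from_dms(52, 20)           # entered as 52° 20'
print(c.degrees, c.precision)  # ~52.3333 stored as decimal, precision 1/60
print(to_dms(c))               # renders back as 52° 20'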

On Thu, Aug 31, 2017, 16:17 Stas Malyshev  wrote:

> Hi!
>
> > I am not sure I understand the issue and what the suggestion is to solve
> > it. If we decide to arbitrarily reduce the possible range for the
>
> Well, there are actually several issues right now.
>
> 1. Our RDF output produces coordinates with more digits than the specified
> precision of the actual value.
> 2. Our precision values as specified in wikibase:geoPrecision seem to
> make little sense.
> 3. We may represent the same coordinates for objects located in the same
> place as different ones because of precision values being kinda chaotic.
> 4. We may have different data from other databases because our
> coordinate is over-precise.
>
> (1) is probably easiest to fix. (2) is a bit harder, and I am still not
> sure how wikibase:geoPrecision is used, if at all.
> (3) and (4) are less important, but it would be nice to improve, and
> maybe they will be mostly fixed once (1) and (2) are fixed. But before
> approaching the fix, I wanted to understand what are expectations from
> precision and if there can or should be some limits there. Technically,
> it doesn't matter too much - except that some formulae for distances do
> not work well for high precisions because of limited accuracy of 64-bit
> double, but there are ways around it. So technically we can keep 9
> digits or however many we need, if we wanted to. I just wanted to see if
> we should.
>
> --
> Stas Malyshev
> smalys...@wikimedia.org
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Coordinate precision in Wikidata, RDF & query service

2017-08-31 Thread Denny Vrandečić
The 9-digit precision was based on a survey of Wikipedia we did back then
and the most precise GPS coordinates in Wikipedia. Unfortunately, I don't
remember anymore what article it was - it was some article listing a number
of places that have, due to whatever reason, really high precisions. If
someone finds the article again, I would be thankful. It might help in this
conversation. The 9 digits were not chosen arbitrarily, but based on the
requirements from Wikipedia.

But, this is just the most detailed precision. In most cases, as you
notice, we won't need this high precision, but we will have a much lower
precision. Pinning down a municipality with a 9 digit precision is
obviously nonsense. For most countries, any precision beyond 0 seems quite
nonsensical.

But that's also true for time. The time data model allows precision down
to the second, but obviously, for much of the data that does not make
sense. Nevertheless, the data model supports saving it; we don't want to
lose anything here compared to the base data.

I am not sure I understand the issue and what the suggestion is to solve
it. If we decide to arbitrarily reduce the possible range for the
precision, this still won't lead to any improvements for countries as
compared to statues. As far as I can tell, the only way to actually solve
this is to provide query patterns that take the precision into account and
to have the system implement it correctly.
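
As one possible reading of "query patterns that take the precision into account", here is a small sketch (an assumed semantics, not a specified operator) that treats two coordinates as referring to the same place when they agree within the coarser of their two stated precisions:

# Sketch of a precision-aware equality check for coordinates.
# This is one possible interpretation, not a specified Wikibase operator.

def same_place(lat1, lon1, prec1, lat2, lon2, prec2):
    """True if both coordinates agree within the coarser stated precision (in degrees)."""
    tolerance = max(prec1, prec2)
    return abs(lat1 - lat2) <= tolerance and abs(lon1 - lon2) <= tolerance

# A statue pinned down to 0.0001 degrees versus the same spot recorded with
# fewer digits in another database: exact float comparison fails, the
# precision-aware comparison does not.
print(same_place(48.858222, 2.294500, 0.0001, 48.8582, 2.2945, 0.0001))  # True
print((48.858222, 2.294500) == (48.8582, 2.2945))                        # False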



On Thu, Aug 31, 2017 at 2:38 PM Stas Malyshev 
wrote:

> Hi!
>
> > I think that should be 5 decimals for commercial GPS, per that link?
> > It also suggests that "The sixth decimal place is worth up to 0.11 m:
> > you can use this for laying out structures in detail, for designing
> > landscapes, building roads. It should be more than good enough for
> > tracking movements of glaciers and rivers. This can be achieved by
> > taking painstaking measures with GPS, such as differentially corrected
> > GPS."
> >
>
> This does not seem to be typical (or recommended) use case for Wikidata.
> If you need to build a road, you better have some GIS database beyond
> Wikidata I think :)
>
> > Do we hope to store datasets around glacier movement? It seems
> > possible. (We don't seem to currently
> > https://www.wikidata.org/wiki/Q770424 )
> >
> > I skimmed a few search results, and found 7 (or 15) decimals given in
> > one standard, but the details are beyond my understanding:
>
> Note that there's a difference between what general GIS standard would
> require (which has much more use cases), what we want to store on
> Wikidata and what we want to use for RDF export and querying. The latter
> is of more concern to me - as overprecision there might actually make
> things a bit harder to work with (such as - "are these two things
> actually the same thing?" or "are they located in the same place?") Of
> course, all those problems are solvable, but why not make it easier?
>
> --
> Stas Malyshev
> smalys...@wikimedia.org
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wordnet synset ID

2017-08-21 Thread Denny Vrandečić
I think we could ask either Yago or BabelNet or both whether they would be
receptive to releasing their mappings under a CC0 license, so they can be
integrated into Wikidata. What I wonder is, if they do that, whether we
would want to have that data or not.

On Mon, Aug 21, 2017 at 7:18 AM Peter F. Patel-Schneider <
pfpschnei...@gmail.com> wrote:

> One problem with BabelNet is that its licence is restrictive, being
> the Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
> license.  Downloading BabelNet is even more restrictive, requiring also
> working at a research institution.
>
> Yago
>
> http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/
> , which has the less restrictive license Attribution 3.0 Unported (CC BY
> 3.0),
> has links between Wikipedia categories and Wordnet.  Unfortunately, it does
> not carry these links through to regular Wikipedia pages.   I've been
> toying with making this last connection, which would be easy for those
> categories that are linked to a Wikipedia page.
>
> Peter F. Patel-Schneider
> Nuance Communications
>
> PS:  Strangely the Yago logo has a non-commercial license.  I don't know
> why
> this was done.
>
> On 08/15/2017 10:32 AM, Finn Aarup Nielsen wrote:
> >
> > I do not think we have a Wiktionary-wordnet link.
> >
> > But I forgot to write we have a BabelNet Wikidata property,
> > https://www.wikidata.org/wiki/Property:P2581. This property has been
> very
> > little used: http://tinyurl.com/y8npwsm5
> >
> > There might be a Wikimedia-Wordnet indirect link through BabelNet
> >
> > /Finn
> >
> >
> > On 08/15/2017 07:22 PM, Denny Vrandečić wrote:
> >> That's a great question, I have no idea what the answer will turn out
> to be.
> >>
> >> Is there any current link between Wiktionary and WordNet? Or WordNet and
> >> Wikipedia?
> >>
> >>
> >> On Tue, Aug 15, 2017 at 10:14 AM mailto:f...@imm.dtu.dk>>
> wrote:
> >>
> >>
> >>
> >> I have proposed a Wordnet synset property here:
> >>
> https://www.wikidata.org/wiki/Wikidata:Property_proposal/Wordnet_synset_ID
> >>
> >> The property has been discussed here on the mailing list more than a
> >> year ago, but apparently never got to the point of a property
> >> suggestion:
> >>
> https://lists.wikimedia.org/pipermail/wikidata/2016-April/008517.html
> >>
> >> I am wondering how the potential property fits in with the new
> >> development of the Wiktionary-Wikidata link. As far as I see the
> senses,
> >> for instance, at
> http://wikidata-lexeme.wmflabs.org/index.php/Lexeme:L15
> >> link to wikidata-lexeme Q-items, which I suppose is Wikidata Q items
> >> once the new development is put into the production system. So with
> my
> >> understanding linking Wikidata Q-items to Wordnet synsets is
> correct. Is
> >> my understanding correct?
> >>
> >>
> >> Finn Årup Nielsen
> >> http://people.compute.dtu.dk/faan/
> >>
> >>
> >> ___
> >> Wikidata mailing list
> >> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> >> https://lists.wikimedia.org/mailman/listinfo/wikidata
> >>
> >>
> >>
> >> ___
> >> Wikidata mailing list
> >> Wikidata@lists.wikimedia.org
> >> https://lists.wikimedia.org/mailman/listinfo/wikidata
> >>
> >
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wordnet synset ID

2017-08-15 Thread Denny Vrandečić
Does BabelNet have more links than that? Shall we ask Roberto if they would
provide a mapping that Wikidata can ingest?

On Tue, Aug 15, 2017 at 10:34 AM Finn Aarup Nielsen  wrote:

>
> I do not think we have a Wiktionary-wordnet link.
>
> But I forgot to write we have a BabelNet Wikidata property,
> https://www.wikidata.org/wiki/Property:P2581. This property has been
> very little used: http://tinyurl.com/y8npwsm5
>
> There might be a Wikimedia-Wordnet indirect link through BabelNet
>
> /Finn
>
>
> On 08/15/2017 07:22 PM, Denny Vrandečić wrote:
> > That's a great question, I have no idea what the answer will turn out to
> be.
> >
> > Is there any current link between Wiktionary and WordNet? Or WordNet and
> > Wikipedia?
> >
> >
> > On Tue, Aug 15, 2017 at 10:14 AM mailto:f...@imm.dtu.dk>>
> > wrote:
> >
> >
> >
> > I have proposed a Wordnet synset property here:
> >
> https://www.wikidata.org/wiki/Wikidata:Property_proposal/Wordnet_synset_ID
> >
> > The property has been discussed here on the mailing list more than a
> > year ago, but apparently never got to the point of a property
> > suggestion:
> >
> https://lists.wikimedia.org/pipermail/wikidata/2016-April/008517.html
> >
> > I am wondering how the potential property fits in with the new
> > development of the Wiktionary-Wikidata link. As far as I see the
> senses,
> > for instance, at
> http://wikidata-lexeme.wmflabs.org/index.php/Lexeme:L15
> > link to wikidata-lexeme Q-items, which I suppose is Wikidata Q items
> > once the new development is put into the production system. So with
> my
> > understanding linking Wikidata Q-items to Wordnet synsets is
> correct. Is
> > my understanding correct?
> >
> >
> > Finn Årup Nielsen
> > http://people.compute.dtu.dk/faan/
> >
> >
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
> >
> >
> >
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
> >
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wordnet synset ID

2017-08-15 Thread Denny Vrandečić
That's a great question, I have no idea what the answer will turn out to be.

Is there any current link between Wiktionary and WordNet? Or WordNet and
Wikipedia?


On Tue, Aug 15, 2017 at 10:14 AM  wrote:

>
>
> I have proposed a Wordnet synset property here:
> https://www.wikidata.org/wiki/Wikidata:Property_proposal/Wordnet_synset_ID
>
> The property has been discussed here on the mailing list more than a
> year ago, but apparently never got to the point of a property
> suggestion:
> https://lists.wikimedia.org/pipermail/wikidata/2016-April/008517.html
>
> I am wondering how the potential property fits in with the new
> development of the Wiktionary-Wikidata link. As far as I see the senses,
> for instance, at http://wikidata-lexeme.wmflabs.org/index.php/Lexeme:L15
> link to wikidata-lexeme Q-items, which I suppose is Wikidata Q items
> once the new development is put into the production system. So with my
> understanding linking Wikidata Q-items to Wordnet synsets is correct. Is
> my understanding correct?
>
>
> Finn Årup Nielsen
> http://people.compute.dtu.dk/faan/
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata more prominent in Apple's IOS 11 beta

2017-07-10 Thread Denny Vrandečić
Chris,

thanks. That's cute but ultimately disappointing - I would have preferred
for it to take me to "other people with this death place", which would be
more interesting.

Ah well, thanks for answering,
Denny

On Mon, Jul 10, 2017 at 7:48 AM Chris Koerner  wrote:

> Denny,
>
> Those ">" indicate you can tap on the field to perform another search.
> Tapping on Grover Cleveland's Deathplace of "Princeton" takes you to the
> entry on Princeton.
>
> http://imgur.com/a/w2a1p
>
> Yours,
> Chris Koerner
> clkoerner.com
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata more prominent in Apple's IOS 11 beta

2017-07-07 Thread Denny Vrandečić
What's the > to the far right of some of the values?

On Fri, Jul 7, 2017, 09:34 Chris Koerner  wrote:

> Hi all,
> An interesting discovery I recently made while working with the upcoming
> update to Apple's mobile devices. Siri, the speech recognition/personal
> assistant in the operating system, often responds to questions about many
> things with content from Wikimedia projects.
>
> I have two devices in my household and compared the differences in
> responses to the question, "Who is Grover Cleveland?"
>
> In the current OS, iOS 10: http://imgur.com/3sFUCZY
>
> In the current beta for the next OS, iOS 11: http://imgur.com/Usz8Ryx
>
> The description of the subject is a Wikidata description, and there are
> more fields about the subject. There are 4 visible in iOS 10. If you scroll
> through the results there are 11 in iOS 11.
>
>
> Yours,
> Chris Koerner
> clkoerner.com
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wiki PageID

2017-05-04 Thread Denny Vrandečić
Aren't both ... uhm ... "use cases" supported by dbpedia proper anyway?

On Thu, May 4, 2017 at 3:40 AM Kingsley Idehen 
wrote:

> On 5/3/17 3:37 PM, Nicholas Humfrey wrote:
> >
> > On 26/04/2017, 15:41, "Wikidata on behalf of Kingsley Idehen"
> >  kide...@openlinksw.com>
> > wrote:
> >
> >> Hi Nick,
> >>
> >> Please don't decommission dbpedialite, it does provide utility on other
> >> fronts too :)
> >>
> >
> > Hi Kingsley,
> >
> > Could you elaborate? I was planning on turning dbpedialite into 301
> > redirects to Wikidata for a period of time before switching it off.
> >
> >
> > nick.
> >
> >
> >
> > -
> > http://www.bbc.co.uk
> >
>
> Nick,
>
> dbpedialite provides a "best practices" demonstration for:
>
> 1. Linked Data Deployment -- i.e., it supports both content-negotiation
> and metadata embedded in HTML deployment options
>
> 2. Bridging across DBpedia, Wikidata, and Wikipedia -- this also
> provides value to DBpedia and Wikidata with regards to cross-reference
> reconcilliation.
>
> I believe the items above remain important :)
>
>
> --
> Regards,
>
> Kingsley Idehen
> Founder & CEO
> OpenLink Software   (Home Page: http://www.openlinksw.com)
>
> Weblogs (Blogs):
> Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
> Blogspot Blog: http://kidehen.blogspot.com
> Medium Blog: https://medium.com/@kidehen
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn: http://www.linkedin.com/in/kidehen
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
> :
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Languages in Wikidata4Wiktionary

2017-04-10 Thread Denny Vrandečić
Daniel, I agree, but isn't that what Multilingual Text requires? A language
code?

I.e. how does the current model plan to solve that?

I assume most of it is hidden behind mini-wizards like "Create a new
lexeme", which actually make sure the multitext language and the language
property are consistently set. In that case I can see this work.



On Mon, Apr 10, 2017 at 10:11 AM Daniel Kinzler 
wrote:

> Am 10.04.2017 um 18:56 schrieb Gerard Meijssen:
> > Hoi,
> > The standard for the identification of a language should suffice.
>
> I know no standard that would be sufficient for our use case.
>
> For instance, we not only need identifiers for German, Swiss and Austrian
> German. We also need identifiers for German German before and after the
> spelling
> reform of 1901, and before and ofter the spelling reform of 1996. We will
> also
> need identifiers for the "language" of mathematical notation. And for
> various
> variants of ancient languages: not just Sumerian, but Sumerian from
> different
> regions and periods.
>
> The only system I know that gives us that flexibility is Wikidata. For
> interoperability, we should provide a standard language code (aka subtag).
> But a
> language code alone is not going to be sufficient to distinguish the
> different
> variants we will need.
>
> --
> Daniel Kinzler
> Principal Platform Engineer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Languages in Wikidata4Wiktionary

2017-04-10 Thread Denny Vrandečić
So assume we enter a new Lexeme in Examplarian (which has a Q-Item), but
Examplarian has no language code for whatever reason. What language code
would they enter in the MultilingualTextValue?


On Mon, Apr 10, 2017 at 8:42 AM Daniel Kinzler 
wrote:

> Tobias' comment made me realize that I did not clarify one very important
> distinction: there are two kinds of places where a "language" is needed in
> the
> Lexeme data model
> :
>
> 1) the "lexeme language". This can be any Item, language code or no. This
> is
> what Tobias would have to use in his query.
>
> 2) the language codes used in the MultilingualTextValues (lemma,
> representation,
> and gloss). This is where my "hybrid" approach comes in: use a standard
> language
> code augmented by an item ID to identify the variant.
>
> To make it easy to create new Lexemes, the lexeme language can serve as a
> default for lemma, representation, and gloss - but only if it has a
> language
> code. If it does not have one, the user will have to specify one for use in
> MultilingualTextValues.
>
>
> Am 06.04.2017 um 19:59 schrieb Tobias Schönberg:
> > An example using the second suggestion:
> >
> > If I would like to query all L-items that contain a combination of
> letters and
> > limit those results by getting the Q-items of the language and limit
> those, to
> > those that have Latin influences.
> >
> > In my imagination this would work better using the second suggestion.
> Also the
> > flexibility of "what is a language" and "what is a dialect" would seem
> easier if
> > we can attach statements to the UserLanguageCode or the Q-item of the
> language.
> >
> > -Tobias
>
>
> --
> Daniel Kinzler
> Principal Platform Engineer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Languages in Wikidata4Wiktionary

2017-04-07 Thread Denny Vrandečić
Scott,

I assume you realized that the article by Norvig you cited was rather
intentionally published on April 1st.

Cheers,
Denny

On Fri, Apr 7, 2017 at 11:04 AM Scott MacLeod <
worlduniversityandsch...@gmail.com> wrote:

> I tried to see how the ISO codes and IANA language subtags compare with
> Glottolog's 8,444 entries under languages (
> http://glottolog.org/glottolog/language) and Ethnologue's 7,099 living
> languages (https://www.ethnologue.com/), but couldn't find any
> comparisons or comparative lists.
>
> Will it be possible with these new developments in Wikidata to query for
> these possibilities, and leave the options open for a growing list of
> languages, as well as an universal translator?
>
> And how will invented languages be added, such as Krell, Elvish and
> Klingon (and even other species' languages in emergent interspecies'
> communications), and possibly per OpenNMT (Neural Machine Translation) -
> http://opennmt.net/ (and possibly GNMT); see also Peter Norvig's recent
> article in the regards to OpenNMT and invented languages -
> https://medium.com/@peternorvig/last-tweets-of-the-krell-82b8cb74c320 (and
> per
> http://scott-macleod.blogspot.com/2017/04/falco-peregrinus-smartphone-that-could.html
> ).
>
> Scott
>
>
>
> On Fri, Apr 7, 2017 at 10:13 AM, Daniel Kinzler <
> daniel.kinz...@wikimedia.de> wrote:
>
> Am 07.04.2017 um 01:34 schrieb Denny Vrandečić:
> > I foresee that might be a bit of a problem for external tools consuming
> > this data - how they would figure out what language it is if it doesn't
> > have a code? We could of course generate fake codes like
> > mis-x-q12345, maybe that would work.
> >
> > Q-items for languages already have a property to state their language
> code. It's
> > just an extra hop away.
>
> We want ISO codes (or rather, IANA language subtags [1]), so we can use
> them in
> HTML lang attributes, and in RDF literals. This allows interoperability
> with
> standard tools.
>
> For this reason, I also favor a mixed approach, that allows standard
> language
> tags to be used whenever possible. I have some ideas on how that could
> work, but
> no definite plan yet.
>
> Something like de+Q1980305 could work; when generating HTML or RDF, we'd
> just drop the suffix. For translingual entries (e.g. for the number symbol
> i), we could use e.g. mis+Q1140046.
>
>
> [1]
>
> https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
>
> --
> Daniel Kinzler
> Principal Platform Engineer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
>
>
> --
>
> --
> - Scott MacLeod - Founder & President
> - World University and School
> - http://worlduniversityandschool.org
>
> - 415 480 4577 <(415)%20480-4577>
> - http://scottmacleod.com
>
>
> - CC World University and School - like CC Wikipedia with best
> STEM-centric CC OpenCourseWare - incorporated as a nonprofit university and
> school in California, and is a U.S. 501 (c) (3) tax-exempt educational
> organization.
>
>
> IMPORTANT NOTICE: This transmission and any attachments are intended only
> for the use of the individual or entity to which they are addressed and may
> contain information that is privileged, confidential, or exempt from
> disclosure under applicable federal or state laws.  If the reader of this
> transmission is not the intended recipient, you are hereby notified that
> any use, dissemination, distribution, or copying of this communication is
> strictly prohibited.  If you have received this transmission in error,
> please notify me immediately by email or telephone.
>
> World University and School is sending you this because of your interest
> in free, online, higher education. If you don't want to receive these,
> please reply with 'unsubscribe' in the body of the email, leaving the
> subject line intact. Thank you.
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Languages in Wikidata4Wiktionary

2017-04-06 Thread Denny Vrandečić
On Thu, Apr 6, 2017, 16:16 Stas Malyshev  wrote:

> Hi!
>
> > - use Q-Items instead of UserLanguageCodes for Multilingual texts (which
> > would be quite a migration)
>
> I foresee that might be a bit of a problem for external tools consuming
> this data - how they would figure out what language it is if it doesn't
> have a code? We could of course generate fake codes like
> mis-x-q12345, maybe that would work.
>

Q-items for languages already have a property to state their language code.
It's just an extra hop away.
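
For illustration, a minimal WDQS sketch of that extra hop, assuming P424
("Wikimedia language code") and P220 ("ISO 639-3 code") as the code-bearing
properties and German (Q188) as the example item; standard WDQS prefixes:

  SELECT ?lang ?wikimediaCode ?iso6393 WHERE {
    VALUES ?lang { wd:Q188 }
    OPTIONAL { ?lang wdt:P424 ?wikimediaCode . }
    OPTIONAL { ?lang wdt:P220 ?iso6393 . }
  }

A consumer that only has the language Q-item could resolve a usable code this
way, and fall back to a generated tag like the one above when neither
property is set.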



> > I don't think restricting Wiktionary4Wikidata support to the list of
> > languages with a UserLanguageCode is a viable solution, which would
> > happen if we implement the data model as currently suggested, if I
> > understand it correctly.
>
> Aren't we limiting it right now this way in Wikidata?
>

For labels and descriptions of items yes, and I think that was sensible. It
might be time to revisit that decision though.

But for supporting Wiktionary that would be extremely limiting. French
Wiktionary supports words in more than a thousand languages currently.
Limiting the supported languages of the lemmas is, IMHO, unacceptable.


> --
> Stas Malyshev
> smalys...@wikimedia.org
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Languages in Wikidata4Wiktionary

2017-04-06 Thread Denny Vrandečić
The current spec of the data model states that an L-Item has a lemma, a
language, and several forms, and the forms in turn have representations.

https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/Data_Model

The language is a Q-Item, the lemma and the representations are
Multilingual Texts. Multilingual texts are sets of pairs of strings and
UserLanguageCodes.

My question is about the relation between representing a language as a
Q-Item and as a UserLanguageCode.

A previous proposal treated lemmas and representations as raw strings, with
the language pointing to the Q-Item being the only language information.
This now is gone, and the lemma and representation carry their own language
information.

How do they interact? The language set referenceable through Q-Items is much
larger than the set of languages with a UserLanguageCode, and indeed, the
intention was to allow for every language to be representable in Wikidata,
not only those with a UserLanguageCode.

I sense quite a problem here.

I see two possible ways to resolve this:
- return to the original model and use strings instead of Multilingual
texts (with all the negative implications for variants)
- use Q-Items instead of UserLanguageCodes for Multilingual texts (which
would be quite a migration)

I don't think restricting Wiktionary4Wikidata support to the list of
languages with a UserLanguageCode is a viable solution, which would happen
if we implement the data model as currently suggested, if I understand it
correctly.

Cheers,
Denny
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata ontology

2017-01-09 Thread Denny Vrandečić
I agree with Peter here. Daniel's statement of "Anything that is a subclass
of X, and at the same time an instance of Y, where Y is not "class", is
problematic." is simply too strong. The classical example is Harry the
eagle, and eagle being a species.

The following paper has a much more measured and subtle approach to this
question:

http://snap.stanford.edu/wikiworkshop2016/papers/Wiki_Workshop__WWW_2016_paper_11.pdf


I still think it is potentially and partially too strong, but certainly
much better than Daniel's strict statement.
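
For anyone who wants to look at the actual data, a rough WDQS sketch of the
pattern under discussion - items that are simultaneously an instance of
something (P31) and a subclass of something (P279); the LIMIT just keeps the
query cheap:

  SELECT ?item ?class ?superclass WHERE {
    ?item wdt:P31 ?class ;
          wdt:P279 ?superclass .
  }
  LIMIT 100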



On Mon, Jan 9, 2017 at 7:58 AM Peter F. Patel-Schneider <
pfpschnei...@gmail.com> wrote:

> On 01/09/2017 07:20 AM, Daniel Kinzler wrote:
> > Am 09.01.2017 um 04:36 schrieb Markus Kroetzsch:
> >> Only the "current king of Iberia" is a single person, but Wikidata is
> >> about all of history, so there are many such kings. The office of
> >> "King of Iberia" is still singular (it is a singular class) and it can
> >> have its own properties etc. I would therefore say (without having
> >> checked the page):
> >>
> >> King of Iberia    instance of    office
> >> King of Iberia    subclass of    king
> >
> > To be semantically strict, you would need to have two separate items,
> > one for the office, and one for the class. Because the individual kings
> > have not been instances of the office - they have been holders of the
> > office. And they have been instances of the class, but not holders of
> > the class.
> >
> > On wikidata, we often conflate these things for sake of simplicity. But
> > when you try to write queries, this does not make things simpler, it
> > makes it harder.
> >
> > Anything that is a subclass of X, and at the same time an instance of Y,
> > where Y is not "class", is problematic. I think this is the root of the
> > confusion Gerard speaks of.
>
> There is no a priori reason that an office cannot be a class.  Some
> formalisms
> don't allow this, but there are others that do.  Some sets of rules for
> ontology construction don't allow this, but there are others that do.
> There
> is certainly no universal semantic consideration, even in any strict
> notion of
> semantics, that would require that there be two separate items here.
>
> As far as I can tell, the Wikidata formalism is not one that would disallow
> offices being classes.  As far as I can tell, the rules for constructing
> the
> Wikidata ontology don't disallow it either.
>
> Peter F. Patel-Schneider
> Nuance Communications
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] I'm calling it. We made it ;-)

2017-01-01 Thread Denny Vrandečić
http://i0.kym-cdn.com/entries/icons/original/000/001/899/mission_accomplished.jpg


On Sat, Dec 31, 2016, 02:58 Lydia Pintscher 
wrote:

> Folks,
>
> We're now officially mainstream ;-)
>
> https://www.buzzfeed.com/katiehasty/song-ends-melody-lingers-in-2016?utm_term=.nszJxrKqR#.sknE4nVAg
>
>
> Cheers
> Lydia
>
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] New Status Indicator Icon about Relative Page Completeness

2016-11-15 Thread Denny Vrandečić
*not

On Tue, Nov 15, 2016, 10:06 Denny Vrandečić  wrote:

> Do you make sure but to request the place of death or date of death on
> living people? I.e. can we filter certain properties?
>
> On Tue, Nov 15, 2016, 07:20 Simon Razniewski  wrote:
>
> On November 15, 2016, csara...@uni-koblenz.de wrote
>
> It would be very interesting to see the status of the interlinking to
> external sources through Recoin. I guess that the claims related to the
> interlinking (e.g. authority control properties
>
> <
> https://www.wikidata.org/wiki/Wikidata:List_of_properties/Generic#Authority_control
> >)
>
>
> will already appear in the explanations of Recoin, but it might be
> useful to show the status of the completeness of the interlinking
> separately. Maybe as a separate analogous indicator, or in the same
> indicator but with separate explanations.
>
>
> Hi Cristina,
> In fact, our focus was first on the "regular" properties, thus,
> unfortunately, we are filtering out exactly the ID properties at the moment.
> But it is no technical problem to set up a second indicator specifically
> for authority control properties, if that is of interest we can think about
> that. Would the same approach as for the facts, i.e., computing the color
> using a count-based comparison with what other, similar entities have,
> make sense for claims about interlinking?
>
>
> Is the source code of Recoin available?
>
>
> It should become available, in fact, it is a pretty straightforward
> count-based comparison with the facts that entities of the same occupation
> have, I can send you the code immediately if you want, though
> albin.ahm...@gmail.com could tell you more about the technical setup.
>
> Cheers,
> Simon
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] New Status Indicator Icon about Relative Page Completeness

2016-11-15 Thread Denny Vrandečić
Do you make sure but to request the place of death or date of death on
living people? I.e. can we filter certain properties?
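
As a rough sketch - not Recoin's actual code - of the count-based comparison
Simon describes below: which properties items of one occupation typically
carry, with date of death (P570) and place of death (P20) filtered out as an
example of the kind of exclusion asked about above. Politician (Q82955) is
just a placeholder occupation, and the query may well time out on the full
data:

  SELECT ?property (COUNT(DISTINCT ?person) AS ?uses) WHERE {
    ?person wdt:P106 wd:Q82955 ;
            ?p ?value .
    ?property wikibase:directClaim ?p .
    FILTER(?property NOT IN (wd:P570, wd:P20))
  }
  GROUP BY ?property
  ORDER BY DESC(?uses)
  LIMIT 50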

On Tue, Nov 15, 2016, 07:20 Simon Razniewski  wrote:

> On November 15, 2016, csara...@uni-koblenz.de wrote
>
> It would be very interesting to see the status of the interlinking to
> external sources through Recoin. I guess that the claims related to the
> interlinking (e.g. authority control properties
>
> <
> https://www.wikidata.org/wiki/Wikidata:List_of_properties/Generic#Authority_control
> >)
>
>
> will already appear in the explanations of Recoin, but it might be
> useful to show the status of the completeness of the interlinking
> separately. Maybe as a separate analogous indicator, or in the same
> indicator but with separate explanations.
>
>
> Hi Cristina,
> In fact, our focus was first on the "regular" properties, thus,
> unfortunately, we are filtering out exactly the ID properties at the moment.
> But it is no technical problem to set up a second indicator specifically
> for authority control properties, if that is of interest we can think about
> that. Would the same approach as for the facts, i.e., computing the color
> using a count-based comparison with what other, similar entities have,
> make sense for claims about interlinking?
>
>
> Is the source code of Recoin available?
>
>
> It should become available, in fact, it is a pretty straightforward
> count-based comparison with the facts that entities of the same occupation
> have, I can send you the code immediately if you want, though
> albin.ahm...@gmail.com could tell you more about the technical setup.
>
> Cheers,
> Simon
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikimedia takes part in Google Code-in. Which small Wikidata tasks do you plan to mentor?

2016-11-07 Thread Denny Vrandečić
Would "create sets of Anki flash cards from Wikidata queries" be a viable
task?
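
To make the idea a bit more concrete, a hypothetical example of the kind of
query such a task could start from - country/capital pairs as card front and
back (P31 = instance of, Q6256 = country, P36 = capital):

  SELECT ?countryLabel ?capitalLabel WHERE {
    ?country wdt:P31 wd:Q6256 ;
             wdt:P36 ?capital .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
  }

The student's part would then mostly be turning result rows like these into
an Anki deck.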

On Mon, Nov 7, 2016 at 12:40 PM AMIT KUMAR JAISWAL 
wrote:

> Hey Andre,
>
> Glad to hear that! Thanks for letting us know about this great news.
>
> I'm interested to mentor for Google Code-in this year. I just added my
> name over :[Wikimedia
> Mentors](
> https://www.mediawiki.org/wiki/Google_Code-in_2016#Contacting_Wikimedia_mentors
> ).
> Please do invite me so that I can mentor Google Code-in this year.
>
> I'll be adding "Google-Code-In-2016" tag very soon.
>
> Looking forward to hear from you.
>
> Regards
> Amit Kumar Jaiswal
>
> On 11/8/16, Andre Klapper  wrote:
> > Wikimedia is among the 17 organizations in Google Code-in (GCI) 2016!
> > GCI starts on November 28th. It's a contest for 13-17 year old students
> > working on small tasks and a great opportunity to let new contributors
> > make progress and help with smaller tasks on your To-Do list!
> >
> > There are currently 23 open Wikidata tasks marked as easy:
> > https://phabricator.wikimedia.org/maniphest/query/zH.SGfhpQyC4/#R
> > (and a good bunch of them is already marked for GCI, thanks!)
> >
> > What we want you to do:
> >
> > BECOME A MENTOR:
> >
> > 1. Go to https://www.mediawiki.org/wiki/Google_Code-in_2016 and add
> > yourself to the mentor's table.
> > 2. Get an invitation email to register on the contest site.
> >
> > PROVIDE SMALL TASKS:
> >
> > We want your tasks in the following areas: code, outreach/research,
> > documentation/training, quality assurance, user interface/design.
> >
> > 1. Create a Phabricator task (which would take you 2-3h to complete) or
> > pick an existing Phabricator task you'd mentor.
> > 2. Add the "Google-Code-In-2016" project tag.
> > 3. Add a comment "I will mentor this in #GCI2016".
> >
> > Looking for task ideas? Check the "easy" tasks in Phabricator:
> > https://www.mediawiki.org/wiki/Annoying_little_bugs offers links.
> >
> > Make sure to cover expectations and deliverables in your task.
> > And once the contest starts on Nov 28, be ready to answer and review
> > contributions quickly.
> >
> > Any questions? Just ask, we're happy to help.
> >
> > Thank you for your help broadening our contributor base!
> > andre
> >
> > --
> > Andre Klapper | Wikimedia Bugwrangler
> > http://blogs.gnome.org/aklapper/
> >
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
> >
>
>
> --
> Amit Kumar Jaiswal
> [Mozilla Reps](reps.mozilla.org/u/amitkumarj441)
> [Fedora Contributor](https://fedoraproject.org/wiki/User:Amitkumarjaiswal)
> Kanpur | Uttar Pradesh | India
> Contact No.- +91-8081187743 <+91%2080811%2087743>
> [Web](http://amitkumarj441.github.io) | Twitter: @amit_gkp
> LinkedIn: http://in.linkedin.com/in/amitkumarjaiswal1
> PGP Key: EBE7 39F0 0427 4A2C
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Master thesis

2016-11-05 Thread Denny Vrandečić
Do you have already a few ideas? Just for brainstorming.

On Sat, Nov 5, 2016, 05:26 Nazanin Kamali 
wrote:

> Dear all,
>
> My name is Nazanin and I am a Wikipedia editor. I studied statistics in my
> undergraduate program and recently I have begun my master program also in
> statistics. According to my interests in Wikimedia projects, I would like
> to choose a Wikidata-related project as my master thesis. I have already
> worked on descriptive statistics and I look forward to working on Monte
> Carlo methods for forecasting, simulation and modelling. How can I help in
> Wikidata and how could we have a productive cooperation?
>
> All the best,
> Nazanin
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Psst! I am preparing a secret present...

2016-10-28 Thread Denny Vrandečić
... and maybe you want to help?

I have prepared a little present for the upcoming 4th birthday of Wikidata.
But in order to make it really great, it needs your input.

I won't tell you what it is in this email. It will be announced next week.
But if you want to contribute and make it a great launch, go to

https://www.wikidata.org/wiki/Wikidata:Everything_is_connected

and learn more. If you want to be surprised, then forget about this email :)

Let's not spread it yet widely on them social media sites and similar
things, but let's polish the whole thing for a week in the open :)

Q167545,
Denny
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikimedia Grants: Every language in the world

2016-10-17 Thread Denny Vrandečić
Yes, I do. I will write you privately.

On Sun, Oct 16, 2016 at 11:29 PM Daniel Bogre Udell 
wrote:

> Hi Denny,
>
> ​Thanks for that recommendation- I'll reach out to the Wikidata
> development team today. That being said, I'd still love to find a time to
> hop on a call and learn more about your role in the project, and to share
> more about what we're working on.
>
> Will you have any availability this week?
>
> Warmly,
> Daniel
>
> On Friday, October 14, 2016, Denny Vrandečić  wrote:
>
> Someone from the Wikidata development team would be better informed about
> timelines and development. If no one has time, I can offer a view into my
> understanding of the data model, but since I'm half a world away from the
> development there would be an obvious caveat.
>
> On Thu, Oct 13, 2016, 21:45 Daniel Bogre Udell 
> wrote:
>
> Hi Denny,
>
> Thank you so much for sending this over. A closer alignment between
> Wiktionary and Wikidata is very exciting, and a project that Poly has great
> potential to support. If you're close to this effort, I would love the
> opportunity to have a quick call and tell you more. Will you be
> available this coming week, between 10/17 and 10/21?
>
> Warmly,
> Daniel
>
> Daniel Bogre Udell
> Director
>
> *Wikitongues*
> www.wikitongues.org
> Every language in the world
>
> +1 (917) 975 1410
> @dbudell
>
> On Thu, Oct 13, 2016 at 11:50 PM, Denny Vrandečić 
> wrote:
>
> Hi Daniel,
>
> good luck with the proposal! Did you take a look at Wikidata's proposal to
> support Wiktionary?
>
> https://www.wikidata.org/wiki/Wikidata:Wiktionary
>
> Cheers,
> Denny
>
>
> On Thu, Oct 13, 2016 at 8:19 PM Daniel Bogre Udell 
> wrote:
>
> Hello, Wikidata community!
>
> My name is Daniel Bogre Udell and I'm a co-founder at Wikitongues, a
> non-profit organization and international volunteer community dedicated to
> defending linguistic diversity. We're building the world's first public
> archive of every language in the world, and counting some very enthusiastic
> Wikimedians among our ranks, we're excited to announce that we've submitted
> a proposal
> <https://meta.wikimedia.org/wiki/Grants:Project/Wikitongues_Poly_Feature_Set_1>
>  for
> this upcoming round of Wikimedia Project Grants to support Poly, an open
> source platform for sharing and learning languages.
>
> Poly stands to improve the language content for both Wiktionary and
> Wikivoyages by creating a broader network of language content aggregation.
> Through it, the breadth of these projects will be expanded and their
> language accuracy improved. Furthermore, in gaining access to new language
> communities working with Wikitongues, Wikipedia stands to benefit from the
> incubation of new language editions.
>
> We're eager for community feedback, so if you think there are more points
> of alignment between Poly and Wikidata, please let us know in the project's 
> discussion
> section
> <https://meta.wikimedia.org/wiki/Grants_talk:Project/Wikitongues_Poly_Feature_Set_1>
> .
>
> Finally, if you believe this project to be worthwhile, we would be honored
> to have your endorsement, which you can enter in the project's Endorsements
> section
> <https://meta.wikimedia.org/wiki/Grants:Project/Wikitongues_Poly_Feature_Set_1#Endorsements>.
> We would also greatly appreciate it if you can spread the word among your
> fellow Wikimedians.
>
> Thank you very much!
>
> Warmly,
>
> Daniel Bogre Udell
> Director
>
> *Wikitongues*
> www.wikitongues.org
> Every language in the world
>
> +1 (917) 975 1410 <(917)%20975-1410>
> @dbudell
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikimedia Grants: Every language in the world

2016-10-14 Thread Denny Vrandečić
Someone from the Wikidata development team would be better informed about
timelines and development. If no one has time, I can offer a view into my
understanding of the data model, but since I'm half a world away from the
development there would be an obvious caveat.

On Thu, Oct 13, 2016, 21:45 Daniel Bogre Udell 
wrote:

> Hi Denny,
>
> Thank you so much for sending this over. A closer alignment between
> Wiktionary and Wikidata is very exciting, and a project that Poly has great
> potential to support. If you're close to this effort, I would love the
> opportunity to have a quick call and tell you more. Will you be
> available this coming week, between 10/17 and 10/21?
>
> Warmly,
> Daniel
>
> Daniel Bogre Udell
> Director
>
> *Wikitongues*
> www.wikitongues.org
> Every language in the world
>
> +1 (917) 975 1410
> @dbudell
>
> On Thu, Oct 13, 2016 at 11:50 PM, Denny Vrandečić 
> wrote:
>
> Hi Daniel,
>
> good luck with the proposal! Did you take a look at Wikidata's proposal to
> support Wiktionary?
>
> https://www.wikidata.org/wiki/Wikidata:Wiktionary
>
> Cheers,
> Denny
>
>
> On Thu, Oct 13, 2016 at 8:19 PM Daniel Bogre Udell 
> wrote:
>
> Hello, Wikidata community!
>
> My name is Daniel Bogre Udell and I'm a co-founder at Wikitongues, a
> non-profit organization and international volunteer community dedicated to
> defending linguistic diversity. We're building the world's first public
> archive of every language in the world, and counting some very enthusiastic
> Wikimedians among our ranks, we're excited to announce that we've submitted
> a proposal
> <https://meta.wikimedia.org/wiki/Grants:Project/Wikitongues_Poly_Feature_Set_1>
>  for
> this upcoming round of Wikimedia Project Grants to support Poly, an open
> source platform for sharing and learning languages.
>
> Poly stands to improve the language content for both Wiktionary and
> Wikivoyages by creating a broader network of language content aggregation.
> Through it, the breadth of these projects will be expanded and their
> language accuracy improved. Furthermore, in gaining access to new language
> communities working with Wikitongues, Wikipedia stands to benefit from the
> incubation of new language editions.
>
> We're eager for community feedback, so if you think there are more points
> of alignment between Poly and Wikidata, please let us know in the project's 
> discussion
> section
> <https://meta.wikimedia.org/wiki/Grants_talk:Project/Wikitongues_Poly_Feature_Set_1>
> .
>
> Finally, if you believe this project to be worthwhile, we would be honored
> to have your endorsement, which you can enter in the project's Endorsements
> section
> <https://meta.wikimedia.org/wiki/Grants:Project/Wikitongues_Poly_Feature_Set_1#Endorsements>.
> We would also greatly appreciate it if you can spread the word among your
> fellow Wikimedians.
>
> Thank you very much!
>
> Warmly,
>
> Daniel Bogre Udell
> Director
>
> *Wikitongues*
> www.wikitongues.org
> Every language in the world
>
> +1 (917) 975 1410 <(917)%20975-1410>
> @dbudell
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikimedia Grants: Every language in the world

2016-10-13 Thread Denny Vrandečić
Hi Daniel,

good luck with the proposal! Did you take a look at Wikidata's proposal to
support Wiktionary?

https://www.wikidata.org/wiki/Wikidata:Wiktionary

Cheers,
Denny


On Thu, Oct 13, 2016 at 8:19 PM Daniel Bogre Udell 
wrote:

> Hello, Wikidata community!
>
> My name is Daniel Bogre Udell and I'm a co-founder at Wikitongues, a
> non-profit organization and international volunteer community dedicated to
> defending linguistic diversity. We're building the world's first public
> archive of every language in the world, and counting some very enthusiastic
> Wikimedians among our ranks, we're excited to announce that we've submitted
> a proposal
> 
>  for
> this upcoming round of Wikimedia Project Grants to support Poly, an open
> source platform for sharing and learning languages.
>
> Poly stands to improve the language content for both Wiktionary and
> Wikivoyages by creating a broader network of language content aggregation.
> Through it, the breadth of these projects will be expanded and their
> language accuracy improved. Furthermore, in gaining access to new language
> communities working with Wikitongues, Wikipedia stands to benefit from the
> incubation of new language editions.
>
> We're eager for community feedback, so if you think there are more points
> of alignment between Poly and Wikidata, please let us know in the project's 
> discussion
> section
> 
> .
>
> Finally, if you believe this project to be worthwhile, we would be honored
> to have your endorsement, which you can enter in the project's Endorsements
> section
> .
> We would also greatly appreciate it if you can spread the word among your
> fellow Wikimedians.
>
> Thank you very much!
>
> Warmly,
>
> Daniel Bogre Udell
> Director
>
> *Wikitongues*
> www.wikitongues.org
> Every language in the world
>
> +1 (917) 975 1410 <(917)%20975-1410>
> @dbudell
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Query Help - Stock Symbols and Wikipedia Pages

2016-10-13 Thread Denny Vrandečić
As far as I can tell on https://www.wikidata.org/wiki/Q156578 VW indeed has
three different stock symbols.
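
For reference, a hedged sketch of the two patterns Stas describes below -
P249 (ticker symbol) used directly, and P249 as a qualifier on P414 (stock
exchange), which is how Q156578 models it:

  SELECT ?item ?exchange ?ticker WHERE {
    { ?item wdt:P249 ?ticker . }
    UNION
    { ?item p:P414 ?statement .
      ?statement ps:P414 ?exchange ;
                 pq:P249 ?ticker . }
  }
  LIMIT 200

In the first branch ?exchange stays unbound, so keeping it in the projection
also shows which rows came from the qualifier pattern.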

On Thu, Oct 13, 2016 at 1:34 PM Hampton Snowball 
wrote:

> Hi Stas - Thank you so much for your response! It seems the difference
> between 1 and 2 is that for the same company name, there are multiple
> symbols?
>
> In terms of labeling the symbols - it'd be nice to have that label
> information (nyse) in another column, however I can get away with this as
> is so thank you for your help!
>
>
> On Thu, Oct 13, 2016 at 3:50 PM, Stas Malyshev 
> wrote:
>
> Hi!
>
> > There are only 73 entries (including some duplicates) showing up for a
> > straight P249 search, which is a bit strange in any case -
> >
> >
> https://query.wikidata.org/#SELECT%20%3Fitem%20%20%3Fstock%20WHERE%20%7B%0A%20%20%3Fitem%20wdt%3AP249%20%3Fstock%20.%0A%20%7D
> >
> > Sampling a few items suggests that most businesses don't use P249
> > directly, instead they use it as a qualifier on P414, eg
> > https://www.wikidata.org/wiki/Q156578 or
> > https://www.wikidata.org/wiki/Q156238
>
> You can use something like this: http://tinyurl.com/hdoqrjt
>
> The key thing there is UNION - that's how you do "this or that".
>
> Note though ticker symbols are not unique (they are per exchange) so you
> may want to either group: http://tinyurl.com/jmbrf4p or add stock
> exchange name to it or do something else, depending on your case.
>
> --
> Stas Malyshev
> smalys...@wikimedia.org
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Using wikibase for creating structured vocabularies collaboratively

2016-10-12 Thread Denny Vrandečić
I don't know if Jakob Voss is on this list, but he had a recent paper on
using Wikidata (not Wikibase) for NKOS terminology.

On Wed, Oct 12, 2016 at 1:32 PM Gregor Hagedorn <
gregor.haged...@mfn-berlin.de> wrote:

> Here some old pointers to our TDWG - biodiversity - ViBRANT work from 2013:
> http://www.gbif.org/resource/80862
> https://mbgserv18.mobot.org/ocs/index.php/tdwg/2013/paper/view/545
> creating this SMW wiki:
> http://terms.tdwg.org/wiki/
> (with a SKOS vocab management system)
>
> My own assessment: there are serious limitations in SMW, which can be
> worked with (e.g. we use an in-wiki xslt script to post-process), but
> having a more powerful system like wikidata would be most welcome.
>
> However, to my present knowledge, the wikidata/wikibase based system would
> have to have a custom programmed user interface to make it usable,
> something the great semantic form extensions in SMW already make possible.
>
> Best
>
> gregor
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Using wikibase for creating structured vocabularies collaboratively

2016-10-12 Thread Denny Vrandečić
No, wouldn't know of any. Wikibase is still rather new in this field, and
to the best of my knowledge there has been no evaluation of the two for the
task of collaborative vocabulary management.

On Wed, Oct 12, 2016 at 1:20 PM Claudia Müller-Birn 
wrote:

> Dear Denny,
>
> Thanks for the pointers, some of them are surprisingly new to me. However,
> I was wondering if there are articles that especially compared Wikibase and
> Semantic MediaWiki. Any idea?
>
> Many thanks.
>
> Claudia
>
> > On 12 Oct 2016, at 22:05, Denny Vrandečić  wrote:
> >
> > SMW has been used successfully in the past for such projects, indeed has
> been developed with that in mind. A few papers here:
> >
> > http://www.semantic-cora.org/index.php/Publications_and_Talks
> >
> > http://www.aifb.kit.edu/web/Frank_Dengler/Publikationen/en
> >
> >
> https://scholar.google.com/citations?view_op=view_citation&hl=en&user=PdsmmEkJ&citation_for_view=PdsmmEkJ:IjCSPb-OGe4C
> >
> https://scholar.google.com/citations?view_op=view_citation&hl=en&user=PdsmmEkJ&citation_for_view=PdsmmEkJ:2osOgNQ5qMEC
> >
> https://scholar.google.com/citations?view_op=view_citation&hl=en&user=PdsmmEkJ&citation_for_view=PdsmmEkJ:Tyk-4Ss8FVUC
> >
> > https://km.aifb.kit.edu/ws/ckc2007/accepted.htm#papers
> >
> > But these are just a few pointers, and I am sure it is quite out of date
> > by now.
> >
> > On Wed, Oct 12, 2016 at 12:54 PM Federico Leva (Nemo) <
> nemow...@gmail.com> wrote:
> > Claudia Müller-Birn, 12/10/2016 21:11:
> > > wonder if there are other projects than Wikidata that are using
> Wikibase for structuring their data
> >
> > http://www.eagle-network.eu/wiki/index.php/Main_Page , as documented on
> > http://wikiba.se/projects/ . You can ask on this list if you have
> questions.
> >
> > Nemo
> >
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Using wikibase for creating structured vocabularies collaboratively

2016-10-12 Thread Denny Vrandečić
SMW has been used successfully in the past for such projects, indeed has
been developed with that in mind. A few papers here:

http://www.semantic-cora.org/index.php/Publications_and_Talks

http://www.aifb.kit.edu/web/Frank_Dengler/Publikationen/en

https://scholar.google.com/citations?view_op=view_citation&hl=en&user=PdsmmEkJ&citation_for_view=PdsmmEkJ:IjCSPb-OGe4C

https://scholar.google.com/citations?view_op=view_citation&hl=en&user=PdsmmEkJ&citation_for_view=PdsmmEkJ:2osOgNQ5qMEC

https://scholar.google.com/citations?view_op=view_citation&hl=en&user=PdsmmEkJ&citation_for_view=PdsmmEkJ:Tyk-4Ss8FVUC


https://km.aifb.kit.edu/ws/ckc2007/accepted.htm#papers

But these are just a few pointers, and I am sure it is quite out of date by
now.

On Wed, Oct 12, 2016 at 12:54 PM Federico Leva (Nemo) 
wrote:

> Claudia Müller-Birn, 12/10/2016 21:11:
> > wonder if there are other projects than Wikidata that are using Wikibase
> for structuring their data
>
> http://www.eagle-network.eu/wiki/index.php/Main_Page , as documented on
> http://wikiba.se/projects/ . You can ask on this list if you have
> questions.
>
> Nemo
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Changing the datatype of a property in test.wikidata.org

2016-10-11 Thread Denny Vrandečić
Properties don't have an identity besides their ID - so you could in
test.wd just rename the subclass of property and reclaim that name.

On Tue, Oct 11, 2016 at 1:19 PM Loic Dachary  wrote:

> Hi,
>
> I'd like to run integration tests using test.wikidata.org[1] and the
> "subclass of" property. Unfortunately it already exist ... with a datatype
> that is different from what wikidata.org has (wikibase-item)[2]. It does
> not seem to be possible to change that via the API[3]. Deleting a property
> does not seem possible either. I asked a month ago in the thread "Deleting
> properties / items in test.wikidata.org".
>
> It's not an absolute blocker: I can add code so that "subclass of" is a
> random string instead when test.wikidata.org is used. But before doing
> that and making integration tests more difficult to debug and implement,
> I'd like to be sure I'm not missing something.
>
> How do you suggest I deal with this problem ?
>
> Cheers
>
> [1]
> https://phabricator.wikimedia.org/diffusion/PBFB/browse/master/tests/test_repository.py
> [2] https://test.wikidata.org/wiki/Property:P748
> [3] https://www.wikidata.org/w/api.php?action=help&modules=wbeditentity
> --
> Loïc Dachary, Artisan Logiciel Libre
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata weekly summary #230

2016-10-10 Thread Denny Vrandečić
Thank you!

On Mon, Oct 10, 2016 at 1:10 PM Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:

> On Mon, Oct 10, 2016 at 7:43 PM, Denny Vrandečić 
> wrote:
> > Thanks!
> >
> > How do I read the WDQS report that is linked? I only see the link to the
> rmd
> > source.
> >
> >
> https://github.com/wikimedia-research/Discovery-WDQS-Adhoc-Usage/blob/master/Report.Rmd
>
> Looks like this one is better:
>
> https://github.com/wikimedia-research/Discovery-WDQS-Adhoc-Usage/blob/master/Report.md
> Sorry I just copied the link from a list of reports on the query
> service page on mediawiki iirc.
>
>
> Cheers
> Lydia
>
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata weekly summary #230

2016-10-10 Thread Denny Vrandečić
Thanks!

How do I read the WDQS report that is linked? I only see the link to the
rmd source.

https://github.com/wikimedia-research/Discovery-WDQS-Adhoc-Usage/blob/master/Report.Rmd


On Mon, Oct 10, 2016 at 9:13 AM Léa Lacroix 
wrote:

>
>
> *Hello all, Here's your quick overview of what has been happening around
> Wikidata over the last week.*
>
> Events /Press/Blogs
> 
>
>- During the British wildlife edit-a-thon 2016
>
> ,
>attendees added bird sounds from Europeana Sounds to Wikidata and over 60
>Wikipedias
>- Past: WikiConference in San Diego, USA
>
>- Upcoming: Wikidata workshop
> by Wikimedia
>Nederland, October 20, Utrecht
>- by Amical Wikimedia (ca)
>- A Natural Language Query Engine without Machine Learning
>,
>on A Young Programmer's blog
>- The Wikimedia Foundation will now directly fund basic expenses for
>Wikidata software development
>,
>on WMF blog
>- People buried on cemetery and if they have a picture of the grave or
>not
>
> 
>by Magnus Sälgö
>- Sunday Query : où meurent les Bretons ?
> (fr) by
>Ash_Crow
>- Charts and data about Brexit & US Elections
>
> ,
>by Hector Perez
>
> Other Noteworthy Stuff
>
>- Proposition
>
>for upgrading the default copyright license for Wikimedia projects to
>CC-by-SA 4.0 (does not affect the structured data part of Wikidata, which
>uses CC0).
>- 3 Wikidata-related projects will be funded by WMF grants :
>Librarybase
>
> ,
>Lua module
>
>and WikiFactMine
>
>- Query service analysis: What kind of things are people doing with
>WDQS?
>
> 
>and Who are our WDQS users and where are they from?
>
> 
>- chemical compounds in Wikipedia and Wikidata
>,
>talk by Sebastian Burgstaller
>- PetScan tool  past half a
>million queries since April (source
>)
>
> Did you know?
>
>- Newest properties
>: KANTL member ID
>, Angel List ID
>, storyboard artist
>, content deliverer
>, Actorenregister ID
>, Zeri image ID
>, compulsory education
>(maximum age) , compulsory
>education (minimum age) ,
>Fotografen.nl ID , PORT
>organization URL , Flickr
>user ID , LocFDD ID
>, MySpace ID
>, radix
>, base
>, has anatomical branch
>, anatomical branch of
>, points awarded
>, intangible cultural
>heritage status , LiveJournal
>ID , Queensland place ID
>, Cave E-Cadastre ID
>, Property pr

Re: [Wikidata] [Wikimedia-l] GPS data shift

2016-10-07 Thread Denny Vrandečić
Uh, I leave the details to someone who knows better :) - it is a while
since I checked, and it might indeed be underspecified right now.

To the best of my knowledge, there is only one widely used coordinate
system for each of Mars and Titan. I might be wrong. But in the worst case we
would need to specify the default system for either.

I am not saying that the whole thing is not a problem - I am just saying
that the data model, as spec'ed and implemented, has a space for solving
it. It is obvious that without support in the UI the whole thing is
slightly moot anyway.



On Fri, Oct 7, 2016 at 2:23 PM Jan Macura  wrote:

>
> 2016-10-07 20:34 GMT+02:00 Denny Vrandečić :
>
> Wikidata allows to set a coordinate system - it is called a globe or
> coordinate system - on every coordinate. This would be the natural place to
> specify whether it is WGS84 or GDA94 or another system. Most of them are
> Q2, which, as per data model, is indeed WGS84
>
>
> Hi Denny,
>
> can you be more specific about this? So when there is no explicit value in
> the *globe* parametre of GlobeCoordinate, then it is treated as Q2 (this
> corelates with the dumps and every RDF serialization)? It would imply
> geographic coordinates (not the same as WGS84!!). Or is it considered to be
> specifically WGS84, which is Q11902211?
> And how you tell the coordinate system for other celestial bodies like
> Q111 (Mars) or Q2565 (Titan)?
>
> Thanks a lot
>  Jan
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] [Wikimedia-l] GPS data shift

2016-10-07 Thread Denny Vrandečić
Wikidata allows to set a coordinate system - it is called a globe or
coordinate system - on every coordinate. This would be the natural place to
specify whether it is WGS84 or GDA94 or another system. Most of them are
Q2, which, as per data model, is indeed WGS84.

https://www.mediawiki.org/wiki/Wikibase/DataModel#Geographic_locations

Unfortunately this is currently not being displayed or edited in the UI,
but the backend has the data. In theory.
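
For anyone who wants to check that backend data, a minimal WDQS sketch,
relying on the RDF mapping of the full coordinate value: psv:P625 exposes the
value node, and wikibase:geoGlobe carries the globe, with Q2 (Earth) as the
default:

  SELECT ?item ?lat ?lon ?globe WHERE {
    ?item p:P625/psv:P625 ?valueNode .
    ?valueNode wikibase:geoLatitude ?lat ;
               wikibase:geoLongitude ?lon ;
               wikibase:geoGlobe ?globe .
    FILTER(?globe != wd:Q2)   # e.g. coordinates on Mars (Q111) or Titan (Q2565)
  }
  LIMIT 100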



On Tue, Oct 4, 2016 at 10:17 PM Sam Klein  wrote:

> On Tue, Oct 4, 2016 at 2:01 AM, Joseph Seddon 
> wrote:
>
> > currently there is no clear indication within Wikipedia articles
> > and as far as I can tell within Wikidata as to both what *datum* and what
> > *version* any particular coordinate relates to, there is no guarantee that
> > any particular coordinate would be any more correct than it was before.
> >
>
> This definitely should be fixed on the wikidata side.  Whether article
> editors are savvy enough to know and enter this data is another question;
> but at least the geotemplates should have fields for it and you can assume
> that if those are empty some {person/bot hybrid} that understands that
> nuance should fill them in.
>
> ~S
> ___
> Wikimedia-l mailing list, guidelines at:
> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
> New messages to: wikimedi...@lists.wikimedia.org
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> 
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Help with SPARQL or API or something to get subcategories

2016-10-06 Thread Denny Vrandečić
DBpedia has the category data in RDF. The last release of DBpedia also
included Wikidata - it should be possible to query the combined dataset
there.
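
As a rough sketch against the DBpedia endpoint (not WDQS), assuming the usual
category modelling there - categories are SKOS concepts, and a subcategory
points to its parent via skos:broader; Category:Parking is taken from the API
example quoted below:

  PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
  PREFIX dbc:  <http://dbpedia.org/resource/Category:>
  SELECT ?subcat WHERE {
    ?subcat skos:broader dbc:Parking .
  }

Mapping the results back to Q-ids would then be a second step over the
combined DBpedia-Wikidata data mentioned above.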

On Thu, Oct 6, 2016, 19:47 Thad Guidry  wrote:

> Cool.  Yeap, you got the idea now.
>
> OK, We'll stay tuned for a future new Service !
>
> Schema.org and Google and the World thank you Stas :)
>
>
> On Thu, Oct 6, 2016 at 8:39 PM Stas Malyshev 
> wrote:
>
> Hi!
>
> > This is what I was thinking would have already been integrated into
> > Wikidata somehow, somewhere, ideally as a SERVICE to call in SPARQL if
> > there were any Category Labels , maybe another serviceParam for the
> > label Service that could take another serviceParam to retrieve the
> > subcategories ?
>
> Right now, we do not support calls out to external services, due to
> security issues it may produce. However, making limited API to
> specifically mediawiki API may be possible, it's actually an idea we
> haven't considered before and may be possible to do.
> Needs some thinking though, so don't expect it to be done by next week
> :) but interesting, I'll look into it.
>
> We probably will need to use generator since we'd need Wikidata IDs, but
> something like this:
>
>
> /w/api.php?action=query&format=json&prop=pageprops&generator=categorymembers&ppprop=wikibase_item&gcmtitle=Category%3AParking&gcmprop=ids%7Ctitle&gcmtype=subcat
>
> should work. It needs some code to be able to properly build and consume
> queries, but not impossible.
>
> --
> Stas Malyshev
> smalys...@wikimedia.org
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] SPARQL power users and developers

2016-09-30 Thread Denny Vrandečić
Markus, do you have access to the corresponding HTTP request logs? The
fields there might be helpful (although I might be overly optimistic about
it).
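
To make the tagging idea concrete, a minimal illustration of the identifying
comment Markus suggests further down in this thread - the tool name is just a
placeholder:

  #TOOL: example-tool 0.1
  SELECT ?item WHERE { ?item wdt:P31 wd:Q5 . } LIMIT 10

The comment travels with the query string, which is what would make this kind
of attribution visible in the logs.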

On Fri, Sep 30, 2016 at 11:38 AM Yuri Astrakhan 
wrote:

> I guess I qualify for #2 several times:
> * The  &  support access to the geoshapes service,
> which in turn can make requests to WDQS. For example, see
> https://en.wikipedia.org/wiki/User:Yurik/maplink  (click on "governor's
> link")
>
> * The  wiki tag supports the same geoshapes service, as well as
> direct queries to WDQS. This graph uses both (one to get all countries, the
> other is to get the list of disasters)
>
> https://www.mediawiki.org/wiki/Extension:Graph/Demo/Sparql/Largest_disasters
>
> * There has been some discussion to allow direct WDQS querying from maps
> too - e.g. to draw points of interest based on Wikidata (very easy to
> implement, but we should be careful to cache it properly)
>
> Since all these queries are called from either nodejs or our javascript,
> we could attach extra headers, like X-Analytics, which is already handled
> by Varnish.  Also, NodeJS queries could set the user agent string.
>
>
> On Fri, Sep 30, 2016 at 10:44 AM Markus Kroetzsch <
> markus.kroetz...@tu-dresden.de> wrote:
>
>> On 30.09.2016 16:18, Andra Waagmeester wrote:
>> > Would it help if I add the following header to every large batch of
>> queries?
>> >
>> > ###
>> > # access: (http://query.wikidata.org
>> > or
>> https://query.wikidata.org/bigdata/namespace/wdq/sparql?query={SPARQL} .)
>> > # contact: email, acountname, twittername etc
>> > # bot: True/False
>> > # .
>> > ##
>>
>> This is already more detailed than what I had in mind. Having a way to
>> tell apart bots and tools from "organic" queries would already be great.
>> We are mainly looking for something that will help us to understand
>> sudden peaks of activity. For this, it might be enough to have a short
>> signature (a URL could be given, but a tool name with a version would
>> also be fine). This is somewhat like the "user agent" field in HTTP.
>>
>> But you are right that some formatting convention may help further here.
>> How about this:
>>
>> #TOOL:
>>
>> Then one could look for comments of this form without knowing all the
>> tools upfront. Of course, this is just a hint in any case, since one
>> could always use the same comment in any manually written query.
>>
>> Best regards,
>>
>> Markus
>>
>> >
>> > On Fri, Sep 30, 2016 at 4:00 PM, Markus Kroetzsch
>> > mailto:markus.kroetz...@tu-dresden.de
>> >>
>> > wrote:
>> >
>> > Dear SPARQL users,
>> >
>> > We are starting a research project to investigate the use of the
>> > Wikidata SPARQL Query Service, with the goal to gain insights that
>> > may help to improve Wikidata and the query service [1]. Currently,
>> > we are still waiting for all data to become available. Meanwhile, we
>> > would like to ask for your input.
>> >
>> > Preliminary analyses show that the use of the SPARQL query service
>> > varies greatly over time, presumably because power users and
>> > software tools are running large numbers of queries. For a
>> > meaningful analysis, we would like to understand such high-impact
>> > biases in the data. We therefore need your help:
>> >
>> > (1) Are you a SPARQL power user who sometimes runs large numbers of
>> > queries (over 10,000)? If so, please let us know how your queries
>> > might typically look so we can identify them in the logs.
>> >
>> > (2) Are you the developer of a tool that launches SPARQL queries? If
>> > so, then please let us know if there is any way to identify your
>> > queries.
>> >
>> > If (1) or (2) applies to you, then it would be good if you could
>> > include an identifying comment into your SPARQL queries in the
>> > future, to make it easier to recognise them. In return, this would
>> > enable us to provide you with statistics on the usage of your tool
>> [2].
>> >
>> > Further feedback is welcome.
>> >
>> > Cheers,
>> >
>> > Markus
>> >
>> >
>> > [1]
>> >
>> https://meta.wikimedia.org/wiki/Research:Understanding_Wikidata_Queries
>> > <
>> https://meta.wikimedia.org/wiki/Research:Understanding_Wikidata_Queries>
>> >
>> > [2] Pending permission by the WMF. Like all Wikimedia usage data,
>> > the query logs are under strict privacy protection, so we will need
>> > to get clearance before sharing any findings with the public. We
>> > hope, however, that there won't be any reservations against
>> > publishing non-identifying information.
>> >
>> > --
>> > Prof. Dr. Markus Kroetzsch
>> > Knowledge-Based Systems Group
>> > Faculty of Computer Science
>> > TU Dresden
>> > +49 351 463 38486 
>> > https://iccl.inf.tu-dresden.de/web/KBS/en
>> > 
>> >
>> > ___
>> > Wikidata mailing li

Re: [Wikidata] List of WP-languages and language short code

2016-09-30 Thread Denny Vrandečić
Markus, it really depends on what you mean with "a list of all Wikimedia
languages". That is why you get different numbers.

Usually, you will have a use case for this list, and depending on that use
case you should select the languages you really care about. Besides looking
good in some marketing terms, having a complete list from a random
collection is barely a real use case. Which languages do you want to list,
what is this list for, and what will happen if a specific language is
within that list or not?

Just my 2c

On Fri, Sep 30, 2016 at 10:04 AM Stephen Niedzielski <
sniedziel...@wikimedia.org> wrote:

> I apologize but I'm coming into this thread a little late. FWIW, I wanted
> to mention that this is what we use to generate a list of languages[0] for
> the Wikipedia Android app.
>
> Less to your question but similar, we also generate localized main page
> title and site name[1], and namespace titles[2]. There's also a related
> thread in T121936[3] discussing the localized name of all Wikimedia sites.
>
> [0]
> https://wikistats.wmflabs.org/api.php?action=dump&table=wikipedias&format=csv&s=good
> [1]
> https://es.wikipedia.org/w/api.php?action=query&meta=siteinfo&format=json&siprop=general
> [2]
> https://fr.wikipedia.org/w/api.php?action=query&meta=siteinfo&format=json&siprop=namespaces
> [3] https://phabricator.wikimedia.org/T121936
>
> On Fri, Sep 30, 2016 at 3:21 AM, Markus Bärlocher <
> markus.baerloc...@lau-net.de> wrote:
>
>> Thanks Denny for this:
>> > is this what you look for?
>> > It's based on the query by Jérémie a few mails back.
>> > http://tinyurl.com/h2zky8p
>>
>> Yes this helps very much :-)
>>
>> But there are some inconsistencies:
>>
>> 360 WP-languages from your query
>> 294 WP-languages in https://meta.wikimedia.org/wiki/List_of_Wikipedias
>> 277 WP-languages in https://de.wikipedia.org/wiki/Wikipedia:Sprachen
>>
>> 73 'localname' in the query are empty
>>
>> 9 'name' in the query are double
>> (ell, gsw, lzh, mrj, nan, ron, rup, sgs, vro)
>>
>> There are some problems with characters/charset
>> when I copy&paste the query result to Excel:
>> Burmese, Gothic, Northern Thai, shows only squares as 'local language'
>>
>> But copying after the squares from Excel to Thunderbird works:
>> Burmese မြန်မာဘာသာစကား  mya
>> Gothic language 𐌲𐌿𐍄𐌰𐍂𐌰𐌶𐌳𐌰  got
>> Northern Thai   ᨣᩴᩤᨾᩮᩥᩬᨦnod
>>
>> All the rest works fine :-)
>>
>> It would be good to work out a valid and complete table
>> of the WP-languages...
>>
>> Best regards,
>> Markus
>>
>>
>>
>>
>>
>>
>> > Until now I have a CSV with 294 languages in two columns:
>> > - name of language in English
>> > - name of language local in local writing system
>> >
>> > I look for:
>> > - ISO-639-3 short code (corresponding to the names)
>> >
>> > Best regards,
>> > Markus
>> >
>> >
>> > ___
>> > Wikidata mailing list
>> > Wikidata@lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/wikidata
>> >
>> >
>> >
>> > ___
>> > Wikidata mailing list
>> > Wikidata@lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/wikidata
>> >
>>
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] List of WP-languages and language short code

2016-09-29 Thread Denny Vrandečić
Uhm, sorry. I meant: is this what you look for?

It's based on the query by Jérémie a few mails back.

http://tinyurl.com/h2zky8p



On Thu, Sep 29, 2016 at 1:13 PM Denny Vrandečić  wrote:

> Is that what you look for?
>
> On Thu, Sep 29, 2016 at 10:56 AM Markus Bärlocher <
> markus.baerloc...@lau-net.de> wrote:
>
>> Back home from a longer trip I would like to try again:
>>
>> Until now I have a CSV with 294 languages in two columns:
>> - name of language in English
>> - name of language local in local writing system
>>
>> I look for:
>> - ISO-639-3 short code (corresponding to the names)
>>
>> Best regards,
>> Markus
>>
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] List of WP-languages and language short code

2016-09-29 Thread Denny Vrandečić
Is that what you look for?

On Thu, Sep 29, 2016 at 10:56 AM Markus Bärlocher <
markus.baerloc...@lau-net.de> wrote:

> Back home from a longer trip I would like to try again:
>
> Until now I have a CSV with 294 languages in two columns:
> - name of language in English
> - name of language local in local writing system
>
> I look for:
> - ISO-639-3 short code (corresponding to the names)
>
> Best regards,
> Markus
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-09-23 Thread Denny Vrandečić
(I half expected that link to be paywalled - fortunately it wasn't.)

Thanks!

On Fri, Sep 23, 2016 at 9:10 AM Egon Willighagen 
wrote:

> On Fri, Sep 23, 2016 at 5:53 PM, Denny Vrandečić 
> wrote:
>
>> One stupid question: due to the length of these identifiers, and since
>> they are not simple opaque identifiers but rather encode semantics -
>> if I understand it correctly - could a single such identifier encode
>> content or ideas that are potentially covered by copyright or patent law?
>> Is there some background available on that?
>>
>
>
> Not the InChI. The standard itself is meant to be reused as much as
> possible and the software is open source.
>
> Some information here:
> http://jcheminf.springeropen.com/articles/10.1186/1758-2946-5-7
>
> Egon
>
>
>
>> On Fri, Sep 23, 2016 at 3:27 AM Egon Willighagen <
>> egon.willigha...@gmail.com> wrote:
>>
>>>
>>> Sebastian, great you found time for it! I didn't :/ (Stats are worth a
>>> tweet, IMHO :)
>>>
>>> Egon
>>>
>>> On Fri, Sep 23, 2016 at 12:20 PM, Sebastian Burgstaller <
>>> sebastian.burgstal...@gmail.com> wrote:
>>>
>>>> Hi Denny,
>>>> Sorry, I missed this email. Just did the calculation for InChI string
>>>> lengths on the 92 million PubChem compounds:
>>>>   99% 99.9%  100%
>>>>   311   676  4502
>>>>
>>>> That said, there is no upper limit for the length, but 4502 is the
>>>> longest string in the PubChem database. The other IDs, canonical and
>>>> isomeric SMILES have the same distribution shape, but are overall
>>>> slightly shorter.
>>>>
>>>> Best,
>>>> Sebastian
>>>>
>>>> On Sun, Sep 18, 2016 at 9:19 PM, Denny Vrandečić 
>>>> wrote:
>>>> > Can you figure out what a good limit would be for these two use
>>>> cases? I.e.
>>>> > what would support 99%, 99.9%, and 100%?
>>>> >
>>>> >
>>>> > On Sun, Sep 18, 2016, 12:27 Egon Willighagen <
>>>> egon.willigha...@gmail.com>
>>>> > wrote:
>>>> >>
>>>> >> Hi all,
>>>> >>
>>>> >> sorry for joining the party late...
>>>> >>
>>>> >> On Tue, Sep 13, 2016 at 11:39 AM, Sebastian Burgstaller
>>>> >>  wrote:
>>>> >> > I think this topic might have been discussed many months ago. For
>>>> >> > certain data types in the chemical compound space (P233, canonical
>>>> >> > smiles, P2017 isomeric smiles and P234 Inchi key) a higher
>>>> character
>>>> >> > limit than 400 would be really helpful (1500 to 2000 chars (I sense
>>>> >> > that this might cause problems with SPARQL)). Are there any plans
>>>> on
>>>> >> > implementing this? In general, for quality assurance, many string
>>>> >> > property types would profit from a fixed max string length.
>>>> >>
>>>> >> 400 characters is not a lot for chemicals... InChIs can be a lot
>>>> >> larger indeed. 2k would allow us to capture a lot more chemicals.
>>>> BTW,
>>>> >> this also applies to the canonical SMILES, which also doesn't have an
>>>> >> upper bound. Tannic acid (Q427956) is an example (which looking at
>>>> the
>>>> >> InChIKey came up when running the bot :) From working with ChEMBL as
>>>> >> RDF I know it has InChIs of length > 1024, which was the max length
>>>> in
>>>> >> Virtuoso... I think it's important for the biology and chemistry to
>>>> >> increase the limit.
>>>> >>
>>>> >> Egon
>>>> >>
>>>> >> --
>>>> >> E.L. Willighagen
>>>> >> Department of Bioinformatics - BiGCaT
>>>> >> Maastricht University (http://www.bigcat.unimaas.nl/)
>>>> >> Homepage: http://egonw.github.com/
>>>> >> LinkedIn: http://se.linkedin.com/in/egonw
>>>> >> Blog: http://chem-bla-ics.blogspot.com/
>>>> >> PubList: http://www.citeulike.org/user/egonw/tag/papers
>>>> >> ORCID: -0001-7542-0286
>>>> >> ImpactStory: https://impactstory.org/EgonWillighagen
>>>> >>

Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-09-23 Thread Denny Vrandečić
Thank you! I am sure that this will help the Wikidata team to make the
right decision. Also, very interesting numbers.

One stupid question: due to the length of these identifiers, and since they
are not simple opaque identifiers but rather encode semantics - if I
understand it correctly - could a single such identifier encode
content or ideas that are potentially covered by copyright or patent law?
Is there some background available on that?

On Fri, Sep 23, 2016 at 3:27 AM Egon Willighagen 
wrote:

>
> Sebastian, great you found time for it! I didn't :/ (Stats are worth a
> tweet, IMHO :)
>
> Egon
>
> On Fri, Sep 23, 2016 at 12:20 PM, Sebastian Burgstaller <
> sebastian.burgstal...@gmail.com> wrote:
>
>> Hi Denny,
>> Sorry, I missed this email. Just did the calculation for InChI string
>> lengths on the 92 million PubChem compounds:
>>   99% 99.9%  100%
>>   311   676  4502
>>
>> That said, there is no upper limit for the length, but 4502 is the
>> longest string in the PubChem database. The other IDs, canonical and
>> isomeric SMILES have the same distribution shape, but are overall
>> slightly shorter.
>>
>> Best,
>> Sebastian
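
For reference, the same percentile check can be reproduced over any local
list of identifier strings; a minimal sketch, assuming numpy and a plain text
file with one InChI per line (the file name is made up for illustration):

    import numpy as np

    # Hypothetical input: one InChI string per line.
    with open("pubchem_inchis.txt", encoding="utf-8") as f:
        lengths = np.array([len(line.rstrip("\n")) for line in f])

    for q in (99, 99.9, 100):
        print(f"{q} percentile: {np.percentile(lengths, q):.0f} characters")
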
>>
>> On Sun, Sep 18, 2016 at 9:19 PM, Denny Vrandečić 
>> wrote:
>> > Can you figure out what a good limit would be for these two use cases?
>> I.e.
>> > what would support 99%, 99.9%, and 100%?
>> >
>> >
>> > On Sun, Sep 18, 2016, 12:27 Egon Willighagen <
>> egon.willigha...@gmail.com>
>> > wrote:
>> >>
>> >> Hi all,
>> >>
>> >> sorry for joining the party late...
>> >>
>> >> On Tue, Sep 13, 2016 at 11:39 AM, Sebastian Burgstaller
>> >>  wrote:
>> >> > I think this topic might have been discussed many months ago. For
>> >> > certain data types in the chemical compound space (P233, canonical
>> >> > smiles, P2017 isomeric smiles and P234 Inchi key) a higher character
>> >> > limit than 400 would be really helpful (1500 to 2000 chars (I sense
>> >> > that this might cause problems with SPARQL)). Are there any plans on
>> >> > implementing this? In general, for quality assurance, many string
>> >> > property types would profit from a fixed max string length.
>> >>
>> >> 400 characters is not a lot for chemicals... InChIs can be a lot
>> >> larger indeed. 2k would allow us to capture a lot more chemicals. BTW,
>> >> this also applies to the canonical SMILES, which also doesn't have an
>> >> upper bound. Tannic acid (Q427956) is an example (which looking at the
>> >> InChIKey came up when running the bot :) From working with ChEMBL as
>> >> RDF I know it has InChIs of length > 1024, which was the max length in
>> >> Virtuoso... I think it's important for the biology and chemistry to
>> >> increase the limit.
>> >>
>> >> Egon
>> >>
>> >> --
>> >> E.L. Willighagen
>> >> Department of Bioinformatics - BiGCaT
>> >> Maastricht University (http://www.bigcat.unimaas.nl/)
>> >> Homepage: http://egonw.github.com/
>> >> LinkedIn: http://se.linkedin.com/in/egonw
>> >> Blog: http://chem-bla-ics.blogspot.com/
>> >> PubList: http://www.citeulike.org/user/egonw/tag/papers
>> >> ORCID: -0001-7542-0286
>> >> ImpactStory: https://impactstory.org/EgonWillighagen
>> >>
>> >> ___
>> >> Wikidata mailing list
>> >> Wikidata@lists.wikimedia.org
>> >> https://lists.wikimedia.org/mailman/listinfo/wikidata
>> >
>> >
>> > ___
>> > Wikidata mailing list
>> > Wikidata@lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/wikidata
>> >
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
>
>
> --
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> LinkedIn: http://se.linkedin.com/in/egonw
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: http://www.citeulike.org/user/egonw/tag/papers
> ORCID: -0001-7542-0286
> ImpactStory: https://impactstory.org/u/egonwillighagen
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-21 Thread Denny Vrandečić
It is no coincidence that the Wikidata Wiktionary data model and OntoLex
fit well together: the Wikidata proposal was informed by and closely follows
the Lemon data model, and OntoLex is also rooted in Lemon.

It's good that both built on the same solid results from linguistics ;)


On Wed, Sep 21, 2016 at 10:55 AM Ester Pantaleo 
wrote:

> The DBnary project "Wiktionary as Linguistic Linked Open Data" at:
>
> http://kaiko.getalp.org/about-dbnary/
> http://kaiko.getalp.org/sparql
>
> can be used as a reference.
>
> I am basing my IEG project to visualize etymologies from Wiktionary on it:
>
>
> https://meta.wikimedia.org/wiki/Grants:IEG/A_graphical_and_interactive_etymology_dictionary_based_on_Wiktionary
>
> Cheers,
>
> Ester
>
> On Wed, Sep 21, 2016 at 7:39 PM, Daniel Kinzler <
> daniel.kinz...@wikimedia.de> wrote:
>
>> On 21.09.2016 at 19:23, Eric Scott wrote:
>> > A substantial amount of work in the LOD community seems to have gone
>> into Ontolex:
>> >
>> > https://www.w3.org/community/ontolex/wiki/Final_Model_Specification
>> >
>> > Is there any concern with aligning WD's model to this standard?
>>
>> Thanks for pointing to this!
>>
>> From a first look, the models seem to roughly align:
>>
>> What we call a "Lexeme" corresponds to a "Lexical Entry" in ontolex.
>> What we call a "Form" corresponds to a "Form" in ontolex.
>> What we call a "Sense" corresponds to a "Lexical Sense & Reference" in
>> ontolex,
>> although in ontolex, a reference to a Concept is required, while in our
>> model
>> that reference would be optional, but a natural language gloss is
>> required.
>>
>> So the models seem to match fine on a conceptual level. Perhaps someone
>> with
>> more expertise in RDF modeling can provide a more detailed analysis.
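
To make the correspondence concrete, a rough rdflib sketch of one Lexeme with
one Form and one Sense expressed in OntoLex terms (the L1/Q42 identifiers and
the example.org namespace are placeholders for illustration, not the actual
Wikibase RDF mapping):

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF

    ONTOLEX = Namespace("http://www.w3.org/ns/lemon/ontolex#")
    EX = Namespace("http://example.org/")  # placeholder namespace

    g = Graph()
    g.bind("ontolex", ONTOLEX)

    lexeme, form, sense = EX["L1"], EX["L1-F1"], EX["L1-S1"]

    # Lexeme -> ontolex:LexicalEntry
    g.add((lexeme, RDF.type, ONTOLEX.LexicalEntry))
    # Form -> ontolex:Form, attached via ontolex:lexicalForm
    g.add((form, RDF.type, ONTOLEX.Form))
    g.add((form, ONTOLEX.representation, Literal("produce", lang="en")))
    g.add((lexeme, ONTOLEX.lexicalForm, form))
    # Sense -> ontolex:LexicalSense; the concept reference uses ontolex:reference
    g.add((sense, RDF.type, ONTOLEX.LexicalSense))
    g.add((lexeme, ONTOLEX.sense, sense))
    g.add((sense, ONTOLEX.reference,
           URIRef("http://www.wikidata.org/entity/Q42")))

    print(g.serialize(format="turtle"))
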
>>
>> --
>> Daniel Kinzler
>> Senior Software Developer
>>
>> Wikimedia Deutschland
>> Gesellschaft zur Förderung Freien Wissens e.V.
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-19 Thread Denny Vrandečić
No, sorry, that is not what I meant. When I said "current data model" I
meant the "currently proposed data model". Sorry for being sloppy.

So it assumes new entity types for Lexemes, which are not just special
forms of Items. And it is not reliant on an underlying graph model.

Daniel already sketched out how 'product' may look like. Since the current
implementation does not support Lexemes, I cannot just put it on Labs.

But there could be a Lexeme "produc-" which is pointed to from Daniel's
Lexemes for "to produce", "production", "producer", etc., which could point
to "produc-" via a statement, say, "root word" or similar. In the end, it
really is unclear whether that is correct or not, but it sure is a
possibility that can be represented with the currently proposed data model.
Which properties exist, how they are linked to each other, etc., is all up
to the collaborative decisions which the community has to make.
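
As a purely hypothetical sketch of what that linking could look like as plain
data (the L-ids and the "root word" property are invented for illustration and
do not reflect any implemented schema):

    # Toy in-memory representation of lexemes linked to a shared root.
    lexemes = {
        "L1": {"lemma": "produc-", "statements": {}},
        "L2": {"lemma": "produce", "statements": {"root word": ["L1"]}},
        "L3": {"lemma": "production", "statements": {"root word": ["L1"]}},
        "L4": {"lemma": "producer", "statements": {"root word": ["L1"]}},
    }

    # Everything derived from the shared root is then found by a simple scan.
    derived = [lid for lid, lex in lexemes.items()
               if "L1" in lex["statements"].get("root word", [])]
    print(derived)  # ['L2', 'L3', 'L4']
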


On Mon, Sep 19, 2016 at 12:38 PM Thad Guidry  wrote:

> Denny,
>
> Ah, very cool.  So it's currently supported just by the flexible nature of
> Wikidata's backing triplestore, Blazegraph, and its generic graph structure,
> I assume that is what you mean.
>
> So just having statements perform the linking to Lexemes that are just Q
> items themselves, but with a special statement that says... 'I am not an
> entity, but instead a Lexeme'.
>
> Can you or Daniel start with those few lexemes for 'Product' as Daniel and
> I mentioned, perhaps in Labs or somewhere, so that all of us can begin to
> see how this might work using statements?
>
> Thad
> +ThadGuidry 
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-19 Thread Denny Vrandečić
And just to point out - even though there are no plans to accommodate the
superstructures in the data model directly, it should be noted that the
current data model is already flexible enough to support it, i.e. if the
community so wishes, they can create Lexemes which represent the "root" of a
word like "produc-" and then explicitly link these with statements from the
Lexemes for "production", "producer", etc. Or not. They could instead try to
model it with statements pertaining to the etymology of the words. Or not.

The Wiktionary data model is not supposed to express a specific theory of
linguistics, just as the Wikidata data model is not supposed to express a
specific theory of ontology. It is supposed to be flexible enough to work
with whatever the community decides it wants to express, sometimes even
contradictory statements, with the ability to source them to references.



On Mon, Sep 19, 2016 at 6:05 AM Daniel Kinzler 
wrote:

> On 16.09.2016 at 20:46, Thad Guidry wrote:
> > Daniel,
> >
> > I wasn't trying to help solve the issues - I'll be quiet now :)
> >
> > I was helping to expose one of your test cases :)
>
> Ha, sorry for sounding harsh, and thanks for pointing me to "product"!
> It's a
> good test case indeed.
>
> > 'product' is a lexeme - a headword - a basic unit of meaning that has a
> 'set of
> > forms' and those have 'a set of definitions'
>
> In the current model, a Lexeme has forms and senses. Forms don't have
> senses
> directly; the meanings should apply to all forms. This means lexemes have
> to be
> split with higher granularity:
>
> * product (English noun) would be one lexeme, with "products" being the
> plural
> form, and "product's" the genitive, and "products'" the plural genitive.
> Senses
> include the ones you mentioned.
> * (to) produce (English verb) would be another lexeme, with forms like
> "produces", "produced", "producing", etc, and senses meaning "to create",
> "to
> show", "to make available", etc
> * production (English noun) would be another lexeme, with other forms and
> senses.
> * produce (English noun) would be another
> * producer (English noun) would be another
> * produced (English adjective) another
> etc...
>
> These lexemes can be linked using some kind of "derived from" statements.
>
> > But a thought just occurred to me...
> > A. One way to model this perhaps would be to have those headwords
> stored in
> > Wikidata.  Those headwords ideally would not actually be a Q or a P ...
> but what
> > about instead ... L?  Wrapping the graph structure itself?  Pros /
> Cons?
>
> That's the plan, yes: Have lexemes (L...) on wikidata, which wrap the
> structure
> of forms and senses, and has statements for the lexeme, as well as for
> each form
> and each sense.
>
> We don't currently plan a "super-structure" for wrapping derived/related
> lexemes
> (product, produce, production, etc). They would just be inter-linked by
> statements.
>
> > B.  or do we go with Daniel's suggestion of linking out to headwords and
> not
> > actually storing them in Wikidata ?  Pros / Cons ?
>
> The link I suggest is between items (Q...) and lexemes (L...), both on
> Wikidata.
>
> --
> Daniel Kinzler
> Senior Software Developer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-09-18 Thread Denny Vrandečić
Can you figure out what a good limit would be for these two use cases? I.e.
what would support 99%, 99.9%, and 100%?

On Sun, Sep 18, 2016, 12:27 Egon Willighagen 
wrote:

> Hi all,
>
> sorry for joining the party late...
>
> On Tue, Sep 13, 2016 at 11:39 AM, Sebastian Burgstaller
>  wrote:
> > I think this topic might have been discussed many months ago. For
> > certain data types in the chemical compound space (P233, canonical
> > smiles, P2017 isomeric smiles and P234 Inchi key) a higher character
> > limit than 400 would be really helpful (1500 to 2000 chars (I sense
> > that this might cause problems with SPARQL)). Are there any plans on
> > implementing this? In general, for quality assurance, many string
> > property types would profit from a fixed max string length.
>
> 400 characters is not a lot for chemicals... InChIs can be a lot
> larger indeed. 2k would allow us to capture a lot more chemicals. BTW,
> this also applies to the canonical SMILES, which also doesn't have an
> upper bound. Tannic acid (Q427956) is an example (which looking at the
> InChIKey came up when running the bot :) From working with ChEMBL as
> RDF I know it has InChIs of length > 1024, which was the max length in
> Virtuoso... I think it's important for the biology and chemistry to
> increase the limit.
>
> Egon
>
> --
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> LinkedIn: http://se.linkedin.com/in/egonw
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: http://www.citeulike.org/user/egonw/tag/papers
> ORCID: -0001-7542-0286
> ImpactStory: https://impactstory.org/EgonWillighagen
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-16 Thread Denny Vrandečić
Yes, that definitely is one promising approach (and I hope that we would
make a rough impact analysis before deciding on it and implementing it,
once the structures and data are there).

I wonder if there are other approaches that are somehow more subtle. But I
cannot express what I am looking for, and maybe yours is already
sufficiently close to optimal.


On Fri, Sep 16, 2016 at 11:00 AM Daniel Kinzler 
wrote:

> On 16.09.2016 at 19:41, Denny Vrandečić wrote:
> > Yes, there should be some connection between items and lexemes, but I am
> still
> > hazy about details on how exactly this should look like. If someone could
> > actually make a strawman proposal, that would be great.
> >
> > I think the connection should live in the statement space, and not be on
> the
> > level of labels, but that is just a hunch. I'd be happy to see proposals
> incoming.
>
> My thinking is this:
>
> On some Sense of a Lexeme, there is a Statement saying that this Sense
> refers to
> a given concept (Item). If the property for stating this is well-known, we
> can
> track the Sense-to-Item relationship in the database. We can then
> automatically
> show the lexeme's lemma as a (pseudo-)alias on the Item, and perhaps also
> use it
> (and maybe all forms of the lexeme!) for indexing the item for search.  So:
>
>   from ( Lexeme - Sense - Statement -> Item )
>   we can derive ( Item -> Lexeme - Forms )
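
A tiny sketch of that inversion over toy data (all ids and the "refers to
concept" property name are placeholders, assuming nothing about the eventual
implementation):

    # One toy lexeme whose sense points at an item.
    lexemes = {
        "L10": {"lemma": "produce",
                "forms": ["produce", "produces", "produced", "producing"],
                "senses": {"L10-S1": {"refers to concept": "Q12345"}}},
    }

    # Invert ( Lexeme - Sense - Statement -> Item ) into ( Item -> Lexeme, Forms ).
    item_index = {}
    for lid, lex in lexemes.items():
        for sense in lex["senses"].values():
            item = sense.get("refers to concept")
            if item:
                item_index.setdefault(item, []).append((lid, lex["forms"]))

    # The item could then show "produce" as a pseudo-alias and be indexed
    # under all four forms for search.
    print(item_index["Q12345"])
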
>
> In the beginning of Wikidata, I was very reluctant about the software
> knowing
> about "magic" properties. Now I feel better about this, since wikidata
> properties are established as a permanent vocabulary that can be used by
> any
> software, including our own.
>
> --
> Daniel Kinzler
> Senior Software Developer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-09-16 Thread Denny Vrandečić
(in particular because I expect that character limit to have to change for
Wiktionary in Wikidata)

On Fri, Sep 16, 2016 at 10:38 AM Denny Vrandečić 
wrote:

> Markus' description of the decision for the limit corresponds with mine. I
> also think that this decision can be revisited. I would still advise
> caution, due to technical issues, but I am sure that the development team
> will make a well-informed decision on this. It would be sad if valid
> use cases could not be supported due to that.
>
> On Fri, Sep 16, 2016 at 6:51 AM Markus Kroetzsch <
> markus.kroetz...@tu-dresden.de> wrote:
>
>> On 13.09.2016 11:39, Sebastian Burgstaller wrote:
>> > Hi all,
>> >
>> > I think this topic might have been discussed many months ago. For
>> > certain data types in the chemical compound space (P233, canonical
>> > smiles, P2017 isomeric smiles and P234 Inchi key) a higher character
>> > limit than 400 would be really helpful (1500 to 2000 chars (I sense
>> > that this might cause problems with SPARQL)). Are there any plans on
>> > implementing this? In general, for quality assurance, many string
>> > property types would profit from a fixed max string length.
>>
>> FWIW, I recall that the main reason for the char limit originally was to
>> discourage the use of Wikidata for textual content. Simply put, we did
>> not want Wikipedia articles in the data. Long texts could also make
>> copyright/license issues more relevant (though, in theory, a copyrighted
>> poem could be rather short).
>>
>> However, given that we now have such a well informed community with
>> established practices and good quality checks, it seems unproblematic to
>> lift the character limit. I don't think there are major technical
>> reasons for having it. Surely, BlazeGraph (the WMF SPARQL engine) should
>> not expect texts to be short, and I would be surprised if they did. So I
>> would not expect problems on this side.
>>
>> Best,
>> Markus
>>
>>
>> >
>> > Best,
>> > Sebastian
>> >
>> > Sebastian Burgstaller-Muehlbacher, PhD
>> > Research Associate
>> > Andrew Su Lab
>> > MEM-216, Department of Molecular and Experimental Medicine
>> > The Scripps Research Institute
>> > 10550 North Torrey Pines Road
>> > La Jolla, CA 92037
>> > @sebotic
>> >
>> > ___
>> > Wikidata mailing list
>> > Wikidata@lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/wikidata
>> >
>>
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-16 Thread Denny Vrandečić
Yes, there should be some connection between items and lexemes, but I am
still hazy about details on how exactly this should look. If someone
could actually make a strawman proposal, that would be great.

I think the connection should live in the statement space, and not be on
the level of labels, but that is just a hunch. I'd be happy to see
proposals incoming.

On Thu, Sep 15, 2016 at 10:00 PM Gerard Meijssen 
wrote:

> Hoi,
> Please understand that for every label for a current item in Wikidata
> there should be one lexeme. It would be really helpful if all the new
> lexemes added were associated with labels. You will then be able to show an
> item with the conjugation that is preferred for a language. Currently this
> is not our practice.
>
> When we associate labels with lexemes, we in fact gain the missing
> functionality of indicating that a specific lexeme was preferred up to a
> point. It allows people to understand where "Batavia" was and why you
> will not find "Jakarta" in certain papers.
> Thanks,
>   GerardM
>
>
> On 15 September 2016 at 17:40, Jan Berkel  wrote:
>
>>
>> *- How wikidata and wiktionary databases will be synchronized?*
>> New entity types will be created in Wikidata database, with new ids (ex.
>> L for lexemes). A Wiktionary will have the possibility to include data from
>> Wikidata in their pages (the complete entity or only some chosen
>> statements, as the community decides)
>>
>>
>> The pdf mentions 4 new entity types: Lexeme, Statement, Form, Embedded
>> (?).  Curious, was the existing data model not flexible enough?
>>
>> Will these new entities be restricted to usage in a lexicographical
>> context, i.e. Wiktionary? How will they fit into the existing data model,
>> will there be links from existing Wikidata items to the new entities? (i.e.
>> how will Wikidata benefit from the new data?)
>>
>> *- Will editing wiktionary change?*
>> Yes, changes will happen, but we're working on making editing Wiktionary
>> easier. Soon as we can provide some mockups, we will share them with you
>> for collecting feedbacks.
>>
>>
>> Making contributing to Wiktionary easier will be a huge help. Right now
>> the learning curve is extremely steep and is turning away potential
>> contributors.
>>
>> One thing to keep in mind is that Wiktionary is more than just the
>> content in the page namespace. A big part of what you see is actually
>> generated dynamically, for example transliteration, pronunciation and
>> grammatical forms (conjugations, plurals etc).
>>
>> I imagine in an integrated Wikidata/Wiktionary world "content" and code
>> lives in various places, and we'll have a range of automated processes to
>> copy things back and forth, and to automatically create new entries derived
>> from existing ones?
>>
>> – Jan
>>
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-09-16 Thread Denny Vrandečić
Markus' description of the decision for the limit corresponds with mine. I
also think that this decision can be revisited. I would still advise
caution, due to technical issues, but I am sure that the development team
will make a well-informed decision on this. It would be sad if valid
use cases could not be supported due to that.

On Fri, Sep 16, 2016 at 6:51 AM Markus Kroetzsch <
markus.kroetz...@tu-dresden.de> wrote:

> On 13.09.2016 11:39, Sebastian Burgstaller wrote:
> > Hi all,
> >
> > I think this topic might have been discussed many months ago. For
> > certain data types in the chemical compound space (P233, canonical
> > smiles, P2017 isomeric smiles and P234 Inchi key) a higher character
> > limit than 400 would be really helpful (1500 to 2000 chars (I sense
> > that this might cause problems with SPARQL)). Are there any plans on
> > implementing this? In general, for quality assurance, many string
> > property types would profit from a fixed max string length.
>
> FWIW, I recall that the main reason for the char limit originally was to
> discourage the use of Wikidata for textual content. Simply put, we did
> not want Wikipedia articles in the data. Long texts could also make
> copyright/license issues more relevant (though, in theory, a copyrighted
> poem could be rather short).
>
> However, given that we now have such a well informed community with
> established practices and good quality checks, it seems unproblematic to
> lift the character limit. I don't think there are major technical
> reasons for having it. Surely, BlazeGraph (the WMF SPARQL engine) should
> not expect texts to be short, and I would be surprised if they did. So I
> would not expect problems on this side.
>
> Best,
> Markus
>
>
> >
> > Best,
> > Sebastian
> >
> > Sebastian Burgstaller-Muehlbacher, PhD
> > Research Associate
> > Andrew Su Lab
> > MEM-216, Department of Molecular and Experimental Medicine
> > The Scripps Research Institute
> > 10550 North Torrey Pines Road
> > La Jolla, CA 92037
> > @sebotic
> >
> > ___
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
> >
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Denny Vrandečić
\o/

On Tue, Sep 13, 2016 at 6:18 AM Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:

> Hey everyone :)
>
> Wiktionary is our third-largest sister project, both in terms of active
> editors and readers. It is a unique resource, with the goal to provide
> a dictionary for every language, in every language. Since the
> beginning of Wikidata but increasingly over the past months I have
> been getting more and more requests for supporting Wiktionary and
> lexicographical data in Wikidata. Having this data available openly
> and freely licensed would be a major step forward in automated
> translation, text analysis, text generation and much more. It will
> enable and ease research. And most importantly it will enable the
> individual Wiktionary communities to work more closely together and
> benefit from each other’s work.
>
> With this and the increased demand to support Wikimedia Commons with
> Wikidata, we have looked at the bigger picture and our options. I am
> seeing a lot of overlap in the work we need to do to support
> Wiktionary and Commons. I am also seeing increasing pressure to store
> lexicographical data in existing items (which would be bad for many
> reasons).
>
> Because of this we will start implementing support for Wiktionary in
> parallel to Commons based on our annual plan and quarterly plans. We
> contacted several of our partners in order to get funding for this
> additional work. I am happy that Google agreed to provide funding
> (restricted to work on Wikidata). With this we can reorganize our team
> and set up one part of the team to continue working on building out
> the core of Wikidata and support for Wikipedia and Commons and the
> other part will concentrate on Wiktionary. (To support and to extend
> our work around Wikidata with the help of external funding sources was
> our plan in our annual plan 2016:
>
> https://meta.wikimedia.org/wiki/Grants:APG/Proposals/2015-2016_round1/Wikimedia_Deutschland_e.V./Proposal_form#Financials:_current_funding_period
> )
>
> As a next step I’d like us all to have another careful look at the
> latest proposal at
> https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development. It has
> been online for input in its current form for a year and the first
> version is 3 years old now. So I am confident that the proposal is in
> a good shape to start implementation. However I’d like to do a last
> round of feedback with you all to make sure the concept really is
> sane. To make it easier to understand there is now also a pdf
> explaining the concept in a slightly different way:
>
> https://commons.wikimedia.org/wiki/File:Wikidata_for_Wiktionary_announcement.pdf
> Please do go ahead and review it. If you have comments or questions
> please leave them on the talk page of the latest proposal at
>
> https://www.wikidata.org/wiki/Wikidata_talk:Wiktionary/Development/Proposals/2015-05
> .
> I’d be especially interested in feedback from editors who are familiar
> with both Wiktionary and Wikidata.
>
> Getting support for Wiktionary done - just like for Commons - will
> take some time but I am really excited about the opportunities it will
> open up especially for languages that have so far not gotten much or
> any technological support.
>
>
> Cheers
> Lydia
>
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Primary Sources Tool code has been moved to Wikidata org

2016-07-12 Thread Denny Vrandečić
Yes, the contributor agreement needs a refresh. The Google Contributor
License Agreement will not be a requirement. So far we only moved the repo,
it still needs to be updated in order to reflect the changes.

Pull requests welcome :)

Denny


On Tue, Jul 12, 2016 at 9:15 AM Thad Guidry  wrote:

> This contributor agreement needs a refresh ?
>
> https://github.com/Wikidata/primarysources/blob/master/CONTRIBUTING.md
>
>
> Thad
> +ThadGuidry 
>
> On Tue, Jul 12, 2016 at 10:47 AM, Marco Fossati 
> wrote:
>
>> Thanks for the heads-up, Lydia.
>> I assume future contributors won't have to sign a Google Contributor
>> License Agreement, right?
>>
>> Cheers,
>>
>> Marco
>>
>> On 7/12/16 17:11, Lydia Pintscher wrote:
>>
>>> Hey folks :)
>>>
>>> Based on requests here Denny and I have worked on getting the Primary
>>> Sources Tool code moved from the Google to the Wikidata organisation.
>>> This has now happened and it is available at
>>> https://github.com/Wikidata/primarysources from now on. I hope this
>>> will lead to more contributions from more people as I believe it is an
>>> important part of Wikidata's data flow.
>>>
>>>
>>> Cheers
>>> Lydia
>>>
>>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata

