[Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Lydia Pintscher
Hey everyone :)

Wiktionary is our third-largest sister project, both in terms of active
editors and readers. It is a unique resource, with the goal to provide
a dictionary for every language, in every language. Since the
beginning of Wikidata but increasingly over the past months I have
been getting more and more requests for supporting Wiktionary and
lexicographical data in Wikidata. Having this data available openly
and freely licensed would be a major step forward in automated
translation, text analysis, text generation and much more. It will
enable and ease research. And most importantly it will enable the
individual Wiktionary communities to work more closely together and
benefit from each other’s work.

With this and the increased demand to support Wikimedia Commons with
Wikidata, we have looked at the bigger picture and our options. I am
seeing a lot of overlap in the work we need to do to support
Wiktionary and Commons. I am also seeing increasing pressure to store
lexicographical data in existing items (which would be bad for many
reasons).

Because of this we will start implementing support for Wiktionary in
parallel to Commons based on our annual plan and quarterly plans. We
contacted several of our partners in order to get funding for this
additional work. I am happy that Google agreed to provide funding
(restricted to work on Wikidata). With this we can reorganize our team
and set up one part of the team to continue working on building out
the core of Wikidata and support for Wikipedia and Commons and the
other part will concentrate on Wiktionary. (Supporting and extending
our work around Wikidata with the help of external funding sources was
part of our annual plan for 2016:
https://meta.wikimedia.org/wiki/Grants:APG/Proposals/2015-2016_round1/Wikimedia_Deutschland_e.V./Proposal_form#Financials:_current_funding_period)

As a next step I’d like us all to have another careful look at the
latest proposal at
https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development. It has
been online for input in its current form for a year and the first
version is 3 years old now. So I am confident that the proposal is in
good shape to start implementation. However, I’d like to do a last
round of feedback with you all to make sure the concept really is
sane. To make it easier to understand there is now also a PDF
explaining the concept in a slightly different way:
https://commons.wikimedia.org/wiki/File:Wikidata_for_Wiktionary_announcement.pdf
Please do go ahead and review it. If you have comments or questions
please leave them on the talk page of the latest proposal at
https://www.wikidata.org/wiki/Wikidata_talk:Wiktionary/Development/Proposals/2015-05.
I’d be especially interested in feedback from editors who are familiar
with both Wiktionary and Wikidata.

Getting support for Wiktionary done - just like for Commons - will
take some time but I am really excited about the opportunities it will
open up especially for languages that have so far not gotten much or
any technological support.


Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Gerard Meijssen
Hoi,
You assume that it is not good to have lexicological information in our
existing items. With Wiktionary support you bring such information on
board. It would be really awkward if for every concept there had to be an
item in two databases.

Why is there this problem with lexicological information, and how will the
current data be linked to the future "Wiktionary-data" information if there
are to be two databases?
Thanks,
 GerardM

PS I cannot find this question or an answer in the PDF.



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Léa Lacroix
Hello Gerard,

We won't create a second database. We will only extend the existing one with
new types of entities, in particular Lexemes. They will have their own
specific structure, and will be linked to the concepts (items) by their
statements.

The statements you can make about a word are very different from the
statements you can make about a concept, so we will keep them separate.

Bests,


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Denny Vrandečić
\o/



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Gerard Meijssen
Hoi,
The database design for OmegaWiki distinguished between a concept and all
of its derivatives. The lexemes were all connected to the concept,
independent of their spelling.
Obviously this is language dependent.

So bumblebee is more complex than just "instance of" noun. It is an English
noun. "Hommel" is connected as a Dutch noun for the same concept and
"hommels" is the Dutch plural...

Obviously.
Thanks,
  GerardM

On 13 September 2016 at 16:35, Daniel Kinzler 
wrote:

> On 13.09.2016 at 15:37, Gerard Meijssen wrote:
> > Hoi,
> > You assume that it is not good to have lexicological information in our
> > existing items. With Wiktionary support you bring such information on
> > board. It would be really awkward if for every concept there had to be
> > an item in two databases.
>
> It will be two namespaces in the same project.
>
> But we will not duplicate items. The proposed structure is not
> concept-centered like OmegaWiki is. It will be centered on lexemes, like
> Wiktionary is, but with a higher level of granularity (a lexeme corresponds
> to one "morphological" section on a Wiktionary page).
>
> > Why is there this problem with lexicological information, and how will the
> > current data be linked to the future "Wiktionary-data" information if
> > there are to be two databases?
>
> Because "bumblebee"  "noun" conflicts with "bumblebee"
>  "insect". They can't both be true for the same thing, because
> nouns are not insects. One is true for the word, the other is true for the
> concept. So they need to be treated separately.
>
> --
> Daniel Kinzler
> Senior Software Developer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread James Forrester
On Tue, 13 Sep 2016 at 06:18 Lydia Pintscher 
wrote:

> Because of this we will start implementing support for Wiktionary in
> parallel to Commons based on our annual plan and quarterly plans.
>

Fantastic news. I'm hugely excited about this.

J.
-- 

James D. Forrester
Lead Product Manager, Editing
Wikimedia Foundation, Inc.
jforrester at wikimedia.org | @jdforrester


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Luca Martinelli
This is great. :) Crossing fingers!

L.

-- 
Luca "Sannita" Martinelli
http://it.wikipedia.org/wiki/Utente:Sannita



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Jo
I'm really glad to see this. I was an avid contributor to Wiktionary
for a few years, until about 10 years ago. Then OpenStreetMap caught my
attention and Wiktionary became dull, as at some point it was mostly
about fighting vandalism.

I'm certainly going to follow up on this,

Polyglot



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Daniel Kinzler
On 13.09.2016 at 17:16, Gerard Meijssen wrote:
> Hoi,
> The database design for OmegaWiki had a distinction between the concept and
> all the derivatives for them.

Wikidata will have Lexemes and their Forms and Senses.

> So bumblebee is more complex than just "instance of" noun. It is an English
> noun. "Hommel" is connected as a Dutch noun for the same concept and "hommels"
> is the Dutch plural...

Wikidata would have a Lexeme for "bumblebee" (English noun) and one for "Hommel"
(Dutch noun). Both would have a sense that would describe them as a flying
insect (and perhaps other word senses, such as Q1626135, a crater on the moon).
The senses that refer to the flying insect would be considered translations of
each other, and both senses would refer to the same concept.

So "bumblebee" (insect) is a translation of "Hommel" (insect), and both refer to
the genus Bombus (Q25407). "Hommel" (crater) would share the morphology of
"Hommel" (insect), as it has the same forms (I assume), but it won't share the
translations.

Having lexeme-specific word-senses avoids the loss of connotation and nuance
that you get when you force words of different languages on a shared meaning.
The effect of referring to the same concept can still be achieved via the
reference to a concept (item).
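
The model described above can be sketched in a few lines of Python. This is
only an illustration, not the proposed implementation: the class and field
names (Lexeme, Sense, forms, item) and the helper senses_sharing_concept are
assumptions made for this example, and the English plural "bumblebees" is
added here only for symmetry with "hommels".

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One meaning of a lexeme, optionally pointing at a concept (item)."""
    gloss: str
    item: str  # Q-id of the item the sense refers to (illustrative field name)


@dataclass
class Lexeme:
    """A word in one language, with its forms and its own senses."""
    lemma: str
    language: str
    category: str
    forms: list = field(default_factory=list)
    senses: list = field(default_factory=list)


def senses_sharing_concept(a: Lexeme, b: Lexeme) -> list:
    """Return pairs of senses from a and b that refer to the same item."""
    return [(sa, sb) for sa in a.senses for sb in b.senses if sa.item == sb.item]


bumblebee = Lexeme(
    lemma="bumblebee", language="English", category="noun",
    forms=["bumblebee", "bumblebees"],
    senses=[Sense("flying insect", "Q25407")],
)

hommel = Lexeme(
    lemma="Hommel", language="Dutch", category="noun",
    forms=["hommel", "hommels"],
    senses=[
        Sense("flying insect", "Q25407"),
        Sense("crater on the moon", "Q1626135"),
    ],
)

# Only the two insect senses are related; the crater sense matches nothing.
for left, right in senses_sharing_concept(bumblebee, hommel):
    print(left.gloss, "<->", right.gloss, "via", left.item)
```

Because each sense belongs to exactly one lexeme, the two insect senses stay
separate objects and are related only through the item they both point to.
The crater sense of "Hommel" reuses the same forms but has no counterpart on
the English side.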

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Gerard Meijssen
Hoi,
The main thing to remember is that all these lexemes are in fact the labels
we currently hold. Realising that this is true is key.
Thanks,
  Gerard



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Jakob Voß
Hey everyone,

Happy to hear that Wiktionary will make it into Wikidata. Looks like a
long but promising way to go. I just happened to edit Wikidata items
about dictionary types as part of my work on taxonomy extraction (see
presentation http://dx.doi.org/10.5281/zenodo.61767, paper
http://ceur-ws.org/Vol-1676/paper2.pdf, and tool
https://www.npmjs.com/package/wikidata-taxonomy). The current state of
dictionary types and instances in Wikidata is given below, created with
"wdtaxonomy Q23622" on the command line.

I welcome everybody familiar with and/or interested in lexicography to
improve this classification (there is no WikiProject lexicography yet).
One problem of Wiktionary is that it tries to subsume all kinds of
dictionaries, making it less practical for specific use cases. I bet
that Wiktionary data in Wikidata will some day make it possible to extract
special kinds of dictionaries by selecting Wikidata properties and queries.
Before this, I hope that we also gain a better understanding of dictionary
types.

Cheers
Jakob

dictionary (Q23622) •126 ×279 ↑↑
├──lexicon (Q8096) •38 ×4 ↑
├──lexicographic thesaurus (Q179797) •48 ×7
│  └──synonym dictionary (Q2376111) •1
├──orthographic dictionary (Q378914) •7
├──etymological dictionary (Q521983) •13 ×2
├──frequency list (Q697133) •6
├──glossary (Q859161) •35 ×14 ↑
├──visual dictionary (Q861712) •7 ×1
├──explanatory dictionary (Q897755) •6 ×4
│  ├──explanatory combinatorial dictionary (Q4459737) •2
│  └──monolingual learner's dictionary (Q6901667) •2
│     └──Advanced learner's dictionary (Q17011199) •2
├──encyclopedic dictionary (Q975413) •9 ×43 ↑
│  └──biographical encyclopedia (Q1787111) •9 ×69 ↑
│     ├──group biography (Q1499601) •1 ×1
│     ├──national biography (Q21050458) •1 ×1
│     ├──epoch biography (Q21050912) •1
│     └──dictionary of people (Q26721650) •3
├──pronouncing dictionary (Q1048400) •6
├──reverse dictionary (Q1304223) •9
├──machine-readable dictionary (Q1327461) •10
│  └──online dictionary (Q3327521) •6 ×9
│     └──Wiktionary language edition (Q22001389) ×7 ↑↑
├──single-field dictionary (Q1391417) •3 ×3 ↑
│  ├──law dictionary (Q1464287) •5
│  └──medical dictionary (Q6806507) •2 ×1
├──dictionary of foreign words (Q1455182) •1
├──concise dictionary (Q1575315) •1
├──idioticon (Q1656835) •2 ×1
├──??? (Q1722340) •1
├──learner's dictionary (Q1820290) •1
│  ╘══monolingual learner's dictionary (Q6901667) •2 …
├──??? (Q2134855) •1
├──rime dictionary (Q2191807) •7 ×4
├──rhyming dictionary (Q2210568) •11
├──conceptual dictionary (Q2361647) •6
╞══synonym dictionary (Q2376111) •1 …
├──pocket dictionary (Q2394934) •1
├──bilingual dictionary (Q2640207) •13 ×2
├──slang dictionary (Q3808854) •3 ×2
├──idiom dictionary (Q4492301) •5
├──Anagram dictionary (Q4750851) •1
├──author dictionary (Q5805540) •1
├──language for specific purposes dictionary (Q6486734) •1
├──phonetic dictionary (Q7187214) •1
├──picture dictionary (Q7191193) •2
├──specialized dictionary (Q7574915) •2
│  └──sub-field dictionary (Q7630614) •1
├──defining vocabulary (Q15192747) •1 ↑
└──multi-field dictionary (Q17094463) •1
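
A listing like the one above is a view of "subclass of" relations between
items. The sketch below is not how wdtaxonomy works internally; it only
shows, under that reading, how a handful of (child, parent) pairs taken
from the listing can be walked, and why an item such as synonym dictionary
(Q2376111) can show up in more than one place. The function names are made
up for this example.

```python
from collections import defaultdict

# (child, parent) pairs read off the listing; each is a "subclass of" edge.
SUBCLASS_OF = [
    ("Q179797", "Q23622"),    # lexicographic thesaurus -> dictionary
    ("Q2376111", "Q179797"),  # synonym dictionary -> lexicographic thesaurus
    ("Q2376111", "Q23622"),   # synonym dictionary -> dictionary
    ("Q897755", "Q23622"),    # explanatory dictionary -> dictionary
    ("Q1820290", "Q23622"),   # learner's dictionary -> dictionary
    ("Q6901667", "Q897755"),  # monolingual learner's dictionary -> explanatory dictionary
    ("Q6901667", "Q1820290"), # monolingual learner's dictionary -> learner's dictionary
]


def descendants(root, edges):
    """All classes reachable from root by following edges downwards."""
    children = defaultdict(set)
    for child, parent in edges:
        children[parent].add(child)
    seen, stack = set(), [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def parents_of(qid, edges):
    """Direct superclasses of qid, sorted for stable output."""
    return sorted(parent for child, parent in edges if child == qid)


print(sorted(descendants("Q23622", SUBCLASS_OF)))
print(parents_of("Q2376111", SUBCLASS_OF))
```

Because a class may have several direct superclasses, the structure is a
graph rather than a strict tree, so a printed tree has to repeat such items.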




Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Amirouche Boubekki

Héllo,

I am very happy about this news.

I am a wiki newbie interested in using Wikidata to do text analysis.
I try to follow the discussion here and on the French Wiktionary.

I take this as an opportunity to try to sum up some concerns that were
raised on the French Wiktionary [0]:

- How will the Wikidata and Wiktionary databases be synchronized?

- Will editing Wiktionary change? The concern is that this will make
editing Wiktionary more difficult for people.

- Also, what about bots? Will bots be allowed/able to edit Wiktionary
pages once Wiktionary is supported by Wikidata?

- Another concern is that an edit made in one Wiktionary may have an
impact on another Wiktionary. People will have trouble reconciling
their opinions if they don't speak the same language. Can an edit in
Wiktionary A break Wiktionary B?

I understand that Wikidata requires new code to support the organisation
of new relations between the data. I understand that with Wikidata it
will be easy to create interwiki links and thesaurus-like pages, but
what else can Wikidata provide to Wiktionary?


[0] https://fr.wiktionary.org/wiki/Projet:Coop%C3%A9ration/Wikidata

Thanks,

i⋅am⋅amz3

--
Amirouche ~ amz3 ~ http://www.hyperdev.fr


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Daniel Kinzler
Am 13.09.2016 um 15:37 schrieb Gerard Meijssen:
> Hoi,
> You assume that it is not good to have lexicological information in our
> existing items. With Wiktionary support you bring such information on board.
> It would be really awkward when for every concept there has to be an item in
> two databases.

It will be two namespaces in the same project.

But we will not duplicate items. The proposed structure is not concept-centered
like OmegaWiki is. It will be centered on lexemes, like Wiktionary is, but
with a higher level of granularity (a lexeme corresponds to one "morphological"
section on a Wiktionary page).

> Why is there this problem with lexicological information and how will the
> current data be linked to the future "Wiktionary-data" information if there
> are to be two databases?

Because "bumblebee" "instance of" "noun" conflicts with "bumblebee"
"instance of" "insect". They can't both be true for the same thing, because
nouns are not insects. One is true for the word, the other is true for the
concept. So they need to be treated separately.
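
A small sketch of that argument, with the word and the concept first mixed
in one record and then kept apart. Everything here is illustrative: the
dictionary layout, the WORD_CLASSES set, the lexeme ID "L1" and the check
function are assumptions for this example, and "insect" is used as a
simplified class in the same way as in the sentence above.

```python
# Classes that describe words rather than things in the world (hypothetical list).
WORD_CLASSES = {"noun", "verb", "adjective"}


def mixes_word_and_concept(entity):
    """True if one entity is classified both as a word and as a non-word thing."""
    classes = set(entity["instance of"])
    return bool(classes & WORD_CLASSES) and bool(classes - WORD_CLASSES)


# One record trying to describe the word and the concept at once.
mixed = {"label": "bumblebee", "instance of": ["noun", "insect"]}

# The same information split over two entities in separate namespaces.
lexeme = {"id": "L1", "lemma": "bumblebee", "instance of": ["noun"]}
item = {"id": "Q25407", "label": "Bombus", "instance of": ["insect"]}

print(mixes_word_and_concept(mixed))   # True
print(mixes_word_and_concept(lexeme))  # False
print(mixes_word_and_concept(item))    # False
```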

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-13 Thread Scott MacLeod
Great news, and thanks!

How will Wiktionary with Wikidata anticipate "voice" and "voice in
translation" (for example in Google Voice or in other parallel projects)?
And how will Wiktionary/Wikidata also anticipate all 7,097 living languages
(Ethnologue) / 7,943 entries under languages (Glottolog), also with regard to
machine translation? (I'll look at the plan more closely about this.)

Thank you,
Scott




Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-14 Thread Léa Lacroix
Hello,

Thanks a lot for your questions and feedback. Here are some answers; I
hope they will be useful.

*- How will the Wikidata and Wiktionary databases be synchronized?*
New entity types will be created in the Wikidata database, with new IDs (e.g. L
for lexemes). Each Wiktionary will be able to include data from
Wikidata in its pages (the complete entity or only some chosen
statements, as the community decides).
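
As a rough illustration of the ID scheme, the sketch below tells entity
types apart by the letter prefix of their ID. Q (items) and P (properties)
are the prefixes Wikidata already uses; L for lexemes is the prefix
proposed above. The function name and the validation rule are assumptions
for this example, not an existing API.

```python
import re

# Known ID prefixes; "L" is the one proposed for lexemes.
PREFIXES = {"Q": "item", "P": "property", "L": "lexeme"}
ID_PATTERN = re.compile(r"([A-Z])([1-9][0-9]*)")


def entity_type(entity_id):
    """Return the entity type for an ID such as 'Q25407' or 'L42'."""
    match = ID_PATTERN.fullmatch(entity_id)
    if match is None or match.group(1) not in PREFIXES:
        raise ValueError(f"not a recognised entity ID: {entity_id!r}")
    return PREFIXES[match.group(1)]


print(entity_type("Q25407"))  # item
print(entity_type("L42"))     # lexeme
```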

*- Will editing Wiktionary change?*
Yes, changes will happen, but we're working on making editing Wiktionary
easier. As soon as we can provide some mockups, we will share them with you
to collect feedback.

*- Will bots be allowed/able to edit Wiktionary pages once Wiktionary is
supported by Wikidata?*
Yes, of course. We want the data to be machine-readable and editable, and
with the new structure, bots will be able to edit data stored in Wikidata
and will still be able to edit Wiktionary pages.

*- Can an edit in Wiktionary A break Wiktionary B?*
Data about lexemes will be stored on Wikidata, and each Wiktionary will choose
whether it wants to use this data, which part of it, and how. Yes, if a
piece of information stored in Wikidata and displayed on several Wiktionaries
is modified, via Wikidata or a Wiktionary interface, this will affect all the
pages where the information is included.
Because Wikidata is a multilingual project, we already have to deal with
the language issue, and we hope that as the number of editors coming from
Wikidata and the Wiktionaries grows, it will become easier to
find people with at least one common language to communicate between the
different projects.
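
To make the propagation concrete, here is a minimal sketch of a shared
store whose data is displayed by several Wiktionaries, each choosing which
parts to show. The store layout, the lexeme ID "L1", the field names and
the per-wiki selection are all hypothetical; this is not how the inclusion
mechanism is implemented.

```python
# Central store standing in for Wikidata (hypothetical ID and field names).
store = {
    "L1": {"lemma": "chair", "category": "noun", "translation_de": "Stuhl"},
}

# Each Wiktionary decides which fields of a lexeme it displays.
SHOWN_FIELDS = {
    "fr": ["lemma", "category"],
    "de": ["lemma", "translation_de"],
}


def render(wiki, lexeme_id):
    """The part of a lexeme that a given Wiktionary displays."""
    entity = store[lexeme_id]
    return {name: entity[name] for name in SHOWN_FIELDS[wiki]}


def wikis_affected_by(field_name):
    """Wiktionaries whose pages change when this field is edited centrally."""
    return sorted(w for w, names in SHOWN_FIELDS.items() if field_name in names)


print(render("fr", "L1"))
print(render("de", "L1"))
print(wikis_affected_by("lemma"))           # ['de', 'fr']
print(wikis_affected_by("translation_de"))  # ['de']
```

An edit only reaches the pages that include the edited piece of data, so
how much one community's edit affects another depends on what each
Wiktionary chooses to include.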

*- What else can Wikidata provide to Wiktionary?*
Machine-readable data will allow users to create new tools, useful for
editors, based on the communities' needs. By helping the different
communities (Wiktionaries and Wikidata) work together on the same
project, we expect growth in the number of people editing the
lexicographical data, providing more review and better data quality.
Finally, once centralized and structured, the data will be easily
reusable by third parties, other websites or applications, and will give
better visibility to the volunteers' work.


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-14 Thread Daniel Kinzler
Am 14.09.2016 um 10:51 schrieb Léa Lacroix:
> /- What else can Wikidata provide to Wiktionary?/
> Machine-readable data will allow users to create new tools, useful for
> editors, based on the communities' needs. By helping the different
> communities (Wiktionaries and Wikidata) work together on the same project,
> we expect growth in the number of people editing the lexicographical data,
> providing more review and better data quality. Finally, once centralized and
> structured, the data will be easily reusable by third parties, other websites
> or applications, and will give better visibility to the volunteers' work.

Here are some examples of things that will become possible with the new 
structure:

* the fact that the English word "sleeper" may refer to a railway tie, and in
which regions this is the case, only has to be entered once, not separately in
each Wiktionary.

* the fact that "Stuhl" is the German translation of (a specific sense of) the
English word "chair" only has to be entered once, not separately in each 
Wiktionary.

* by connecting lexeme-sense to concepts (items), it will become possible to
automatically search for potential synonyms and translations to other languages.

* by providing a statement defining the morphological class of a lexeme, it
becomes possible to automatically generate derived forms for display and search

* different representations (spellings, scripts) of a lexeme can be covered by a
single entry, information about word senses does not have to be repeated.

* the search interface will know about languages and word types, so you can
search specifically for "French verb dormir" (or perhaps, more technically,
"lang:fr a:Q24905 dormir")

* Similarly, you can search for or filter by epoch, region, linguistic
convention or methodology, etc.
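
The search example in the list above could work along these lines. The
query syntax is only the tentative one from the message ("lang:fr a:Q24905
dormir"), and everything else is an assumption for illustration: the
record layout, the function names, and the use of Q1084 as the item for
"noun". Q24905 is taken to mean "verb", as the message implies.

```python
# Toy lexeme records; "category" holds the item ID of the word type.
LEXEMES = [
    {"lemma": "dormir", "lang": "fr", "category": "Q24905"},
    {"lemma": "dormir", "lang": "es", "category": "Q24905"},
    {"lemma": "sleeper", "lang": "en", "category": "Q1084"},  # assumed: noun
]


def parse_query(query):
    """Split a query into key:value filters and plain search terms."""
    filters, terms = {}, []
    for token in query.split():
        key, sep, value = token.partition(":")
        if sep and key in ("lang", "a"):
            filters[key] = value
        else:
            terms.append(token)
    return filters, terms


def search(query, lexemes):
    """Lexemes whose lemma is among the terms and which pass every filter given."""
    filters, terms = parse_query(query)
    return [
        lx for lx in lexemes
        if lx["lemma"] in terms
        and filters.get("lang", lx["lang"]) == lx["lang"]
        and filters.get("a", lx["category"]) == lx["category"]
    ]


print(search("lang:fr a:Q24905 dormir", LEXEMES))
print(len(search("dormir", LEXEMES)))  # 2
```

Without filters the plain term matches the lemma in every language; the
"lang:" and "a:" filters narrow the result to one language and word type.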


> - Will editing Wiktionary change?
> Yes, changes will happen, but we're working on making editing Wiktionary
> easier. As soon as we can provide some mockups, we will share them with you
> to collect feedback.

The question is if you consider editing wikitext with complex nested templates
"easy" or not. With wikidata, editing would be form-based, with input fields and
suggestions. This makes it a lot easier especially for new editors. And even for
experienced editors, I think it's more convenient for editing individual bits of
information.

The form-based approach is less convenient when you want to enter a lot of
information at once. The solution is to identify the use cases for this, and
provide a specialized interface for that use case. This does not have to depend
on Wikibase developers, it can also be done by wiki users using gadgets,
Labs-based tools, or even bots.


> Because Wikidata is a multilingual project, we already have to deal with the
> language issue, and we hope that as the number of editors coming from
> Wikidata and the Wiktionaries grows, it will become easier to find people
> with at least one common language to communicate between the different
> projects.

Interestingly, we found that on wikidata there is rarely a conflict about
whether a statement about an item should say X or Y, e.g. whether Chelsea
Manning's gender should be given as "transgender female" or just "female" or
even "male". The conflict does not arise because you can and should simply add
all three, and use qualifiers and source references to specify who claimed which
of these, and for which period of time.
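
The idea of recording several claims side by side, each with qualifiers
and sources, can be sketched as below. The Statement class, the property
name "P_example", the placeholder values X, Y and Z, the years and the
source names are all invented for this example; only the general shape
(value plus qualifiers plus references) follows the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)  # e.g. start / end year
    references: list = field(default_factory=list)  # who claimed it


statements = [
    Statement("P_example", "X", {"start": 2001, "end": 2010}, ["source A"]),
    Statement("P_example", "Y", {"start": 2010}, ["source B"]),
    Statement("P_example", "Z", {}, ["source C"]),  # no period given
]


def values_at(statements, prop, year):
    """All values claimed for prop that are not excluded for the given year."""
    result = []
    for s in statements:
        if s.prop != prop:
            continue
        start = s.qualifiers.get("start")
        end = s.qualifiers.get("end")
        if (start is None or start <= year) and (end is None or year < end):
            result.append((s.value, s.references))
    return result


print(values_at(statements, "P_example", 2005))
print(values_at(statements, "P_example", 2015))
```

No claim overwrites another; a reader or tool picks among them using the
qualifiers and references.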

Long discussions do take place about the overall organization of information on
Wikidata, about which properties to have and how to use them, and about whether
substances like "ethanol" should be considered subclasses or instances of
classes like "alcohol".

I agree, however, that cross-lingual discussions are indeed an issue, and
finding techniques and strategies to help with communication between the
speakers of different languages will be a challenge. But isn't the Wiktionary
community perfectly equipped for just that challenge? Isn't it just the crowd
you would ask if you had to solve a problem like this? I would (along perhaps
with the folks from translatewiki.net).


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-14 Thread Marco Fossati

Hi everyone,

FYI, there is an ongoing Wikimedia IEG project (main grantee in CC), 
which seems to be following a related direction:

https://meta.wikimedia.org/wiki/Grants:IEG/A_graphical_and_interactive_etymology_dictionary_based_on_Wiktionary

Its first phase will translate Wiktionary into machine-readable data.
I think it is worth considering reusing its outcome if possible, since
it may fit into the Wikidata data model:

https://meta.wikimedia.org/wiki/Grants_talk:IEG/A_graphical_and_interactive_etymology_dictionary_based_on_Wiktionary#Translation_to_the_Wikidata_Data_Model

Cheers,

Marco


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-15 Thread Jan Berkel

> *- How wikidata and wiktionary databases will be synchronized?*
> New entity types will be created in Wikidata database, with new ids
> (ex. L for lexemes). A Wiktionary will have the possibility to include
> data from Wikidata in their pages (the complete entity or only some
> chosen statements, as the community decides)

The pdf mentions 4 new entity types: Lexeme, Statement, Form, Embedded
(?).  Curious, was the existing data model not flexible enough?

Will these new entities be restricted to the usage in a lexicographical
context, i.e. Wiktionary? How will they fit into the existing data
model, will there be links from existing Wikidata items to the new
entities? (i.e. how will Wikidata benefit from the new data?)

> *- Will editing Wiktionary change?*
> Yes, changes will happen, but we're working on making editing
> Wiktionary easier. As soon as we can provide some mockups, we will share
> them with you to collect feedback.

Making contributing to Wiktionary easier will be a huge help. Right now
the learning curve is extremely steep and is turning away potential
contributors.

One thing to keep in mind is that Wiktionary is more than just the
content in the page namespace. A big part of what you see is actually
generated dynamically, for example transliteration, pronunciation and
grammatical forms (conjugations, plurals etc).

I imagine that in an integrated Wikidata/Wiktionary world, "content" and code
will live in various places, and we'll have a range of automated processes
to copy things back and forth, and to automatically create new entries
derived from existing ones?

– Jan


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-15 Thread Gerard Meijssen
Hoi,
Please understand that for every label for a current item in Wikidata there
should be one lexeme. It would be really helpful if all the new lexemes
added were associated with labels. You would then be able to show an item
with the conjugation that is preferred for a language. Currently this is not
our practice.

When we associate labels with lexemes, we in fact gain missing
functionality, such as indicating that a specific lexeme was preferred up to a
point in time. It allows people to understand where "Batavia" was and why you
will not find "Jakarta" in certain papers.
Thanks,
  GerardM



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-16 Thread Denny Vrandečić
Yes, there should be some connection between items and lexemes, but I am
still hazy about the details of what exactly this should look like. If someone
could actually make a strawman proposal, that would be great.

I think the connection should live in the statement space, and not be on
the level of labels, but that is just a hunch. I'd be happy to see
proposals incoming.



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-16 Thread Daniel Kinzler
On 16.09.2016 at 19:41, Denny Vrandečić wrote:
> Yes, there should be some connection between items and lexemes, but I am still
> hazy about details on how exactly this should look like. If someone could
> actually make a strawman proposal, that would be great.
> 
> I think the connection should live in the statement space, and not be on the
> level of labels, but that is just a hunch. I'd be happy to see proposals 
> incoming.

My thinking is this:

On some Sense of a Lexeme, there is a Statement saying that this Sense refers to
a given concept (Item). If the property for stating this is well-known, we can
track the Sense-to-Item relationship in the database. We can then automatically
show the lexeme's lemma as a (pseudo-)alias on the Item, and perhaps also use it
(and maybe all forms of the lexeme!) for indexing the item for search.  So:

  from ( Lexeme - Sense - Statement -> Item )
  we can derive ( Item -> Lexeme - Forms )

In the beginning of Wikidata, I was very reluctant about the software knowing
about "magic" properties. Now I feel better about this, since wikidata
properties are established as a permanent vocabulary that can be used by any
software, including our own.
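
To make the derivation above concrete, here is a minimal Python sketch. It
is only an illustration under stated assumptions: the property name
`REFERS_TO`, the dictionary layout and all ids (L1, L2, Q100) are
hypothetical and are not part of the proposal.

```python
from collections import defaultdict

# Hypothetical "well-known" property linking a Sense to an Item.
REFERS_TO = "refers to"

# Toy lexemes; the layout and all ids are made up for illustration.
lexemes = [
    {
        "id": "L1",
        "lemma": "Jakarta",
        "forms": ["Jakarta", "Jakarta's"],
        "senses": [{"id": "L1-S1", "statements": [(REFERS_TO, "Q100")]}],
    },
    {
        "id": "L2",
        "lemma": "Batavia",
        "forms": ["Batavia", "Batavia's"],
        "senses": [{"id": "L2-S1", "statements": [(REFERS_TO, "Q100")]}],
    },
]


def derive_item_index(lexemes):
    """Invert (Lexeme - Sense - Statement -> Item) into (Item -> Lexeme - Forms)."""
    index = defaultdict(lambda: {"aliases": [], "search_terms": set()})
    for lexeme in lexemes:
        for sense in lexeme["senses"]:
            for prop, target in sense["statements"]:
                if prop != REFERS_TO:
                    continue  # only the well-known property is tracked
                entry = index[target]
                # The lemma becomes a (pseudo-)alias of the Item.
                if lexeme["lemma"] not in entry["aliases"]:
                    entry["aliases"].append(lexeme["lemma"])
                # All forms become candidate terms for the search index.
                entry["search_terms"].update(lexeme["forms"])
    return dict(index)


item_index = derive_item_index(lexemes)
```

The index is derived rather than stored on the Item, so in this sketch the
Sense statements remain the only place where the relationship is edited.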

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-16 Thread Thad Guidry
Denny,

I would suggest using https://en.wiktionary.org/wiki/product for that
strawman proposal, because it has 2 levels of senses:
  3. Anything that is produced (contains 6 sub-senses)

To apply Daniel's thinking... I wonder...

his Lexeme Sense = Wordnet Sense ?
his Item -> Lexeme - Forms = Wordnet Synset ?

Thad
+ThadGuidry 


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-16 Thread Daniel Kinzler
On 16.09.2016 at 20:11, Thad Guidry wrote:
> Denny,
> 
> I would suggest to use https://en.wiktionary.org/wiki/product as that strawman
> proposal.  Because it has 2 levels of Senses.
>   3. Anything that is produced (contains 6 sub-senses)

Modelling sub-senses is a completely different can of worms. The proposed model
doesn't allow this directly (we try to avoid recursive structures), but it can
be done using statements.

Your example doesn't really say anything about how lexemes could be connected to
items as labels/aliases, which is, I believe, what Gerard and Denny were
discussing.


My usage of "Sense" and "Form" follows

which in turn follows the LEMON model.

Synsets are not directly modeled, but it's possible to construct them via
statements.

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-16 Thread Daniel Kinzler
Quick clarification:

On 15.09.2016 at 17:40, Jan Berkel wrote:
> The pdf mentions 4 new entity types: Lexeme, Statement, Form, Embedded (?).

"Embedded" isn't a separate type, it refers to the fact that Senses and Forms
are stored on the same page as "their" Lexeme. "Statement" isn't an entity, I
assume you meant to write "Sense".

>  Curious, was the existing data model not flexible enough?

It was not expressive enough, no. It would be possible to use items to model
lexemes, but it would be very annoying and complicated: you would need separate
items for each form and sense, and would need to keep track of them for deletion,
undeletion, etc.

> Will these new entities be restricted to the usage in a lexicographical
> context, i.e. Wiktionary?

They will also be accessible from Wikipedia and other wikis.

> How will they fit into the existing data model, will there be
> links from existing Wikidata items to the new entities? (i.e. how will
> Wikidata benefit from the new data?)

Yes, there will be cross-linking.

> I imagine in an integrated Wikidata/Wiktionary world "content" and code lives
> in various places, and we'll have a range of automated processes to copy things
> back and forth, and to automatically create new entries derived from existing
> ones?

It would be transcluded and generated, as with templates and Lua, rather than
copied by bots.


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-16 Thread Denny Vrandečić
Yes, that definitely is one promising approach (and I hope that we would
make a rough impact analysis before deciding on it and implementing it,
once the structures and data are there).

I wonder if there are other approaches that are somehow more subtle. But I
cannot express what I am looking for, and maybe yours is already
sufficiently close to optimal.




Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-16 Thread Thad Guidry
Daniel,

I wasn't trying to help solve the issues - I'll be quiet now :)

I was helping to expose one of your test cases :)

'product' is a lexeme - a headword - a basic unit of meaning that has a
'set of forms' and those have 'a set of definitions'.
Here are some of its forms:
 1. product
 2. products
 3. producing
 4. production
 5. ...

Here are some definitions of PRODUCT:
 1. product - a commodity offered for sale.
 2. product - a quantity obtained by multiplication of two or more
    numbers.
 3. product - anything that is produced.

But a thought just occurred to me...
A. One way to model this would perhaps be to have those headwords stored
in Wikidata.  Those headwords ideally would not actually be a Q or a P ...
but what about instead ... L  ?  Wrapping the graph structure itself ?
Pros / Cons ?

B.  Or do we go with Daniel's suggestion of linking out to headwords and
not actually storing them in Wikidata ?  Pros / Cons ?

Thad
+ThadGuidry 


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-18 Thread Gerard Meijssen
Hoi,
It does make sense to link it to labels for many reasons.
* Every word can safely exist in a dictionary, in Wikidata. Typically
names are not included in a dictionary, but they could be.
* Labels are written based on the rules and exceptions of a language. These
are different in every language and sometimes even within dialects. The way
content is displayed in a dictionary differs from country to country as
well.
* Concepts are not very stable, and the way a label is written in a language
is not stable either. The one thing that binds them is the label; often a
change in a label and a change in the concept go together.
* Synonyms differ in label, not in concept.

I REALLY wonder why you think you can do this with statements. In my
opinion you cannot do that without sacrifice.
Thanks,
  GerardM



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-19 Thread Daniel Kinzler
On 16.09.2016 at 20:46, Thad Guidry wrote:
> Daniel,
>
> I wasn't trying to help solve the issues - I'll be quiet now :)
>
> I was helping to expose one of your test cases :)

Ha, sorry for sounding harsh, and thanks for pointing me to "product"! It's a
good test case indeed.

> 'product' is a lexeme - a headword - a basic unit of meaning that has a 'set
> of forms' and those have 'a set of definitions'

In the current model, a Lexeme has forms and senses. Forms don't have senses
directly; the meanings should apply to all forms. This means lexemes have to be
split with higher granularity:

* product (English noun) would be one lexeme, with "products" being the plural
form, "product's" the genitive, and "products'" the plural genitive. Senses
include the ones you mentioned.
* (to) produce (English verb) would be another lexeme, with forms like
"produces", "produced", "producing", etc., and senses meaning "to create", "to
show", "to make available", etc.
* production (English noun) would be another lexeme, with other forms and
senses.
* produce (English noun) would be another
* producer (English noun) would be another
* produced (English adjective) another
etc...

These lexemes can be linked using some kind of "derived from" statements.
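
As a rough sketch of the split described above, the structure could be
written down in Python as follows. The class and field names, the ids (L1,
L2, L3) and the property name "derived from" are assumptions made for
illustration only; they are not taken from the proposal or from any
implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Form:
    representation: str
    features: tuple = ()  # grammatical features, e.g. ("plural",)


@dataclass
class Sense:
    gloss: str  # natural-language description of one meaning


@dataclass
class Lexeme:
    id: str
    lemma: str
    language: str
    category: str  # lexical category, e.g. "noun"
    forms: list = field(default_factory=list)
    senses: list = field(default_factory=list)
    statements: list = field(default_factory=list)  # (property, target id)


# One lexeme per (lemma, category); senses belong to the lexeme, not to a form.
product = Lexeme(
    "L1", "product", "English", "noun",
    forms=[
        Form("product", ("singular",)),
        Form("products", ("plural",)),
        Form("product's", ("singular", "genitive")),
        Form("products'", ("plural", "genitive")),
    ],
    senses=[
        Sense("a commodity offered for sale"),
        Sense("a quantity obtained by multiplication"),
        Sense("anything that is produced"),
    ],
)

produce_verb = Lexeme(
    "L2", "produce", "English", "verb",
    forms=[Form("produces"), Form("produced"), Form("producing")],
    senses=[Sense("to create"), Sense("to show"), Sense("to make available")],
)

producer = Lexeme("L3", "producer", "English", "noun")

# Related lexemes are linked by statements, not by nesting them.
producer.statements.append(("derived from", produce_verb.id))
```

Keeping the structure flat (no lexeme inside another lexeme) mirrors the
point that relations between lexemes are expressed through statements.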

> ​But a thought just occured to me...
> A. In order to model this perhaps would be to have those headwords stored in
> Wikidata.  Those headwords ideally would not actually be a Q or a P ... but 
> what
> about instead ... L​  ?  Wrapping the graph structure itself ?  Pros / Cons ?

That's the plan, yes: have lexemes (L...) on Wikidata, which wrap the structure
of forms and senses, and have statements for the lexeme, as well as for each form
and each sense.

We don't currently plan a "super-structure" for wrapping derived/related lexemes
(product, produce, production, etc). They would just be inter-linked by
statements.

> B.  or do we go with Daniel's suggestion of linking out to headwords and not
> actually storing them in Wikidata ?  Pros / Cons ?

The link I suggest is between items (Q...) and lexemes (L...), both on Wikidata.

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-19 Thread Denny Vrandečić
And just to point out - even though there are no plans to accommodate the
superstructures in the data model directly, it should be noted that the
current data model is already flexible enough to allow them, i.e. if the
community so wishes, they can create Lexemes which represent the "root" of a
word like "produc-" and then explicitly link these with statements from the
Lexemes for "production", "producer", etc. Or not. It could instead try to
model it with statements pertaining to the etymology of the words. Or not.

The Wiktionary data model is not supposed to express a specific theory of
linguistics, just as the Wikidata data model is not supposed to express a
specific theory of ontology. It is supposed to be flexible enough to work
with whatever the community decides it wants to express, sometimes even
contradictory statements, and to provide the ability to source them to
references.





Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-19 Thread Thad Guidry
Denny,

Ah, very cool.  So it's currently supported just by the flexible nature of
Wikidata's backing triplestore, Blazegraph, and its generic graph structure;
I assume that is what you mean.

So just having statements perform the linking to Lexemes that are just Q
items themselves, but with a special statement that says... "I am not an
entity, but instead a Lexeme".

Can you or Daniel start with those few lexemes for 'Product' that Daniel and
I mentioned, perhaps in Labs or somewhere, so that all of us can begin to
see how this might work using statements ?

Thad
+ThadGuidry 


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-19 Thread Denny Vrandečić
No, sorry, that is not what I meant. When I said "current data model" I
meant the "currently proposed data model". Sorry for being sloppy.

So it assumes new entity types for Lexemes, which are not just special
forms of Items. And it is not reliant on an underlying graph model.

Daniel already sketched out what 'product' may look like. Since the current
implementation does not support Lexemes, I cannot just put it on Labs.

But there could be a Lexeme "produc-" which is pointed to from Daniel's
Lexemes for "to produce", "production", "producer", etc., which could point
to "produc-" via a statement, say, "root word" or similar. In the end, it
really is unclear whether that is correct or not, but it sure is a
possibility that can be represented with the currently proposed data model.
Which properties exist, how they are linked to each other, etc., is all up
to the collaborative decisions which the community has to make.
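
For illustration only, here is a small Python sketch of how such "root word"
statements could be read back to group lexemes into word families. The
lexeme ids, the triple layout and the property names are hypothetical;
which properties actually exist would be up to the community.

```python
from collections import defaultdict

# Hypothetical property name; nothing like it is defined yet.
ROOT_WORD = "root word"

# (subject lexeme, property, target lexeme) with made-up ids.
statements = [
    ("L-produce", ROOT_WORD, "L-produc-"),
    ("L-production", ROOT_WORD, "L-produc-"),
    ("L-producer", ROOT_WORD, "L-produc-"),
    ("L-producer", "derived from", "L-produce"),
]


def word_families(statements, prop=ROOT_WORD):
    """Group lexemes by the root lexeme they point to via `prop`."""
    families = defaultdict(list)
    for subject, p, target in statements:
        if p == prop:
            families[target].append(subject)
    return dict(families)


families = word_families(statements)
```

Nothing in the model itself knows about "families" here; the grouping exists
only because the statements were chosen that way.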




Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-21 Thread Eric Scott
A substantial amount of work in the LOD community seems to have gone 
into Ontolex:


https://www.w3.org/community/ontolex/wiki/Final_Model_Specification

Is there any concern with aligning WD's model to this standard?

Cheers,

Eric Scott



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-21 Thread Daniel Kinzler
On 21.09.2016 at 19:23, Eric Scott wrote:
> A substantial amount of work in the LOD community seems to have gone into
> Ontolex:
>
> https://www.w3.org/community/ontolex/wiki/Final_Model_Specification
>
> Is there any concern with aligning WD's model to this standard?

Thanks for pointing to this!

From a first look, the models seem to roughly align:

What we call a "Lexeme" corresponds to a "Lexical Entry" in ontolex.
What we call a "Form" corresponds to a "Form" in ontolex.
What we call a "Sense" corresponds to a "Lexical Sense & Reference" in ontolex,
although in ontolex, a reference to a Concept is required, while in our model
that reference would be optional, but a natural language gloss is required.

So the models seem to match fine on a conceptual level. Perhaps someone with
more expertise in RDF modeling can provide a more detailed analysis.
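
As a starting point for such an analysis, the correspondence above could be
sketched in Python as plain triples. This is a minimal sketch, not an
agreed mapping: the `BASE` namespace, the input dictionary layout and the
ids are hypothetical, and using `skos:definition` for the gloss is an
assumption. The OntoLex terms used (`LexicalEntry`, `Form`, `LexicalSense`,
`lexicalForm`, `writtenRep`, `sense`, `reference`) are taken from the
OntoLex vocabulary.

```python
ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"  # placeholder namespace, not a real endpoint


def lexeme_to_triples(lexeme):
    """Map a toy lexeme dict to OntoLex-style (subject, predicate, object) triples."""
    lex = BASE + lexeme["id"]
    triples = [(lex, RDF_TYPE, ONTOLEX + "LexicalEntry")]
    for i, written in enumerate(lexeme["forms"], start=1):
        form = f"{lex}-F{i}"
        triples.append((lex, ONTOLEX + "lexicalForm", form))
        triples.append((form, RDF_TYPE, ONTOLEX + "Form"))
        triples.append((form, ONTOLEX + "writtenRep", written))
    for i, sense in enumerate(lexeme["senses"], start=1):
        node = f"{lex}-S{i}"
        triples.append((lex, ONTOLEX + "sense", node))
        triples.append((node, RDF_TYPE, ONTOLEX + "LexicalSense"))
        # The gloss is required in our model; skos:definition is an assumption.
        triples.append((node, SKOS + "definition", sense["gloss"]))
        # The reference to a concept is optional in our model.
        if sense.get("item"):
            triples.append((node, ONTOLEX + "reference", BASE + sense["item"]))
    return triples


example = {
    "id": "L1",
    "forms": ["product", "products"],
    "senses": [
        {"gloss": "a commodity offered for sale", "item": "Q100"},
        {"gloss": "anything that is produced"},
    ],
}
triples = lexeme_to_triples(example)
```

The second sense has no `item`, which is where the two models differ:
OntoLex would expect a reference there, while this sketch simply omits it.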

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-21 Thread Ester Pantaleo
The DBnary project "Wiktionary as Linguistic Linked Open Data" at:

http://kaiko.getalp.org/about-dbnary/
http://kaiko.getalp.org/sparql

can be used as a reference.

I am basing my IEG project to visualize etymologies from Wiktionary on it:

https://meta.wikimedia.org/wiki/Grants:IEG/A_graphical_and_interactive_etymology_dictionary_based_on_Wiktionary

Cheers,

Ester

On Wed, Sep 21, 2016 at 7:39 PM, Daniel Kinzler  wrote:

> Am 21.09.2016 um 19:23 schrieb Eric Scott:
> > A substantial amount of work in the LOD community seems to have gone
> into Ontolex:
> >
> > https://www.w3.org/community/ontolex/wiki/Final_Model_Specification
> >
> > Is there any concern with aligning WD's model to this standard?
>
> Thanks for pointing to this!


Re: [Wikidata] Let's move forward with support for Wiktionary

2016-09-21 Thread Denny Vrandečić
It is no coincidence that the Wikidata Wiktionary data model and OntoLex
fit well together: the Wikidata model was informed by the Lemon data model
and follows it closely, and OntoLex is also rooted in Lemon.

It's good that both built on the same solid results from linguistics ;)

