Re: [Wikidata] Goal: Establish a framework to engage with data engineers and open data organizations

2015-07-01 Thread Quim Gil
Thank you very much for this quick and very diverse wave of feedback. I'm
trying to keep the description of https://phabricator.wikimedia.org/T101950
up to date, and you are welcome to edit too.

The deliverable of this goal is basically documentation, so the first
question is where that documentation should live. Somewhere under
https://www.wikidata.org/wiki/Wikidata: ? In Meta? In Outreach? Elsewhere?
As soon as we have an answer we can start documenting on-wiki, business as
usual.

We need a name for this "framework". It can be a working name used for
practical reasons. I'm starting to use Wiki Loves Open Data / WLOD, but
only in my brain (this is the first time that I write it down). Please
propose better alternatives if you have them.

For now we will keep the discussion here, but people willing to get
involved as a contributor / stakeholder should get ready to follow the work
in Phabricator. We will start with subtasks blocking T101950, moving to a
dedicated project only if needed.

Some replies to feedback received:

* If you are involved or in touch with an open data organization, you are
encouraged to add yourselves to "Organizations interested" at
https://phabricator.wikimedia.org/T101950.

* Having real cases with real problems is the best way to focus on what
really matters. There is a section "The problem" where we should list the
problems that are blocking or hindering fruitful contributions to
Wikidata.

* Where to publish entire datasets... Something tells me that this is not
the most urgent and important problem that we have, but the community
definitely knows better, so correct me if I'm wrong. I think our main use
case is organizations having certain types of data that would be great to
have in Wikipedia via Wikidata. Whatever is the most urgent problem, we
need to be very selective in the scope we want to address in this quarterly
goal.

* The experience of maintainers of current projects like ProteinBoxBot is
very useful. Please do not hesitate to go into detail, create
subtasks to address specific problems, etc.

* Scott, I have the impression that we need to define this framework a bit
before projects like yours can jump in and find their ways for
contributing. I don't have a good reply to you right now, and this goal is
precisely about offering a framework useful to you, useful to Wikimedia.

Looking forward to your feedback and to start editing on a wiki page soon.

On Wed, Jul 1, 2015 at 8:09 PM, Info WorldUniversity <
i...@worlduniversityandschool.org> wrote:

> Hi Quim, Sylvia, Lydia and Wikidatans,
>
> In terms of strategic partnerships with CC Wikidata, and this Wikidata
> Engineering Community project looking for -
>
>  "* people that have been in touch with organizations willing to
> contribute their open data" -
>
> in what ways could CC World University and School contribute our early CC
> Nation State Universities, e.g. Mexico World University and School -
> http://worlduniversity.wikia.com/wiki/Mexico - (with only the CC MIT OCW
> in Spanish so far, and not yet in Spanish otherwise)? See also, for
> example, the Nation States' wiki subject page at WUaS -
> http://worlduniversity.wikia.com/wiki/Nation_States. Each wiki
> Nation-State University at CC WUaS, seeks to accredit in each country in
> their main languages to offer online CC baccalaureate, Ph.D., Law and M.D.
> degrees, and also I.B. high school degrees, if possible, based on CC MIT
> OCW in 7 languages and CC Yale OYC, to begin.
>
> In addition, CC World University would like to facilitate wiki / Wikidata
> schools for open teaching and learning in each of all 7,929+ languages. As
> planned interlingual wiki schools for highest quality universal education,
> CC World University and School would like to explore developing in
> Wikidata, and become a growth story also for Wikidata. World University
> would thus like to explore contributing our open data in these regards.
>
> In what ways might WUaS best contribute in terms of a strategic
> partnership re this Wikidata Engineering Community project? Perhaps
> World University could become a part of the "community framework allowing
> Wikidata content and tech contributors, data engineers, and open data
> organizations to collaborate effectively." Thank you.
>
> Best,
> Info (Scott)
>
>
>
> On Wed, Jul 1, 2015 at 9:48 AM, Benjamin Good 
> wrote:
>
>> Quim,
>>
>> I'm not familiar with GLAM or what you are really asking for here.  Could
>> you elaborate a little?  Our group is actively engaged in writing bots for
>> populating wikidata with trusted biomedical information and for using that
>> information to drive applications such as Wikipedia.  Processes for making
>> this easier would be most welcome.  A lot of what we are doing and hoping
>> to do is described on this bot page:
>> https://www.wikidata.org/wiki/User:ProteinBoxBot
>>
>> ?
>> -Ben
>>
>>
>>
>> On Wed, Jul 1, 2015 at 9:23 AM, David Cuenca Tudela 
>> wrote:
>>
>>> Hello Quim,
>>>
>>> Th

Re: [Wikidata] calendar model screwup

2015-07-01 Thread Peter F. Patel-Schneider
Thanks.  That helps a lot.  Is that the way that things are going to be done
in the future, i.e., dates will be stored using the specified calendar model
instead of being converted?

peter


On 07/01/2015 10:52 AM, Denny Vrandečić wrote:
> Peter,
> 
> you might be looking for this:
> 
> https://www.mediawiki.org/wiki/Wikibase/DataModel#Dates_and_times
> 
> Cheers,
> Denny
> 
> On Wed, Jul 1, 2015 at 9:48 AM Peter F. Patel-Schneider
> <pfpschnei...@gmail.com> wrote:
> 
> Thanks.
> 
> This helps in finding out how to reproduce the numbers.
> 
> However, I'm still confused as to how these bits of data are part of the
> Wikidata data/knowledge model.  Where is the description of
> getPreferredCalendarModel, for example?
> 
> 
> http://javadox.com/org.wikidata.wdtk/wdtk-datamodel/0.1.0/org/wikidata/wdtk/datamodel/interfaces/TimeValue.html
>  Is a *partial* description of what is going on.  Changes to this document
> would be somewhat useful.  However, what I'm really looking for is a
> description of how time works in Wikidata.
> 
> peter
> 
> PS:  I note that there are lots of aspects of TimeValue that are only
> suitable for the Gregorian and Julian calendars.
> 
> 
> 
> On 07/01/2015 09:24 AM, Markus Krötzsch wrote:
> > On 01.07.2015 18:03, Peter F. Patel-Schneider wrote: ...
> >>
> >> Even the very nice email from Markus that gives numbers does not
> >> provide any information on where the numbers come from.
> >
> > I just ran a simple Java program based on Wikidata Toolkit to count the
> > date values. The features I used for counting are all part of the data
> > (concretely I accessed: year number, precision, and calendar model). I
> > used the JSON dump of 22 June 2015. The program counted all dates that
> > occur in any place (main values of statements, qualifiers, and
> > references). No other special processing was done.
> >
> > Below is the main code snippet that did the counting, in case my
> > description was too vague. If you want to get your own numbers, it does
> > > not require much (I just modified one of the example programs in Wikidata
> > > Toolkit that gathers general statistics). Running the code took about
> > 25min on my laptop (the initial dump download took longer though). The
> > SPARQL endpoint at https://wdqs-beta.wmflabs.org/ should also return
> > > useful counts if it does not time out on the very large numbers. It uses
> > > live data.
> >
> > Best regards,
> >
> > Markus
> >
> >
> > > // after determining that snak is of appropriate type:
> > > String cm = ((TimeValue) ((ValueSnak) snak).getValue())
> > >         .getPreferredCalendarModel();
> > > if (TimeValue.CM_GREGORIAN_PRO.equals(cm)) {
> > >     this.countGregDates++;
> > > } else if (TimeValue.CM_JULIAN_PRO.equals(cm)) {
> > >     this.countJulDates++;
> > > } else {
> > >     System.err.println("Weird calendar model: "
> > >             + ((ValueSnak) snak).getValue());
> > > }
> > >
> > > if (((TimeValue) ((ValueSnak) snak).getValue()).getPrecision()
> > >         <= TimeValue.PREC_MONTH) {
> > >     return;
> > > }
> > >
> > > long year = ((TimeValue) ((ValueSnak) snak).getValue()).getYear();
> > > if (year >= 1923) {
> > >     this.countModernDates++;
> > > } else if (year >= 1753) {
> > >     this.countAlmostModernDates++;
> > > } else if (year >= 1582) {
> > >     this.countTransitionDates++;
> > > } else {
> > >     this.countOldenDates++;
> > > }
> >
> >
> > ___ Wikidata mailing list
> > Wikidata@lists.wikimedia.org 
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
> 


Re: [Wikidata] calendar model screwup

2015-07-01 Thread Markus Krötzsch

On 01.07.2015 20:08, John Erling Blad wrote:

> Wouldn't it be better to use iso8601 as internal format?


Yes, that was essentially our original proposal. ISO8601 is a syntax for 
proleptic Gregorian dates, so this would be the internal calendar model. 
ISO has no such detailed way to specify precision, so this was an add-on 
we conceived for Wikidata (as an optional annotation to the exact date).
The idea was that the type "date" just means "ISO date + precision" and
that Julian calendar support is provided by offering transparent 
conversion functions to the user (so you could always see and write 
Julian dates if you wanted to, without this changing the internal ISO 
form). The additions of "before" and "after" came later. The key idea to 
use ISO format internally while still supporting Julian dates with 
perfect round-tripping is implemented in Semantic MediaWiki, and the 
plan was to do this in Wikidata as well.


That was the theory. In practice, there was the confusion that Lydia 
described. Maybe the main problem was that bot authors were writing the 
internal data directly (for a while this was almost unfiltered). So when 
they would use the API like a user would use the UI ("set 'Julian' if 
you want to input your date in Julian"), then it would be wrong, since 
they would bypass the Julian conversion that the UI provided. This might 
have been the seed for the confusion that arose. This is only of 
historic interest now; we don't need to discuss where exactly the errors 
happened first. Better focus on fixing the dates now.
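
The round-tripping idea, and the failure mode with unconverted bot writes, can be sketched as follows. This is a minimal sketch and not the Wikibase or Semantic MediaWiki code: the class and method names are hypothetical, and the conversion uses the standard Julian Day Number integer arithmetic, which is an assumption of this example rather than something stated in the thread. It is only meant for years after roughly -4700, where all intermediate values stay positive.

```java
import java.util.Arrays;

// Sketch only: all names here are hypothetical, not Wikibase or WDTK API.
final class CalendarRoundTrip {

    // Julian Day Number (JDN) of a proleptic Gregorian date.
    static int gregorianToJdn(int year, int month, int day) {
        int a = (14 - month) / 12;
        int y = year + 4800 - a;
        int m = month + 12 * a - 3;
        return day + (153 * m + 2) / 5 + 365 * y + y / 4 - y / 100 + y / 400 - 32045;
    }

    // JDN of a proleptic Julian date (leap year every fourth year, no century rule).
    static int julianToJdn(int year, int month, int day) {
        int a = (14 - month) / 12;
        int y = year + 4800 - a;
        int m = month + 12 * a - 3;
        return day + (153 * m + 2) / 5 + 365 * y + y / 4 - 32083;
    }

    // Shared tail of both inverse conversions; for the Julian calendar
    // the number of completed centuries is passed as 0.
    private static int[] fromDayCount(int c, int centuries) {
        int d = (4 * c + 3) / 1461;
        int e = c - 1461 * d / 4;
        int m = (5 * e + 2) / 153;
        int day = e - (153 * m + 2) / 5 + 1;
        int month = m + 3 - 12 * (m / 10);
        int year = 100 * centuries + d - 4800 + m / 10;
        return new int[] {year, month, day};
    }

    static int[] jdnToGregorian(int jdn) {
        int a = jdn + 32044;
        int b = (4 * a + 3) / 146097;   // completed Gregorian centuries
        int c = a - 146097 * b / 4;     // days into the current century
        return fromDayCount(c, b);
    }

    static int[] jdnToJulian(int jdn) {
        return fromDayCount(jdn + 32082, 0);
    }

    static int[] julianToGregorian(int year, int month, int day) {
        return jdnToGregorian(julianToJdn(year, month, day));
    }

    static int[] gregorianToJulian(int year, int month, int day) {
        return jdnToJulian(gregorianToJdn(year, month, day));
    }

    public static void main(String[] args) {
        // UI path: the user types a Julian date; it is converted before storing
        // and converted back for display, so the internal form stays Gregorian.
        int[] stored = julianToGregorian(1582, 10, 5);
        int[] shown = gregorianToJulian(stored[0], stored[1], stored[2]);
        System.out.println("stored (Gregorian): " + Arrays.toString(stored));
        System.out.println("shown (Julian):     " + Arrays.toString(shown));

        // Bot path described above: the Julian digits are written unconverted
        // but flagged as Julian, so a reader that converts for display shows
        // a different day than the one the source meant.
        int[] misread = gregorianToJulian(1582, 10, 5);
        System.out.println("unconverted write displays as: " + Arrays.toString(misread));
    }
}
```

In this sketch the calendar flag never changes what is stored, only how it is shown, which is why writing Julian digits directly into the internal value shifts the date.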


Best regards,

Markus




On Wed, 1 Jul 2015 at 18:45, Markus Krötzsch
<mar...@semantic-mediawiki.org> wrote:

On 01.07.2015 18:14, Peter F. Patel-Schneider wrote:
 > On 07/01/2015 07:00 AM, Pierpaolo Bernardi wrote:
 >> On Wed, Jul 1, 2015 at 8:17 AM, Markus Krötzsch
 >> <mar...@semantic-mediawiki.org> wrote:
 >>> Dear Pierpaolo,
 >>>
 >>> This thread was only about Julian and Gregorian calendar dates. If and
 >>> how other calendar models should be supported in some future is
 >>> another (potentially big) discussion. As you said, there are many
 >>> issues there. Let's first make sure that we handle the "easy" 99.9% of
 >>> cases correctly before discussing any more complicated options.
 >>
 >> Lydia Pintscher in the starting email explained that there's a model for
 >> calendars, and unfortunately this model could be (and has been)
 >> interpreted in two ways (AFAIU).
 >>
 >> My intention was to point out that one of the two interpretations is not
 >> sound.  This leaves the other one as the only viable one.
 >>
 >> Cheers P.
 >
 > It appears (from the email only---there are no pointers to enduring
 > documentation on the solution that are attached to the relevant classes or
 > properties) that the chosen method is to store dates in both the source
 > calendar and the proleptic Gregorian calendar
 > (https://www.wikidata.org/wiki/Wikidata:Project_chat#calendar_model_screwup).
 > As you point out, this is not a viable solution for calendars whose days do
 > not start at the same time as days in the proleptic Gregorian calendar
 > (unless, of course, there is time and location information also available).

The Wikidata date implementation intentionally restricts to dates that
are compatible with the Gregorian calendar. Although the system refers
to Wikidata item ids of calendar models to denote "Proleptic Gregorian"
and "Proleptic Julian", the system does not allow users or bots to enter
arbitrary items as calendar model.

My understanding (and the implementation in WDTK) is that all dates are
provided in Gregorian calendar with a calendar model that specifies how
they should be displayed (if possible). The date in the source calendar
is for convenience and maybe for technical reasons on the side of the
PHP implementation. At no time should the source calendar date be
impossible to convert to Gregorian. We have had extensive discussions
about this point -- Gregorian must remain the main format at all times.
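
The model described in the two paragraphs above can be sketched as a small value class. This is a minimal sketch under stated assumptions, not the WDTK `TimeValue` interface: the class `StoredDate`, its fields, and the method `shouldDisplayAsJulian` are hypothetical, and the two entity URIs and the precision scale (9 = year, 10 = month, 11 = day) are assumptions that should be checked against the Wikibase data model documentation.

```java
// Sketch only: a hypothetical value class, not the WDTK TimeValue interface.
final class StoredDate {
    // Entity URIs assumed for the two supported calendar models.
    static final String CM_GREGORIAN_PRO = "http://www.wikidata.org/entity/Q1985727";
    static final String CM_JULIAN_PRO = "http://www.wikidata.org/entity/Q1985786";

    final long year;                      // always proleptic Gregorian
    final int month;
    final int day;
    final int precision;                  // assumed scale: 9 = year, 10 = month, 11 = day
    final String preferredCalendarModel;  // display hint only, never changes the stored date
    final int timezoneOffsetMinutes;      // described as currently unused in the thread

    StoredDate(long year, int month, int day, int precision,
               String preferredCalendarModel, int timezoneOffsetMinutes) {
        // Arbitrary items are not accepted as calendar model.
        if (!CM_GREGORIAN_PRO.equals(preferredCalendarModel)
                && !CM_JULIAN_PRO.equals(preferredCalendarModel)) {
            throw new IllegalArgumentException(
                    "Unsupported calendar model: " + preferredCalendarModel);
        }
        this.year = year;
        this.month = month;
        this.day = day;
        this.precision = precision;
        this.preferredCalendarModel = preferredCalendarModel;
        this.timezoneOffsetMinutes = timezoneOffsetMinutes;
    }

    // True if a user interface should convert the stored Gregorian date
    // to the Julian calendar before showing it.
    boolean shouldDisplayAsJulian() {
        return CM_JULIAN_PRO.equals(preferredCalendarModel);
    }

    public static void main(String[] args) {
        StoredDate d = new StoredDate(1582, 10, 15, 11, CM_JULIAN_PRO, 0);
        System.out.println("display as Julian: " + d.shouldDisplayAsJulian());
        try {
            new StoredDate(1582, 10, 15, 11, "http://www.wikidata.org/entity/Q42", 0);
        } catch (IllegalArgumentException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```

The point of the design is that the calendar model is metadata about presentation, so any consumer can always read the year, month and day as Gregorian without consulting it.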

This does not mean that we cannot have more models in the future. There
is (currently unused) timezone information, which can be used to store
offsets. Once fully implemented, this might allow exact conversion from
calendar models that have another start for their days. So maybe this is
not a case of real incompatibility. However, the timezone support for
current dates needs to be finished before discussing the next steps into
more exotic calendars.

Best regards,

Markus



Re: [Wikidata] calendar model screwup

2015-07-01 Thread John Erling Blad
That should be "default calendar model".
My screw up... ;/

On Wed, 1 Jul 2015 at 20:08, John Erling Blad wrote:

> Wouldn't it be better to use iso8601 as internal format?
>
> On Wed, 1 Jul 2015 at 18:45, Markus Krötzsch <
> mar...@semantic-mediawiki.org> wrote:
>
>> On 01.07.2015 18:14, Peter F. Patel-Schneider wrote:
>> > On 07/01/2015 07:00 AM, Pierpaolo Bernardi wrote:
>> >> On Wed, Jul 1, 2015 at 8:17 AM, Markus Krötzsch
>> >>  wrote:
>> >>> Dear Pierpaolo,
>> >>>
>> >>> This thread was only about Julian and Gregorian calendar dates. If and
>> >>> how other calendar models should be supported in some future is
>> >>> another (potentially big) discussion. As you said, there are many
>> >>> issues there. Let's first make sure that we handle the "easy" 99.9% of
>> >>> cases correctly before discussing any more complicated options.
>> >>
>> >> Lydia Pintscher in the starting email explained that there's a model for
>> >> calendars, and unfortunately this model could be (and has been)
>> >> interpreted in two ways (AFAIU).
>> >>
>> >> My intention was to point out that one of the two interpretations is not
>> >> sound.  This leaves the other one as the only viable one.
>> >>
>> >> Cheers P.
>> >
>> > It appears (from the email only---there are no pointers to enduring
>> > documentation on the solution that are attached to the relevant classes or
>> > properties) that the chosen method is to store dates in both the source
>> > calendar and the proleptic Gregorian calendar
>> > (https://www.wikidata.org/wiki/Wikidata:Project_chat#calendar_model_screwup).
>> > As you point out, this is not a viable solution for calendars whose days do
>> > not start at the same time as days in the proleptic Gregorian calendar
>> > (unless, of course, there is time and location information also available).
>>
>> The Wikidata date implementation intentionally restricts to dates that
>> are compatible with the Gregorian calendar. Although the system refers
>> to Wikidata item ids of calendar models to denote "Proleptic Gregorian"
>> and "Proleptic Julian", the system does not allow users or bots to enter
>> arbitrary items as calendar model.
>>
>> My understanding (and the implementation in WDTK) is that all dates are
>> provided in Gregorian calendar with a calendar model that specifies how
>> they should be displayed (if possible). The date in the source calendar
>> is for convenience and maybe for technical reasons on the side of the
>> PHP implementation. At no time should the source calendar date be
>> impossible to convert to Gregorian. We have had extensive discussions
>> about this point -- Gregorian must remain the main format at all times.
>>
>> This does not mean that we cannot have more models in the future. There
>> is (currently unused) timezone information, which can be used to store
>> offsets. Once fully implemented, this might allow exact conversion from
>> calendar models that have another start for their days. So maybe this is
>> not a case of real incompatibility. However, the timezone support for
>> current dates needs to be finished before discussing the next steps into
>> more exotic calendars.
>>
>> Best regards,
>>
>> Markus
>>
>>


Re: [Wikidata] Goal: Establish a framework to engage with data engineers and open data organizations

2015-07-01 Thread Info WorldUniversity
Hi Quim, Sylvia, Lydia and Wikidatans,

In terms of strategic partnerships with CC Wikidata, and this Wikidata
Engineering Community project looking for -

 "* people that have been in touch with organizations willing to contribute
their open data" -

in what ways could CC World University and School contribute our early CC
Nation State Universities, e.g. Mexico World University and School -
http://worlduniversity.wikia.com/wiki/Mexico - (with only the CC MIT OCW in
Spanish so far, and not yet in Spanish otherwise)? See also, for example,
the Nation States' wiki subject page at WUaS -
http://worlduniversity.wikia.com/wiki/Nation_States. Each wiki Nation-State
University at CC WUaS, seeks to accredit in each country in their main
languages to offer online CC baccalaureate, Ph.D., Law and M.D. degrees, and
also I.B. high school degrees, if possible, based on CC MIT OCW in 7
languages and CC Yale OYC, to begin.

In addition, CC World University would like to facilitate wiki / Wikidata
schools for open teaching and learning in each of all 7,929+ languages. As
planned interlingual wiki schools for highest quality universal education,
CC World University and School would like to explore developing in
Wikidata, and become a growth story also for Wikidata. World University
would thus like to explore contributing our open data in these regards.

In what ways might WUaS best contribute in terms of a strategic
partnership re this Wikidata Engineering Community project? Perhaps World
University could become a part of the "community framework allowing Wikidata
content and tech contributors, data engineers, and open data organizations
to collaborate effectively." Thank you.

Best,
Info (Scott)



On Wed, Jul 1, 2015 at 9:48 AM, Benjamin Good 
wrote:

> Quim,
>
> I'm not familiar with GLAM or what you are really asking for here.  Could
> you elaborate a little?  Our group is actively engaged in writing bots for
> populating wikidata with trusted biomedical information and for using that
> information to drive applications such as Wikipedia.  Processes for making
> this easier would be most welcome.  A lot of what we are doing and hoping
> to do is described on this bot page:
> https://www.wikidata.org/wiki/User:ProteinBoxBot
>
> ?
> -Ben
>
>
>
> On Wed, Jul 1, 2015 at 9:23 AM, David Cuenca Tudela 
> wrote:
>
>> Hello Quim,
>>
>> There was always the issue of where to publish datasets from partner
>> organisations like a http://datahub.io/
>>
>> Is that being considered in this new iteration?
>>
>> Cheers,
>> Micru
>>
>> On Wed, Jul 1, 2015 at 6:19 PM, Romaine Wiki 
>> wrote:
>>
>>> Hello Quim,
>>>
>>> We have in Belgium (as Wikimedia Belgium) a partner organisation who is
>>> together with us working with cultural institutions to get open datasets to
>>> be used in Wikidata.
>>>
>>> So yes, we are interested.
>>>
>>> Greetings,
>>> Romaine
>>>
>>> 2015-07-01 17:31 GMT+02:00 Quim Gil :
>>>
>>>> Hi, it's the first of July and I would like to introduce you to a quarterly
>>>> goal that the Engineering Community team has committed to:
>>>>
>>>> Establish a framework to engage with data engineers and open data
>>>> organizations
>>>> https://phabricator.wikimedia.org/T101950
>>>>
>>>> We are missing a community framework allowing Wikidata content and tech
>>>> contributors, data engineers, and open data organizations to collaborate
>>>> effectively. Imagine GLAM applied to data.
>>>>
>>>> If all goes well, by the end of September we would like to have basic
>>>> documentation and community processes for open data engineers and
>>>> organizations willing to contribute to Wikidata, and ongoing projects
>>>> with one open data org.
>>>>
>>>> If you are interested, get involved! We are looking for
>>>>
>>>> * Wikidata contributors with good institutional memory
>>>> * people that have been in touch with organizations willing to
>>>> contribute their open data
>>>> * developers willing to help improving our software and programming
>>>> missing pieces
>>>> * also contributors familiar with the GLAM model(s), what works and
>>>> what didn't work
>>>>
>>>> This goal has been created after some conversations with Lydia
>>>> Pintscher (Wikidata team) and Sylvia Ventura (Strategic Partnerships). Both
>>>> are on board, Lydia assuring that this work fits into what is technically
>>>> effective, and Sylvia checking our work against real open data
>>>> organizations willing to get involved.
>>>>
>>>> This email effectively starts the bootstrapping of this project. I will
>>>> start creating subtasks under that goal based on your feedback and common
>>>> sense.
>>>>
>>>> --
>>>> Quim Gil
>>>> Engineering Community Manager @ Wikimedia Foundation
>>>> http://www.mediawiki.org/wiki/User:Qgil


Re: [Wikidata] calendar model screwup

2015-07-01 Thread John Erling Blad
Wouldn't it be better to use iso8601 as internal format?

On Wed, 1 Jul 2015 at 18:45, Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> On 01.07.2015 18:14, Peter F. Patel-Schneider wrote:
> > On 07/01/2015 07:00 AM, Pierpaolo Bernardi wrote:
> >> On Wed, Jul 1, 2015 at 8:17 AM, Markus Krötzsch
> >>  wrote:
> >>> Dear Pierpaolo,
> >>>
> >>> This thread was only about Julian and Gregorian calendar dates. If and
> >>> how other calendar models should be supported in some future is
> >>> another (potentially big) discussion. As you said, there are many
> >>> issues there. Let's first make sure that we handle the "easy" 99.9% of
> >>> cases correctly before discussing any more complicated options.
> >>
> >> Lydia Pintscher in the starting email explained that there's a model for
> >> calendars, and unfortunately this model could be (and has been)
> >> interpreted in two ways (AFAIU).
> >>
> >> My intention was to point out that one of the two interpretations is not
> >> sound.  This leaves the other one as the only viable one.
> >>
> >> Cheers P.
> >
> > It appears (from the email only---there are no pointers to enduring
> > documentation on the solution that are attached to the relevant classes or
> > properties) that the chosen method is to store dates in both the source
> > calendar and the proleptic Gregorian calendar
> > (https://www.wikidata.org/wiki/Wikidata:Project_chat#calendar_model_screwup).
> > As you point out, this is not a viable solution for calendars whose days do
> > not start at the same time as days in the proleptic Gregorian calendar
> > (unless, of course, there is time and location information also available).
>
> The Wikidata date implementation intentionally restricts to dates that
> are compatible with the Gregorian calendar. Although the system refers
> to Wikidata item ids of calendar models to denote "Proleptic Gregorian"
> and "Proleptic Julian", the system does not allow users or bots to enter
> arbitrary items as calendar model.
>
> My understanding (and the implementation in WDTK) is that all dates are
> provided in Gregorian calendar with a calendar model that specifies how
> they should be displayed (if possible). The date in the source calendar
> is for convenience and maybe for technical reasons on the side of the
> PHP implementation. At no time should the source calendar date be
> impossible to convert to Gregorian. We have had extensive discussions
> about this point -- Gregorian must remain the main format at all times.
>
> This does not mean that we cannot have more models in the future. There
> is (currently unused) timezone information, which can be used to store
> offsets. Once fully implemented, this might allow exact conversion from
> calendar models that have another start for their days. So maybe this is
> not a case of real incompatibility. However, the timezone support for
> current dates needs to be finished before discussing the next steps into
> more exotic calendars.
>
> Best regards,
>
> Markus
>
>


Re: [Wikidata] calendar model screwup

2015-07-01 Thread Denny Vrandečić
Peter,

you might be looking for this:

https://www.mediawiki.org/wiki/Wikibase/DataModel#Dates_and_times

Cheers,
Denny

On Wed, Jul 1, 2015 at 9:48 AM Peter F. Patel-Schneider <
pfpschnei...@gmail.com> wrote:

> Thanks.
>
> This helps in finding out how to reproduce the numbers.
>
> However, I'm still confused as to how these bits of data are part of the
> Wikidata data/knowledge model.  Where is the description of
> getPreferredCalendarModel, for example?
>
>
> http://javadox.com/org.wikidata.wdtk/wdtk-datamodel/0.1.0/org/wikidata/wdtk/datamodel/interfaces/TimeValue.html
>  Is a *partial* description of what is going on.  Changes to this document
> would be somewhat useful.  However, what I'm really looking for is a
> description of how time works in Wikidata.
>
> peter
>
> PS:  I note that there are lots of aspects of TimeValue that are only
> suitable for the Gregorian and Julian calendars.
>
>
>
> On 07/01/2015 09:24 AM, Markus Krötzsch wrote:
> > On 01.07.2015 18:03, Peter F. Patel-Schneider wrote: ...
> >>
> >> Even the very nice email from Markus that gives numbers does not
> >> provide any information on where the numbers come from.
> >
> > I just ran a simple Java program based on Wikidata Toolkit to count the
> > date values. The features I used for counting are all part of the data
> > (concretely I accessed: year number, precision, and calendar model). I
> > used the JSON dump of 22 June 2015. The program counted all dates that
> > occur in any place (main values of statements, qualifiers, and
> > references). No other special processing was done.
> >
> > Below is the main code snippet that did the counting, in case my
> > description was too vague. If you want to get your own numbers, it does
> > not require much (I just modified one of the example programs in Wikidata
> > Toolkit that gathers general statistics). Running the code took about
> > 25min on my laptop (the initial dump download took longer though). The
> > SPARQL endpoint at https://wdqs-beta.wmflabs.org/ should also return
> > useful counts if it does not time out on the very large numbers. It uses
> > live data.
> >
> > Best regards,
> >
> > Markus
> >
> >
> > // after determining that snak is of appropriate type:
> > String cm = ((TimeValue) ((ValueSnak) snak).getValue())
> >         .getPreferredCalendarModel();
> > if (TimeValue.CM_GREGORIAN_PRO.equals(cm)) {
> >     this.countGregDates++;
> > } else if (TimeValue.CM_JULIAN_PRO.equals(cm)) {
> >     this.countJulDates++;
> > } else {
> >     System.err.println("Weird calendar model: "
> >             + ((ValueSnak) snak).getValue());
> > }
> >
> > if (((TimeValue) ((ValueSnak) snak).getValue()).getPrecision()
> >         <= TimeValue.PREC_MONTH) {
> >     return;
> > }
> >
> > long year = ((TimeValue) ((ValueSnak) snak).getValue()).getYear();
> > if (year >= 1923) {
> >     this.countModernDates++;
> > } else if (year >= 1753) {
> >     this.countAlmostModernDates++;
> > } else if (year >= 1582) {
> >     this.countTransitionDates++;
> > } else {
> >     this.countOldenDates++;
> > }
> >
> >


Re: [Wikidata] Goal: Establish a framework to engage with data engineers and open data organizations

2015-07-01 Thread Benjamin Good
Quim,

I'm not familiar with GLAM or what you are really asking for here.  Could
you elaborate a little?  Our group is actively engaged in writing bots for
populating wikidata with trusted biomedical information and for using that
information to drive applications such as Wikipedia.  Processes for making
this easier would be most welcome.  A lot of what we are doing and hoping
to do is described on this bot page:
https://www.wikidata.org/wiki/User:ProteinBoxBot

?
-Ben



On Wed, Jul 1, 2015 at 9:23 AM, David Cuenca Tudela 
wrote:

> Hello Quim,
>
> There was always the issue of where to publish datasets from partner
> organisations like a http://datahub.io/
>
> Is that being considered in this new iteration?
>
> Cheers,
> Micru
>
> On Wed, Jul 1, 2015 at 6:19 PM, Romaine Wiki 
> wrote:
>
>> Hello Quim,
>>
>> We have in Belgium (as Wikimedia Belgium) a partner organisation who is
>> together with us working with cultural institutions to get open datasets to
>> be used in Wikidata.
>>
>> So yes, we are interested.
>>
>> Greetings,
>> Romaine
>>
>> 2015-07-01 17:31 GMT+02:00 Quim Gil :
>>
>>> Hi, it's the first of July and I would like to introduce you to a quarterly
>>> goal that the Engineering Community team has committed to:
>>>
>>> Establish a framework to engage with data engineers and open data
>>> organizations
>>> https://phabricator.wikimedia.org/T101950
>>>
>>> We are missing a community framework allowing Wikidata content and tech
>>> contributors, data engineers, and open data organizations to collaborate
>>> effectively. Imagine GLAM applied to data.
>>>
>>> If all goes well, by the end of September we would like to have basic
>>> documentation and community processes for open data engineers and
>>> organizations willing to contribute to Wikidata, and ongoing projects
>>> with one open data org.
>>>
>>> If you are interested, get involved! We are looking for
>>>
>>> * Wikidata contributors with good institutional memory
>>> * people that have been in touch with organizations willing to contribute
>>> their open data
>>> * developers willing to help improving our software and programming
>>> missing pieces
>>> * also contributors familiar with the GLAM model(s), what works and what
>>> didn't work
>>>
>>> This goal has been created after some conversations with Lydia Pintscher
>>> (Wikidata team) and Sylvia Ventura (Strategic Partnerships). Both are on
>>> board, Lydia assuring that this work fits into what is technically
>>> effective, and Sylvia checking our work against real open data
>>> organizations willing to get involved.
>>>
>>> This email effectively starts the bootstrapping of this project. I will
>>> start creating subtasks under that goal based on your feedback and common
>>> sense.
>>>
>>> --
>>> Quim Gil
>>> Engineering Community Manager @ Wikimedia Foundation
>>> http://www.mediawiki.org/wiki/User:Qgil
>>>
>
>
> --
> Etiamsi omnes, ego non


Re: [Wikidata] calendar model screwup

2015-07-01 Thread Peter F. Patel-Schneider
Thanks.

This helps in finding out how to reproduce the numbers.

However, I'm still confused as to how these bits of data are part of the
Wikidata data/knowledge model.  Where is the description of
getPreferredCalendarModel, for example?

http://javadox.com/org.wikidata.wdtk/wdtk-datamodel/0.1.0/org/wikidata/wdtk/datamodel/interfaces/TimeValue.html
is a *partial* description of what is going on.  Changes to this document
would be somewhat useful.  However, what I'm really looking for is a
description of how time works in Wikidata.

peter

PS:  I note that there are lots of aspects of TimeValue that are only
suitable for the Gregorian and Julian calendars.



On 07/01/2015 09:24 AM, Markus Krötzsch wrote:
> On 01.07.2015 18:03, Peter F. Patel-Schneider wrote: ...
>> 
>> Even the very nice email from Markus that gives numbers does not
>> provide any information on where the numbers come from.
> 
> I just ran a simple Java program based on Wikidata Toolkit to count the
> date values. The features I used for counting are all part of the data
> (concretely I accessed: year number, precision, and calendar model). I
> used the JSON dump of 22 June 2015. The program counted all dates that
> occur in any place (main values of statements, qualifiers, and
> references). No other special processing was done.
> 
> Below is the main code snippet that did the counting, in case my
> description was too vague. If you want to get your own numbers, it does
> not require much (I just modified one of the example programs in Wikidata
> Toolkit that gathers general statistics). Running the code took about
> 25min on my laptop (the initial dump download took longer though). The
> SPARQL endpoint at https://wdqs-beta.wmflabs.org/ should also return
> useful counts if it does not time out on the very large numbers. It uses
> live data.
> 
> Best regards,
> 
> Markus
> 
> 
> // after determining that snak is of appropriate type:
> String cm = ((TimeValue) ((ValueSnak) snak).getValue())
>         .getPreferredCalendarModel();
> if (TimeValue.CM_GREGORIAN_PRO.equals(cm)) {
>     this.countGregDates++;
> } else if (TimeValue.CM_JULIAN_PRO.equals(cm)) {
>     this.countJulDates++;
> } else {
>     System.err.println("Weird calendar model: "
>             + ((ValueSnak) snak).getValue());
> }
>
> if (((TimeValue) ((ValueSnak) snak).getValue()).getPrecision()
>         <= TimeValue.PREC_MONTH) {
>     return;
> }
>
> long year = ((TimeValue) ((ValueSnak) snak).getValue()).getYear();
> if (year >= 1923) {
>     this.countModernDates++;
> } else if (year >= 1753) {
>     this.countAlmostModernDates++;
> } else if (year >= 1582) {
>     this.countTransitionDates++;
> } else {
>     this.countOldenDates++;
> }


Re: [Wikidata] calendar model screwup

2015-07-01 Thread Markus Krötzsch

On 01.07.2015 18:14, Peter F. Patel-Schneider wrote:

> On 07/01/2015 07:00 AM, Pierpaolo Bernardi wrote:
>> On Wed, Jul 1, 2015 at 8:17 AM, Markus Krötzsch wrote:
>>> Dear Pierpaolo,
>>>
>>> This thread was only about Julian and Gregorian calendar dates. If and
>>> how other calendar models should be supported in some future is
>>> another (potentially big) discussion. As you said, there are many
>>> issues there. Let's first make sure that we handle the "easy" 99.9% of
>>> cases correctly before discussing any more complicated options.
>>
>> Lydia Pintscher in the starting email explained that there's a model for
>> calendars, and unfortunately this model could be (and has been)
>> interpreted in two ways (AFAIU).
>>
>> My intention was to point out that one of the two interpretations is not
>> sound.  This leaves the other one as the only viable one.
>>
>> Cheers P.
>
> It appears (from the email only---there are no pointers to enduring
> documentation on the solution that are attached to the relevant classes or
> properties) that the chosen method is to store dates in both the source
> calendar and the proleptic Gregorian calendar
> (https://www.wikidata.org/wiki/Wikidata:Project_chat#calendar_model_screwup).
> As you point out, this is not a viable solution for calendars whose days do
> not start at the same time as days in the proleptic Gregorian calendar
> (unless, of course, there is time and location information also available).


The Wikidata date implementation intentionally restricts itself to dates
that are compatible with the Gregorian calendar. Although the system refers
to Wikidata item ids of calendar models to denote "Proleptic Gregorian"
and "Proleptic Julian", the system does not allow users or bots to enter
arbitrary items as the calendar model.


My understanding (and the implementation in WDTK) is that all dates are
provided in the Gregorian calendar with a calendar model that specifies how
they should be displayed (if possible). The date in the source calendar
is for convenience and maybe for technical reasons on the side of the
PHP implementation. At no time should the source calendar date be
impossible to convert to Gregorian. We have had extensive discussions
about this point -- Gregorian must remain the main format at all times.
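
To make the "Gregorian as main format, calendar model as display hint" idea concrete, here is a minimal sketch in plain Java. It is not the WDTK or Wikibase implementation: the class and method names are made up for illustration, and the conversion uses the standard Julian Day Number integer formulas, which are only intended for years well after 4800 BCE.

```java
import java.util.Locale;

/** Illustrative only: a stored proleptic Gregorian date rendered in the Julian calendar. */
final class CalendarDisplaySketch {

    /** Julian Day Number of a proleptic Gregorian date (month is 1..12). */
    static long gregorianToJdn(long year, int month, int day) {
        long a = (14 - month) / 12;          // 1 for January and February, else 0
        long y = year + 4800 - a;
        long m = month + 12 * a - 3;         // March-based month index, 0..11
        return day + (153 * m + 2) / 5 + 365 * y + y / 4 - y / 100 + y / 400 - 32045;
    }

    /** Year, month and day in the proleptic Julian calendar for a Julian Day Number. */
    static long[] jdnToJulian(long jdn) {
        long f = jdn + 1401;
        long e = 4 * f + 3;
        long g = (e % 1461) / 4;
        long h = 5 * g + 2;
        long day = (h % 153) / 5 + 1;
        long month = (h / 153 + 2) % 12 + 1;
        long year = e / 1461 - 4716 + (14 - month) / 12;
        return new long[] { year, month, day };
    }

    /** Renders a stored Gregorian date the way a "Julian" display hint would ask for. */
    static String displayAsJulian(long year, int month, int day) {
        long[] j = jdnToJulian(gregorianToJdn(year, month, day));
        return String.format(Locale.ROOT, "%d-%02d-%02d (Julian)", j[0], j[1], j[2]);
    }

    public static void main(String[] args) {
        // 15 October 1582 (Gregorian) is the day after 4 October 1582 (Julian).
        System.out.println(displayAsJulian(1582, 10, 15)); // prints "1582-10-05 (Julian)"
    }
}
```

The point of the sketch is that only one value needs to be stored; the Julian form is derived on demand, so it can never disagree with the Gregorian one.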


This does not mean that we cannot have more models in the future. There 
is (currently unused) timezone information, which can be used to store 
offsets. Once fully implemented, this might allow exact conversion from 
calendar models that have another start for their days. So maybe this is 
not a case of real incompatibility. However, the timezone support for 
current dates needs to be finished before discussing the next steps into 
more exotic calendars.


Best regards,

Markus




Re: [Wikidata] calendar model screwup

2015-07-01 Thread Peter F. Patel-Schneider
On 07/01/2015 07:00 AM, Pierpaolo Bernardi wrote:
> On Wed, Jul 1, 2015 at 8:17 AM, Markus Krötzsch wrote:
>> Dear Pierpaolo,
>> 
>> This thread was only about Julian and Gregorian calendar dates. If and
>> how other calendar models should be supported in some future is
>> another (potentially big) discussion. As you said, there are many
>> issues there. Let's first make sure that we handle the "easy" 99.9% of
>> cases correctly before discussing any more complicated options.
> 
> Lydia Pintscher in the starting email explained that there's a model for
> calendars, and unfortunately this model could be (and has been) 
> interpreted in two ways (AFAIU).
> 
> My intention was to point out that one of the two interpretations is not
> sound.  This leaves the other one as the only viable one.
> 
> Cheers P.

It appears (from the email only---there are no pointers to enduring
documentation on the solution that are attached to the relevant classes or
properties) that the chosen method is to store dates in both the source
calendar and the proleptic Gregorian calendar
(https://www.wikidata.org/wiki/Wikidata:Project_chat#calendar_model_screwup).
As you point out, this is not a viable solution for calendars whose days do
not start at the same time as days in the proleptic Gregorian calendar
(unless, of course, there is time and location information also available).
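
A small sketch of why the day boundary matters, in plain Java with java.time. Everything here is an assumption made for illustration: the source calendar's day is taken to start at a fixed 18:00 local time as a stand-in for sunset (which really depends on location and season), and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

/** Illustrative only: a calendar whose days do not start at midnight. */
final class DayBoundarySketch {

    // Assumption: the source calendar's day starts at 18:00 local time.
    static final LocalTime ASSUMED_DAY_START = LocalTime.of(18, 0);

    /**
     * The Gregorian dates overlapped by one source-calendar day, identified
     * by the Gregorian date on which that day begins.
     */
    static LocalDate[] gregorianDaysOverlapped(LocalDate gregorianDateOfDayStart) {
        LocalDateTime start = gregorianDateOfDayStart.atTime(ASSUMED_DAY_START);
        LocalDateTime lastMoment = start.plusDays(1).minusNanos(1);
        return new LocalDate[] { start.toLocalDate(), lastMoment.toLocalDate() };
    }

    /** With a local time of day, the event can be pinned to a single Gregorian date. */
    static LocalDate resolve(LocalDate gregorianDateOfDayStart, LocalTime localTimeOfEvent) {
        // Times at or after the assumed day start fall on the first Gregorian
        // date; earlier times belong to the following Gregorian date.
        return localTimeOfEvent.isBefore(ASSUMED_DAY_START)
                ? gregorianDateOfDayStart.plusDays(1)
                : gregorianDateOfDayStart;
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2015, 7, 1);
        LocalDate[] span = gregorianDaysOverlapped(d);
        System.out.println(span[0] + " .. " + span[1]);
        System.out.println(resolve(d, LocalTime.of(9, 0)));
    }
}
```

Without the time of day, one source-calendar day maps to two candidate Gregorian dates; with it (and a known day start), the mapping becomes unique.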

peter



Re: [Wikidata] calendar model screwup

2015-07-01 Thread Markus Krötzsch

On 01.07.2015 18:03, Peter F. Patel-Schneider wrote:
> ...
>
> Even the very nice email from Markus that gives numbers does not provide any
> information on where the numbers come from.


I just ran a simple Java program based on Wikidata Toolkit to count the 
date values. The features I used for counting are all part of the data 
(concretely I accessed: year number, precision, and calendar model). I 
used the JSON dump of 22 June 2015. The program counted all dates that 
occur in any place (main values of statements, qualifiers, and 
references). No other special processing was done.


Below is the main code snippet that did the counting, in case my
description was too vague. If you want to get your own numbers, it does
not require much (I just modified one of the example programs in
Wikidata Toolkit that gathers general statistics). Running the code took
about 25min on my laptop (the initial dump download took longer though).
The SPARQL endpoint at https://wdqs-beta.wmflabs.org/ should also return
useful counts if it does not time out on the very large numbers. It uses
live data.


Best regards,

Markus


// after determining that snak is of appropriate type:
TimeValue value = (TimeValue) ((ValueSnak) snak).getValue();

// Count by calendar model.
String cm = value.getPreferredCalendarModel();
if (TimeValue.CM_GREGORIAN_PRO.equals(cm)) {
    this.countGregDates++;
} else if (TimeValue.CM_JULIAN_PRO.equals(cm)) {
    this.countJulDates++;
} else {
    System.err.println("Weird calendar model: " + value);
}

// Skip dates that are only precise to the month or coarser.
if (value.getPrecision() <= TimeValue.PREC_MONTH) {
    return;
}

// Bucket the remaining dates by year.
long year = value.getYear();
if (year >= 1923) {
    this.countModernDates++;
} else if (year >= 1753) {
    this.countAlmostModernDates++;
} else if (year >= 1582) {
    this.countTransitionDates++;
} else {
    this.countOldenDates++;
}
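
For readers who want to try the year bucketing without setting up Wikidata Toolkit, the same thresholds can be sketched in plain Java. The class, enum, and method names below are hypothetical and not part of WDTK. The cut-offs are the ones from the snippet; reading them as Gregorian adoption milestones (introduction in 1582, adoption in Britain in 1752, in Greece in 1923) is my assumption, not something stated in the thread.

```java
import java.util.EnumMap;
import java.util.Map;

/** Illustrative only: the year buckets from the counting snippet, without WDTK. */
final class YearBucketSketch {

    enum Bucket { MODERN, ALMOST_MODERN, TRANSITION, OLDEN }

    /** Same thresholds as the snippet: 1923, 1753 and 1582. */
    static Bucket classify(long year) {
        if (year >= 1923) {
            return Bucket.MODERN;
        } else if (year >= 1753) {
            return Bucket.ALMOST_MODERN;
        } else if (year >= 1582) {
            return Bucket.TRANSITION;
        }
        return Bucket.OLDEN;
    }

    /** Counts how many of the given years fall into each bucket. */
    static Map<Bucket, Integer> count(long[] years) {
        Map<Bucket, Integer> counts = new EnumMap<>(Bucket.class);
        for (Bucket b : Bucket.values()) {
            counts.put(b, 0);
        }
        for (long year : years) {
            counts.merge(classify(year), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Arbitrary sample years chosen to hit every bucket.
        long[] sample = { 2015, 1923, 1800, 1600, 1066 };
        System.out.println(count(sample));
    }
}
```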




Re: [Wikidata] Goal: Establish a framework to engage with data engineers and open data organizations

2015-07-01 Thread David Cuenca Tudela
Hello Quim,

There was always the issue of where to publish datasets from partner
organisations like a http://datahub.io/

Is that being considered in this new iteration?

Cheers,
Micru

On Wed, Jul 1, 2015 at 6:19 PM, Romaine Wiki wrote:

> Hello Quim,
>
> We have in Belgium (as Wikimedia Belgium) a partner organisation who is
> together with us working with cultural institutions to get open datasets to
> be used in Wikidata.
>
> So yes, we are interested.
>
> Greetings,
> Romaine
>
> 2015-07-01 17:31 GMT+02:00 Quim Gil:
>
>> Hi, it's the first of July and I would like to introduce you to a
>> quarterly goal that the Engineering Community team has committed to:
>>
>> Establish a framework to engage with data engineers and open data
>> organizations
>> https://phabricator.wikimedia.org/T101950
>>
>> We are missing a community framework allowing Wikidata content and tech
>> contributors, data engineers, and open data organizations to collaborate
>> effectively. Imagine GLAM applied to data.
>>
>> If all goes well, by the end of September we would like to have basic
>> documentation and community processes for open data engineers and
>> organizations willing to contribute to Wikidata, and ongoing projects
>> with one open data org.
>>
>> If you are interested, get involved! We are looking for
>>
>> * Wikidata contributors with good institutional memory
>> * people who have been in touch with organizations willing to contribute
>> their open data
>> * developers willing to help improve our software and program the
>> missing pieces
>> * also contributors familiar with the GLAM model(s), what worked and
>> what didn't
>>
>> This goal has been created after some conversations with Lydia Pintscher
>> (Wikidata team) and Sylvia Ventura (Strategic Partnerships). Both are on
>> board, Lydia ensuring that this work fits into what is technically
>> effective, and Sylvia checking our work against real open data
>> organizations willing to get involved.
>>
>> This email effectively starts the bootstrapping of this project. I will
>> start creating subtasks under that goal based on your feedback and common
>> sense.
>>
>> --
>> Quim Gil
>> Engineering Community Manager @ Wikimedia Foundation
>> http://www.mediawiki.org/wiki/User:Qgil


-- 
Etiamsi omnes, ego non


Re: [Wikidata] Goal: Establish a framework to engage with data engineers and open data organizations

2015-07-01 Thread Romaine Wiki
Hello Quim,

We have in Belgium (as Wikimedia Belgium) a partner organisation who is
together with us working with cultural institutions to get open datasets to
be used in Wikidata.

So yes, we are interested.

Greetings,
Romaine

2015-07-01 17:31 GMT+02:00 Quim Gil:

> Hi, it's the first of July and I would like to introduce you to a
> quarterly goal that the Engineering Community team has committed to:
>
> Establish a framework to engage with data engineers and open data
> organizations
> https://phabricator.wikimedia.org/T101950
>
> We are missing a community framework allowing Wikidata content and tech
> contributors, data engineers, and open data organizations to collaborate
> effectively. Imagine GLAM applied to data.
>
> If all goes well, by the end of September we would like to have basic
> documentation and community processes for open data engineers and
> organizations willing to contribute to Wikidata, and ongoing projects
> with one open data org.
>
> If you are interested, get involved! We are looking for
>
> * Wikidata contributors with good institutional memory
> * people who have been in touch with organizations willing to contribute
> their open data
> * developers willing to help improve our software and program the
> missing pieces
> * also contributors familiar with the GLAM model(s), what worked and
> what didn't
>
> This goal has been created after some conversations with Lydia Pintscher
> (Wikidata team) and Sylvia Ventura (Strategic Partnerships). Both are on
> board, Lydia ensuring that this work fits into what is technically
> effective, and Sylvia checking our work against real open data
> organizations willing to get involved.
>
> This email effectively starts the bootstrapping of this project. I will
> start creating subtasks under that goal based on your feedback and common
> sense.
>
> --
> Quim Gil
> Engineering Community Manager @ Wikimedia Foundation
> http://www.mediawiki.org/wiki/User:Qgil


Re: [Wikidata] calendar model screwup

2015-07-01 Thread John Erling Blad
Open a new thread for discussion of calendar models in general.

On Wed, Jul 1, 2015 at 4:49 PM, Markus Krötzsch wrote:
> On 01.07.2015 16:00, Pierpaolo Bernardi wrote:
>>
>> On Wed, Jul 1, 2015 at 8:17 AM, Markus Krötzsch wrote:
>>>
>>> Dear Pierpaolo,
>>>
>>> This thread was only about Julian and Gregorian calendar dates. If and
>>> how other calendar models should be supported in some future is another
>>> (potentially big) discussion. As you said, there are many issues there.
>>> Let's first make sure that we handle the "easy" 99.9% of cases correctly
>>> before discussing any more complicated options.
>>
>>
>> Lydia Pintscher in the starting email explained that there's a model
>> for calendars, and unfortunately this model could be (and has been)
>> interpreted in two ways (AFAIU).
>>
>> My intention was to point out that one of the two interpretations is
>> not sound.  This leaves the other one as the only viable one.
>
>
> To clarify: the problem that Lydia discussed has occurred on another (more
> technical) level. It is not about the question whether there are further
> calendar models that are incompatible with Julian and Gregorian, but about the
> two calendar models that are captured by what Wikidata calls the "date"
> type. This type does not support dates that cannot be converted into one
> another. This is the usual trade-off you have when building a data-based
> system: you have to restrict the possible formats to ensure that the
> resulting data is still usable. For example, we could capture many more
> complex things and nuances of reality in free text, but then we would not
> have Wikidata but Wikipedia ;-)
>
> What is colloquially called a calendar date can be anything from a clearly
> defined time point to a rough suggestion of a relative time frame. Wikidata
> already makes a lot of commitments towards a less strict notion of "date",
> many of which are not yet fully supported or correctly used (timezones,
> "before" and "after" -- even the meaning of "precision" is anything but
> clear).
> Many of these features have been implemented as a response to user queries
> for making date entry even more general, to cover even more corner cases.
> For data consumers, this makes the data much harder to use. It creates a
> cost for everyone. So far, there is only the cost, and not the benefit (or
> is anybody using "before" and "after"? Yet I have to deal with it when
> reading data!). Let's first make use of what we have (this includes proper
> UI support for timezone annotation and precision windows), before discussing
> even more complex notions of calendar and time.
>
> But don't worry: there will surely be more calendar models that can be
> supported properly, in a specified and clear way. However, it is definitely
> not planned that all possible calendar models will at some time be
> implemented. A basic design goal of the "date" type in Wikidata is that
> dates remain compatible on the day level. Calendars that are too far away
> from this should use their own properties (maybe of type string, maybe of another
> special date type). One can then give approximate Gregorian/Julian dates in
> addition by using the standard date properties of Wikidata (these
> approximate dates would then not capture the exact moment, but the best
> possible approximation). In this way, one can get the best of both worlds:
> exact date information in native calendar models and maximal compatibility
> with major time-based applications (such as Histropedia) and query services
> (all time-related query functions in SPARQL databases are based on Gregorian
> dates).
>
> Regards,
>
> Markus
>
>
>>
>> Cheers
>> P.


Re: [Wikidata] calendar model screwup

2015-07-01 Thread Peter F. Patel-Schneider
I would find this discussion easier to follow if the Wikidata identifiers
for the various classes and properties were mentioned, and there were
pointers to relevant documentation.

The only Wikidata class or property that I could easily find is Q205892.
Its discussion page, https://www.wikidata.org/wiki/Talk:Q205892, mentions a
bit about conversion, but nothing about this issue.

The page segment that is supposed to be used for discussion,
https://www.wikidata.org/wiki/Wikidata:Project_chat#calendar_model_screwup,
does not have any pointers to any classes, properties, or documentation.

Even the very nice email from Markus that gives numbers does not provide any
information on where the numbers come from.

Please



[Wikidata] Goal: Establish a framework to engage with data engineers and open data organizations

2015-07-01 Thread Quim Gil
Hi, it's the first of July and I would like to introduce you to a
quarterly goal that the Engineering Community team has committed to:

Establish a framework to engage with data engineers and open data
organizations
https://phabricator.wikimedia.org/T101950

We are missing a community framework allowing Wikidata content and tech
contributors, data engineers, and open data organizations to collaborate
effectively. Imagine GLAM applied to data.

If all goes well, by the end of September we would like to have basic
documentation and community processes for open data engineers and
organizations willing to contribute to Wikidata, and ongoing projects with
one open data org.

If you are interested, get involved! We are looking for

* Wikidata contributors with good institutional memory
* people who have been in touch with organizations willing to contribute
their open data
* developers willing to help improve our software and program the missing
pieces
* also contributors familiar with the GLAM model(s), what worked and what
didn't

This goal has been created after some conversations with Lydia Pintscher
(Wikidata team) and Sylvia Ventura (Strategic Partnerships). Both are on
board, Lydia ensuring that this work fits into what is technically
effective, and Sylvia checking our work against real open data
organizations willing to get involved.

This email effectively starts the bootstrapping of this project. I will
start creating subtasks under that goal based on your feedback and common
sense.

-- 
Quim Gil
Engineering Community Manager @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil


Re: [Wikidata] [libraries] How can Wikidata get over 30, 000, 000 facts added a year ?

2015-07-01 Thread Luca Martinelli
On 30 Jun 2015 19:54, "Andrea Zanni" wrote:
>
> Hello everyone.
>
> For Italian libraries, you can find a good csv here:
> http://opendata.anagrafe.iccu.sbn.it/territorio.zip
>
> Cristian Consonni also had a project of importing all of them on 
> OpenStreetMap:
> https://wiki.openstreetmap.org/wiki/User:RCantoroBot/Anagrafe_delle_biblioteche_italiane/Import_plan
> (I think the project is frozen though).

I'm also taking care of it (when I'm not under pressure with other
things here at work), through my other account:
https://www.wikidata.org/wiki/User:Sannita_%28ICCU%29

There's a universe of things to do together on Wikidata, and Andrea
barely scratched the surface of what we're trying to do. :)

L.



Re: [Wikidata] calendar model screwup

2015-07-01 Thread Markus Krötzsch

On 01.07.2015 16:00, Pierpaolo Bernardi wrote:

> On Wed, Jul 1, 2015 at 8:17 AM, Markus Krötzsch wrote:
>> Dear Pierpaolo,
>>
>> This thread was only about Julian and Gregorian calendar dates. If and how
>> other calendar models should be supported in some future is another
>> (potentially big) discussion. As you said, there are many issues there.
>> Let's first make sure that we handle the "easy" 99.9% of cases correctly
>> before discussing any more complicated options.
>
> Lydia Pintscher in the starting email explained that there's a model
> for calendars, and unfortunately this model could be (and has been)
> interpreted in two ways (AFAIU).
>
> My intention was to point out that one of the two interpretations is
> not sound.  This leaves the other one as the only viable one.


To clarify: the problem that Lydia discussed has occurred on another
(more technical) level. It is not about the question whether there are
further calendar models that are incompatible with Julian and Gregorian,
but about the two calendar models that are captured by what Wikidata
calls the "date" type. This type does not support dates that cannot be
converted into one another. This is the usual trade-off you have when
building a data-based system: you have to restrict the possible formats
to ensure that the resulting data is still usable. For example, we could
capture many more complex things and nuances of reality in free text,
but then we would not have Wikidata but Wikipedia ;-)


What is colloquially called a calendar date can be anything from a
clearly defined time point to a rough suggestion of a relative time
frame. Wikidata already makes a lot of commitments towards a less strict
notion of "date", many of which are not yet fully supported or correctly
used (timezones, "before" and "after" -- even the meaning of
"precision" is anything but clear). Many of these features have been
implemented as a response to user queries for making date entry even
more general, to cover even more corner cases. For data consumers, this
makes the data much harder to use. It creates a cost for everyone. So
far, there is only the cost, and not the benefit (or is anybody using
"before" and "after"? Yet I have to deal with it when reading data!).
Let's first make use of what we have (this includes proper UI support
for timezone annotation and precision windows), before discussing even
more complex notions of calendar and time.


But don't worry: there will surely be more calendar models that can be
supported properly, in a specified and clear way. However, it is
definitely not planned that all possible calendar models will at some
time be implemented. A basic design goal of the "date" type in Wikidata
is that dates remain compatible on the day level. Calendars that are too
far away from this should use their own properties (maybe of type string,
maybe of another special date type). One can then give approximate
Gregorian/Julian dates in addition by using the standard date properties
of Wikidata (these approximate dates would then not capture the exact
moment, but the best possible approximation). In this way, one can get
the best of both worlds: exact date information in native calendar
models and maximal compatibility with major time-based applications
(such as Histropedia) and query services (all time-related query
functions in SPARQL databases are based on Gregorian dates).
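
The "best of both worlds" arrangement can be sketched as a small data structure in plain Java. This is only an illustration under stated assumptions: the class and field names are hypothetical, the native date is kept as an opaque string (one of the options mentioned above), and the sample values are placeholders rather than real Wikidata content.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: exact native date plus an approximate Gregorian date. */
final class DualDateSketch {

    static final class DatedEvent {
        final String label;
        final String nativeCalendar;     // name of the native calendar model
        final String nativeDate;         // exact date as written in that calendar
        final LocalDate approxGregorian; // best approximation, used for queries

        DatedEvent(String label, String nativeCalendar, String nativeDate,
                LocalDate approxGregorian) {
            this.label = label;
            this.nativeCalendar = nativeCalendar;
            this.nativeDate = nativeDate;
            this.approxGregorian = approxGregorian;
        }
    }

    /** Time-based tools only need the Gregorian approximation to order events. */
    static List<DatedEvent> sortedByGregorian(List<DatedEvent> events) {
        List<DatedEvent> copy = new ArrayList<>(events);
        copy.sort(Comparator.comparing((DatedEvent e) -> e.approxGregorian));
        return copy;
    }

    public static void main(String[] args) {
        List<DatedEvent> events = new ArrayList<>();
        // Placeholder values, not real data.
        events.add(new DatedEvent("event B", "some-calendar", "native-date-2",
                LocalDate.of(1700, 5, 2)));
        events.add(new DatedEvent("event A", "some-calendar", "native-date-1",
                LocalDate.of(1650, 1, 15)));
        for (DatedEvent e : sortedByGregorian(events)) {
            System.out.println(e.approxGregorian + "  " + e.label
                    + "  [" + e.nativeCalendar + ": " + e.nativeDate + "]");
        }
    }
}
```

The native string is never parsed or converted here, so no precision is lost, while sorting and range queries run on the Gregorian field alone.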


Regards,

Markus



> Cheers
> P.



Re: [Wikidata] calendar model screwup

2015-07-01 Thread Pierpaolo Bernardi
On Wed, Jul 1, 2015 at 8:17 AM, Markus Krötzsch wrote:
> Dear Pierpaolo,
>
> This thread was only about Julian and Gregorian calendar dates. If and how
> other calendar models should be supported in some future is another
> (potentially big) discussion. As you said, there are many issues there.
> Let's first make sure that we handle the "easy" 99.9% of cases correctly
> before discussing any more complicated options.

Lydia Pintscher in the starting email explained that there's a model
for calendars, and unfortunately this model could be (and has been)
interpreted in two ways (AFAIU).

My intention was to point out that one of the two interpretations is
not sound.  This leaves the other one as the only viable one.

Cheers
P.
