Re: [Wikidata-tech] API JSON format for warnings

2015-09-01 Thread Markus Krötzsch

On 01.09.2015 16:57, Thiemo Mättig wrote:

Hi,

 > I now identified another format for API warnings [...] from action
"watch"

I'm not absolutely sure, but I think this is not really a different format.
The "warnings" field contains a list of modules. For each module you can
either have a list of "messages", or a plain string. In the latter case
the string is stored under the "*" key you see in both examples.

The relevant code that creates these "warnings" structures for Wikibase
can be found in the ApiErrorReporter class.

The { "name": ..., "parameters": ..., "html": { "*": ... } } thing you
see is a rendering of a Message object. The "html" key can be seen in
\ApiErrorFormatter::addWarningOrError, the "*" is a result from the
Message class.

Hope that helps.


Yes, this is very helpful. Thanks. I had looked at this PHP code, but I 
could not see these things there (strings like "*" are not very 
distinctive, so I was not sure which "*" I was looking at ;-)).


Markus


___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech


Re: [Wikidata-tech] API JSON format for warnings

2015-09-01 Thread Thiemo Mättig
Hi,

> I now identified another format for API warnings [...] from action "watch"

I'm not absolutely sure, but I think this is not really a different format.
The "warnings" field contains a list of modules. For each module you can
either have a list of "messages", or a plain string. In the latter case the
string is stored under the "*" key you see in both examples.

The relevant code that creates these "warnings" structures for Wikibase can
be found in the ApiErrorReporter class.

The { "name": ..., "parameters": ..., "html": { "*": ... } } thing you see
is a rendering of a Message object. The "html" key can be seen in
\ApiErrorFormatter::addWarningOrError, the "*" is a result from the Message
class.

Hope that helps.

Best
Thiemo


Re: [Wikidata-tech] API JSON format for warnings

2015-09-01 Thread Tom Morris
On Tue, Sep 1, 2015 at 7:45 AM, Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> I now identified another format for API warnings.


Obviously such variability in error reporting is going to cause aggravation
for consumers of the API.  Perhaps a single, consistent reporting style
could be developed.

Tom


Re: [Wikidata-tech] API JSON format for warnings

2015-09-01 Thread Markus Krötzsch
I now identified another format for API warnings. For example, I got the 
following from action "watch":


"warnings": {
"watch": {
"*": "The title parameter has been deprecated."
}
}

For comparison, here is again what I got from wbeditentity:

"warnings": {
  "wbeditentity": {
    "messages": [{
      "name": "wikibase-self-conflict-patched",
      "parameters": [],
      "html": {
        "*": "Your edit was patched into the latest version, overriding some of your own intermediate changes."
      }
    }]
  }
}

For action "paraminfo", I managed to trigger multiple warnings. Guess 
how three warnings are reported there!


"warnings": {
  "paraminfo": {
    "*": "The mainmodule parameter has been deprecated.\nThe pagesetmodule parameter has been deprecated.\nThe module \"main\" does not have a submodule \"foo\""
  }
}

I will now implement these two forms. The "html":{"*":"..."} form seems 
a bit risky to implement (will it always be "html"? will it always be 
"*"?), but I could not get any other warning in such a form, so this is 
the one I will support.
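For reference, the two shapes above can be handled by a small defensive extractor like the following (my own sketch in Python, not code from Wikidata Toolkit; the key names "messages", "html", and "*" are taken from the examples above, and the newline-splitting fallback matches the multi-warning paraminfo response):

```python
def extract_warnings(response):
    """Extract warnings from a parsed API response (a dict).

    Handles the two observed shapes:
      {"warnings": {"module": {"*": "text\\nmore text"}}}
      {"warnings": {"module": {"messages": [{"name": ..., "html": {"*": ...}}]}}}

    Returns a list of (module, message_name_or_None, text) tuples.
    """
    warnings = []
    for module, content in response.get("warnings", {}).items():
        if "messages" in content:
            # Structured form, as seen from wbeditentity.
            for message in content["messages"]:
                html = message.get("html", {})
                warnings.append((module, message.get("name"), html.get("*", "")))
        elif "*" in content:
            # Plain-string form; multiple warnings arrive newline-separated.
            for text in content["*"].split("\n"):
                warnings.append((module, None, text))
    return warnings
```

This deliberately ignores keys it does not recognize, which seems safer given the open questions about whether "html" and "*" are stable.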


I wasted some time trying to trace this in the PHP code, but did not 
get to the point where the "messages" or even the "html" key is 
inserted. I got as far as ApiResult.php, where messages end up being 
added in addValue(). This seems to be the same for all modules, more 
or less. I lost the trace after that; I have no idea what happens to 
the messages added in this way, or how they surface again elsewhere 
in the code. There are various JsonFormatter classes, but they are very 
general and do not mention "messages". Neither do the actual ApiMessage 
objects.


Markus



On 30.08.2015 14:22, Markus Krötzsch wrote:



A partial answer would also be helpful (maybe some of my questions are
more tricky than others).

Thanks,

Markus


On 28.08.2015 10:41, Markus Krötzsch wrote:

Hi,

I am wondering how errors and warnings are reported through the API, and
which errors and warnings are possible. There is some documentation on
Wikidata errors [1], but I could not find documentation on how the
warning messages are communicated in JSON. I have seen structures like
this:

{
  "warnings": {
    "wbeditentity": {
      "messages": [{
        "name": "wikibase-self-conflict-patched",
        "parameters": [],
        "html": {
          "*": "Your edit was patched into the latest version, overriding some of your own intermediate changes."
        }
      }]
    }
  }
}

I don't know how to provoke more warnings, or multiple warnings in one
request, so I found it hard to guess how this pattern generalises. Some
questions:

* What is the purpose of the map with the "*" key? Which other keys but
"*" could this map have?
* The key "wbeditentity" points to a list. Is this supposed to encode
multiple warnings of this type?
* I guess the "name" is a message name, and "parameters" are message
"arguments" (as they are called in action="query") for the message?
* Is this the JSON pattern used in all warnings or can there also be
other responses from wbeditentity?
* Is this the JSON pattern used for warnings in all Wikibase actions or
can there also be other responses from other actions?
* Is there a list of relevant warning codes anywhere?
* Is there a list of relevant error codes anywhere? The docs in [1]
point to paraminfo (e.g.,
http://www.wikidata.org/w/api.php?action=paraminfo&modules=wbeditentity)
but there are no errors mentioned there.

Thanks,

Markus

[1] https://www.mediawiki.org/wiki/Wikibase/API#Errors








Re: [Wikidata-tech] how is the datetime value with precision of one year stored

2015-09-01 Thread Markus Krötzsch

On 01.09.2015 10:53, Richard Light wrote:



On 01/09/2015 09:26, Markus Krötzsch wrote:

On 01.09.2015 05:17, Stas Malyshev wrote:

Hi!


I would have thought that the correct approach would be to encode these
values as gYear, and just record the four-digit year.


While we do have a ticket for that
(https://phabricator.wikimedia.org/T92009) it's not that simple since
many triple stores consider dateTime and gYear to be completely
different types and as such some queries between them would not work.



I agree. Our original RDF exports in Wikidata Toolkit are still using
gYear, but I am not sure that this is a practical approach. In
particular, this does not solve the encoding of time precisions in
RDF. It only introduces some special cases for year (and also for
month and day), but it cannot be used to encode decades, centuries, etc.

My current view is that it would be better to encode the actual time
point with maximal precision, and to keep the Wikidata precision
information independently. This applies to the full encoding of time
values (where you have a way to give the precision as a separate value).

For the simple encoding, where the task is to encode a Wikidata time
in a single RDF literal, things like gYear would make sense. At least
full precision times (with time of day!) would be rather misleading
there.

In any case, when using full precision times for cases with limited
precision, it would be good to create a time point for RDF based on a
uniform rule. Easiest option that requires no calendar support: use
the earliest second that is within the given interval. So "20th
century" would always lead to the time point "1900-01-01T00:00:00". If
this is not done, it will be very hard to query for all uses of "20th
century" in the data.

This is an issue which the cultural heritage community has been dealing
with for decades (:-) ).

In short, a single date is never going to do an adequate job of
representing (a) a period over which an event happened and (b)
uncertainty over the start and/or end point in this period.  These
periods will almost never neatly fit into years, decades, centuries,
etc.: these are just a convenience for grouping approximations
together.  Representing e.g. '3.1783 - 12.1820' as either decades or
centuries is going to give a very misleading version of what you
actually know about the period (and you still can't reduce it to a
single 'date thing').

I think that you need at least two dates to represent historical event
dating with any sort of honesty and flexibility.  What those dates
should be is a matter for discussion: the CIDOC CRM for example has the
concept of "ongoing throughout" and "at some time within", which are
respectively the minimal and maximal periods associated with an event.
Common museum practice in the U.K. is to record 'start date' and 'end
date', each with a possible qualification as regards its precision.


Similar considerations have influenced Wikidata to some extent: there 
are hidden "before" and "after" parameters for each time, which are 
intended to create a time interval around a "main" value. The idea, as I 
understand it, was that "before" and "after" are non-negative integers 
that specify the number of precision units by which the interval 
extends. For example, with precision set to "day", these would be 
numbers of whole days.


So far, this has not been implemented on the UI level, and many existing 
"before" and "after" values are somewhat random and cannot be used. My 
proposal would amount to choosing the time point in such a way that 
"before"=0 and "after"=1 yield the current coarse-grained notion of 
precision.
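Under this reading, the interval implied by "before" and "after" could be computed roughly as follows (a hypothetical sketch of the intended semantics, not implemented code; the fixed-length unit_days parameter is a simplification, since real calendar units such as months vary in length):

```python
from datetime import date, timedelta

def interval_from_before_after(main, before, after, unit_days):
    """Compute the time interval implied by the hypothetical
    "before"/"after" parameters, assuming a fixed-length unit.

    main:      the stated time point (a datetime.date)
    before:    number of units the interval extends before main
    after:     number of units it extends after main
    unit_days: length of one precision unit in days (e.g. 1 for "day")
    """
    start = main - timedelta(days=before * unit_days)
    end = main + timedelta(days=after * unit_days)
    return start, end

# With precision "day", "before"=0 and "after"=1, a main value of
# 1900-01-01 covers exactly that one day:
# interval_from_before_after(date(1900, 1, 1), 0, 1, 1)
# -> (date(1900, 1, 1), date(1900, 1, 2))
```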


In any case, it is clear that imprecise times on Wikidata always have 
"at some time within" semantics. "Ongoing throughout" is captured by 
specifying "start date" and "end date", as you can see on many statements.


Markus





--
*Richard Light*








Re: [Wikidata-tech] how is the datetime value with precision of one year stored

2015-09-01 Thread Richard Light



On 01/09/2015 09:26, Markus Krötzsch wrote:

On 01.09.2015 05:17, Stas Malyshev wrote:

Hi!


I would have thought that the correct approach would be to encode these
values as gYear, and just record the four-digit year.


While we do have a ticket for that
(https://phabricator.wikimedia.org/T92009) it's not that simple since
many triple stores consider dateTime and gYear to be completely
different types and as such some queries between them would not work.



I agree. Our original RDF exports in Wikidata Toolkit are still using 
gYear, but I am not sure that this is a practical approach. In 
particular, this does not solve the encoding of time precisions in 
RDF. It only introduces some special cases for year (and also for 
month and day), but it cannot be used to encode decades, centuries, etc.


My current view is that it would be better to encode the actual time 
point with maximal precision, and to keep the Wikidata precision 
information independently. This applies to the full encoding of time 
values (where you have a way to give the precision as a separate value).


For the simple encoding, where the task is to encode a Wikidata time 
in a single RDF literal, things like gYear would make sense. At least 
full precision times (with time of day!) would be rather misleading 
there.


In any case, when using full precision times for cases with limited 
precision, it would be good to create a time point for RDF based on a 
uniform rule. Easiest option that requires no calendar support: use 
the earliest second that is within the given interval. So "20th 
century" would always lead to the time point "1900-01-01T00:00:00". If 
this is not done, it will be very hard to query for all uses of "20th 
century" in the data.
This is an issue which the cultural heritage community has been dealing 
with for decades (:-) ).


In short, a single date is never going to do an adequate job of 
representing (a) a period over which an event happened and (b) 
uncertainty over the start and/or end point in this period.  These 
periods will almost never neatly fit into years, decades, centuries, 
etc.: these are just a convenience for grouping approximations 
together.  Representing e.g. '3.1783 - 12.1820' as either decades or 
centuries is going to give a very misleading version of what you 
actually know about the period (and you still can't reduce it to a 
single 'date thing').


I think that you need at least two dates to represent historical event 
dating with any sort of honesty and flexibility.  What those dates 
should be is a matter for discussion: the CIDOC CRM for example has the 
concept of "ongoing throughout" and "at some time within", which are 
respectively the minimal and maximal periods associated with an event.  
Common museum practice in the U.K. is to record 'start date' and 'end 
date', each with a possible qualification as regards its precision.


Richard



Markus



--
*Richard Light*


Re: [Wikidata-tech] how is the datetime value with precision of one year stored

2015-09-01 Thread Markus Krötzsch

On 01.09.2015 05:17, Stas Malyshev wrote:

Hi!


I would have thought that the correct approach would be to encode these
values as gYear, and just record the four-digit year.


While we do have a ticket for that
(https://phabricator.wikimedia.org/T92009) it's not that simple since
many triple stores consider dateTime and gYear to be completely
different types and as such some queries between them would not work.



I agree. Our original RDF exports in Wikidata Toolkit are still using 
gYear, but I am not sure that this is a practical approach. In 
particular, this does not solve the encoding of time precisions in RDF. 
It only introduces some special cases for year (and also for month and 
day), but it cannot be used to encode decades, centuries, etc.


My current view is that it would be better to encode the actual time 
point with maximal precision, and to keep the Wikidata precision 
information independently. This applies to the full encoding of time 
values (where you have a way to give the precision as a separate value).


For the simple encoding, where the task is to encode a Wikidata time in 
a single RDF literal, things like gYear would make sense. At least full 
precision times (with time of day!) would be rather misleading there.


In any case, when using full precision times for cases with limited 
precision, it would be good to create a time point for RDF based on a 
uniform rule. Easiest option that requires no calendar support: use the 
earliest second that is within the given interval. So "20th century" 
would always lead to the time point "1900-01-01T00:00:00". If this is 
not done, it will be very hard to query for all uses of "20th century" 
in the data.
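The uniform rule sketched above could look like this (my own illustrative sketch; the precision names here are descriptive strings rather than Wikidata's internal numeric precision codes, and "20th century" is read as the years 1900-1999, matching the 1900-01-01 example):

```python
def earliest_time_point(year, precision):
    """Return the earliest second within the interval denoted by a
    year stated at limited precision, as an ISO 8601 string.

    For example, any year in 1900-1999 at precision "century" maps
    to 1900-01-01T00:00:00, so all uses of "20th century" become
    the same queryable time point.
    """
    if precision == "century":
        year = (year // 100) * 100
    elif precision == "decade":
        year = (year // 10) * 10
    # For precision "year" the year is used as given; months and days
    # default to the earliest value, January 1st at midnight.
    return "%04d-01-01T00:00:00" % year
```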


Markus
