I am still trying to catch up with the whole discussion and to distill the
results, both here and on the wiki.

In the meantime, I have built a prototype showing how a complex model
can still be entered in a simple fashion. A demo can be found here:

<http://simia.net/valueparser/>

The prototype is not internationalized (i18n).

The user has to enter only the value, in a hopefully intuitive way (try it
out), and the full interpretation is then displayed there (which,
admittedly, is not intuitive).
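
To make the idea concrete, here is a minimal sketch of how an entered string
could be split into a value plus the totalDigits and fractionDigits discussed
in the quoted thread below. It is not the prototype's code; the function name
parse_entry is hypothetical, and it assumes a plain decimal string with "." as
the decimal separator and no unit.

```python
from decimal import Decimal


def parse_entry(text):
    """Split a plain decimal string into (value, total_digits, fraction_digits).

    Assumption: '.' is the decimal separator and the string carries no unit
    or thousands separators; a real parser would have to handle both.
    """
    value = Decimal(text.strip())
    _sign, digits, exponent = value.as_tuple()
    # Digits after the decimal point, exactly as written (trailing zeros count).
    fraction_digits = max(0, -exponent)
    # Digits as written; leading zeros such as in "0.05" are not counted.
    total_digits = len(digits)
    return value, total_digits, fraction_digits


print(parse_entry("100.230"))  # (Decimal('100.230'), 6, 3)
```

Using Decimal rather than float keeps trailing zeros, which is what carries the
precision information here.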

Cheers,
Denny





2012/12/20 <jmccl...@hypergrove.com>

> (Proposal 3, modified)
> * value (xsd:double or xsd:decimal)
>
> * unit (a wikidata item)
>
> * totalDigits (xsd:smallint)
> * fractionDigits (xsd:smallint)
> * originalUnit (a wikidata item)
> * originalUnitPrefix (a wikidata item)
> JMc: I rearranged the list a bit and suggested simpler naming
>
> JMc: Is not originalUnitPrefix directly derived from originalUnit?
>
> JMc: It may be more efficient to store, not reconstruct, the original
> value. It may even be better to store the original value somewhere else
> entirely, earlier in the process, e.g. within the context that you indicate
> would be worthwhile to capture, because I wouldn't expect a lot of
> retrievals; but you certainly anticipate usage patterns better than I do.
>
> How about just:
>
>
> Datatype: .number  (Proposal 4)
>
> -----------------------------------------
>   :value (xsd:double or xsd:decimal)
>
>   :unit (a wikidata item)
>   :totalDigits (xsd:smallint)
>   :fractionDigits (xsd:smallint)
>
>
>   :original (a wikidata item that is a number object)
>
> On 20.12.2012 03:08, Gregor Hagedorn wrote:
>
> On 20 December 2012 02:20,  <jmccl...@hypergrove.com> wrote:
>
> For me the question is how to name the precision information. Do not the
> XSD facets "totalDigits" and "fractionDigits" work well enough? I mean
>
> Yes, that would be one way of modeling it. And I agree with you that,
> although the xsd attributes were originally devised for datatypes,
> there is nothing wrong with re-using them for quantities and
> measurements.
>
> So one way of expressing a measurement with significant digits is:
> (Proposal 1)
> * normalizedValue
> * totalDigits
> * fractionDigits
> * originalUnit
> * normalizedUnit
>
> To recover the original information (e.g. that the original value was
> in feet with a given number of significant digits) the software must
> convert normalizedUnit to originalUnit, scale to totalDigits with
> fractionDigits, calculate the remaining powers of ten, and then use
> information, which must be stored together with each unit, on whether
> the remainder should be expressed using an SI unit prefix (exa, tera,
> giga, mega, kilo, hecto, deca, centi, etc.). Some units use them, others
> do not, and some units use only some. Hectoliter is common, hectometer
> would be very odd. This is slightly complicated by the fact that for
> some units prefix usage in lay topics differs from scientific use.
>
> If all numbers were expressed ONLY as total digits with fraction
> digits and unit prefix, i.e. with no power-of-ten exponent, the above
> would be sufficiently complete. However, without additional
> information it does not allow recovering the entry:
>
> 100.230 * 10^3 tons
> (value 1.0023e8, 6 total, 3 fractional digits, original unit tons,
> normalized unit kilogram)
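
A minimal sketch of the ambiguity, under stated assumptions: the mantissa is
100.230 (six digits, three of them fractional), the helper proposal1_fields is
hypothetical, and "kilotons" is only an illustrative second spelling. The
conversion to the normalized unit is left out because it multiplies both
entries by the same factor. An entry written with an explicit power of ten and
one written with an SI prefix reduce to identical Proposal 1 fields, so the
original spelling cannot be told apart afterwards.

```python
from decimal import Decimal


def proposal1_fields(mantissa, power_of_ten, prefix_exponent):
    """Reduce an entry to Proposal 1 style fields (hypothetical helper).

    mantissa        -- the digits as written, e.g. "100.230"
    power_of_ten    -- explicit exponent in the entry, e.g. 3 for "* 10^3"
    prefix_exponent -- exponent carried by an SI prefix, e.g. 3 for "kilo"
    """
    m = Decimal(mantissa)
    parts = m.as_tuple()
    return {
        # Only the product survives; how it was split up is not stored.
        "valueInOriginalUnit": m * Decimal(10) ** (power_of_ten + prefix_exponent),
        "totalDigits": len(parts.digits),
        "fractionDigits": max(0, -parts.exponent),
        "originalUnit": "ton",
    }


with_exponent = proposal1_fields("100.230", power_of_ten=3, prefix_exponent=0)
with_prefix = proposal1_fields("100.230", power_of_ten=0, prefix_exponent=3)
print(with_exponent == with_prefix)  # True
```
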
>
> I had therefore made (on the wiki) the proposal to express it as:
>
> (Proposal 2)
> * normalizedValue
> * significantDigits (= and I am happy with totalDigits instead)
> * originalUnit
> * originalUnitPrefix
> * normalizedUnit
>
> However, I see now that the analysis was wrong: it does need
> fractionDigits in addition to totalDigits, otherwise a similar problem
> may occur, i.e. the distribution of the total order of magnitude of the
> number between non-fractional digits, fractional digits, powers of 10
> and powers of 10 expressed through SI prefixes is still ambiguous.
>
> So the minimal representation seems to be:
>
> (Proposal 3)
> * normalizedValue (xsd:double or xsd:decimal)
> * totalDigits (xsd:smallint)
> * fractionDigits (xsd:smallint)
> * originalUnit (a wikidata item)
> * originalUnitPrefix (a wikidata item)
> * normalizedUnit (a wikidata item)
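
The Proposal 3 fields can be sketched as a small record, together with one
possible way to reconstruct the entry as written. This is an illustration
under assumptions, not an agreed design: plain strings stand in for Wikidata
items, the lookup tables PREFIX_EXPONENT and TO_NORMALIZED are hypothetical,
and the example uses metres rather than the tons example above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Quantity:
    normalized_value: Decimal
    total_digits: int
    fraction_digits: int
    original_unit: str         # stands in for a Wikidata item
    original_unit_prefix: str  # stands in for a Wikidata item; "" means none
    normalized_unit: str       # stands in for a Wikidata item


# Hypothetical lookup tables; real data would come from the unit items.
PREFIX_EXPONENT = {"": 0, "kilo": 3, "mega": 6}
TO_NORMALIZED = {
    ("metre", "metre"): Decimal(1),
    ("foot", "metre"): Decimal("0.3048"),
}


def reconstruct(q):
    """Return (mantissa, power_of_ten) of the entry as originally written."""
    factor = TO_NORMALIZED[(q.original_unit, q.normalized_unit)]
    prefix = Decimal(10) ** PREFIX_EXPONENT[q.original_unit_prefix]
    in_prefixed_unit = q.normalized_value / factor / prefix
    integer_digits = q.total_digits - q.fraction_digits
    # Whatever order of magnitude the mantissa cannot hold is the explicit
    # power of ten of the original entry.
    power = in_prefixed_unit.adjusted() - (integer_digits - 1)
    step = Decimal(1).scaleb(-q.fraction_digits)
    mantissa = in_prefixed_unit.scaleb(-power).quantize(step)
    return mantissa, power


plain = Quantity(Decimal("100230"), 6, 3, "metre", "", "metre")
prefixed = Quantity(Decimal("100230"), 6, 3, "metre", "kilo", "metre")
print(reconstruct(plain))     # (Decimal('100.230'), 3)
print(reconstruct(prefixed))  # (Decimal('100.230'), 0)
```

The two records differ only in original_unit_prefix, and that one field is
what separates "100.230 * 10^3 metres" from "100.230 kilometres".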
>
> Adding the originalUnitPrefix has the advantage that it gathers
> knowledge from users and data creators or resources about which unit
> prefix is appropriate in a given context.
>
> I am very critical of the current wikidata plan to solve this problem
> by heuristics; I do not yet see a data set that sufficiently tests the
> heuristics. Gathering information from the data entered and creating a
> formatting heuristics module over the coming years (instead of weeks)
> will be valuable for reformatting. Proposal 3 allows gathering
> this information.
>
> Gregor
>
> Note 1: The question of other means to express accuracy or precision,
> e.g. by error margins, statistical measures of spread such as
> variance, confidence intervals, percentiles, min/max etc. is not yet
> covered.
>
> Given the present discussion, this should probably be separately agreed upon.
>
> Note 2: Wikipedia infoboxes may want to override it; this is for
> data entry, review, curation, and a default display where no other
> is defined.
>
> _______________________________________________
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l


-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/681/51985.