[Since this is my first post here, a quick introduction: I am a math PhD student and hobbyist coder, interested in the semantic universe, though for now I know little more than the general ideas.]

On Mon, 02 Apr 2012 15:56:37 +0200, JFC Morfin <jef...@jefsey.com> wrote:
3. Then, the third problem no one has addressed yet except ISO 3166,
is variance : two identical particulars (effects, names, data, etc.)
may be different. eg. there are many ways to compute and present the
same date. Are the results to be stored in Wikidata in all these ways
every day and bridges to be built? or are they to be stored as a
single data with the formulas to compute them, then how to be sure
some parameters have not changed (i.e. death of the Emperor) and
computation was not tampered with? Variance is everywhere (actually
variance is most probably Life). ISO 3166 has no variance, because it
is the sovereign reference: the list of States and laws languages
(however, Palestine is in it already, Taiwan is there). ISO documents
are in French, English and possibly in Russian. ISO 3166:1 states
which are the normative languages in every country by reference to
ISO 639 (list of language names). ISO 3166 defines the ccTLDs and is
used in langtags to document languages and cultures. ISO 10646
(supported by UNICODE) is the scripts character coded tables. At
binary layer it is full of variants (same graphs being supported by
different code points).

I’m interested in this point, since on Wikipedia one often encounters uncertainty or variance in some data:
* dates may be known only with some uncertainty (e.g. "born between -345 and -342", or "born in 734 or 736, depending on the source")
* a fixed date may not be sufficient (e.g. "not born"/"not dead" for some mythological or religious characters, or "eternal" for the Eternal President of the Republic of North Korea)
* some physical constants are known only up to a given precision (e.g. the Avogadro constant)
* names whose spelling is not fixed because they come from an oral tradition
* the nationality of some people changed during their lifetime, so in some cases it cannot be treated as a single datum (e.g. Einstein)
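
To make the date cases above concrete, here is a minimal sketch of how such a value could be represented. It is only an illustration under my own assumptions: the class `UncertainYear` and its fields are hypothetical names, not anything defined by Wikidata.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class UncertainYear:
    """Hypothetical container for a year that is not a single fixed value."""
    earliest: Optional[int] = None       # lower bound of a range, if any
    latest: Optional[int] = None         # upper bound of a range, if any
    alternatives: Tuple[int, ...] = ()   # competing values from different sources
    applicable: bool = True              # False for "not born" / "eternal" cases
    note: str = ""                       # free-text qualifier

    def may_be(self, year: int) -> bool:
        """Return True if `year` is compatible with what is recorded."""
        if not self.applicable:
            return False
        if self.alternatives:
            return year in self.alternatives
        if self.earliest is not None and year < self.earliest:
            return False
        if self.latest is not None and year > self.latest:
            return False
        return True


# "born between -345 and -342"
born_range = UncertainYear(earliest=-345, latest=-342)
# "born in 734 or 736, depending on the source"
born_either = UncertainYear(alternatives=(734, 736))
# no death date applies at all
never_died = UncertainYear(applicable=False, note="eternal")
```

With this layout, `born_range.may_be(-343)` is true while `born_either.may_be(735)` is false, so a range and a set of alternatives stay distinguishable instead of being flattened into one date.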

Sébastien

PS: just curious: from what I understood, Unicode defines normalization rules (form C) to ensure that a given glyph has a unique code point representation, no?
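
As a small check of what form C does, here is a sketch using Python's standard `unicodedata` module. The variable names are mine, and the example covers only two cases: a precomposed versus decomposed "é", and a Latin versus Cyrillic capital A, which look alike but are separate characters.

```python
import unicodedata

precomposed = "\u00e9"   # é as a single code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

# Before normalization the two strings differ at the code point level.
differ_before = precomposed != decomposed

# After NFC both collapse to the same code point sequence.
nfc_a = unicodedata.normalize("NFC", precomposed)
nfc_b = unicodedata.normalize("NFC", decomposed)
same_after_nfc = nfc_a == nfc_b

# Look-alike letters from different scripts are not unified by NFC.
latin_a = "A"            # LATIN CAPITAL LETTER A
cyrillic_a = "\u0410"    # CYRILLIC CAPITAL LETTER A
still_distinct = (
    unicodedata.normalize("NFC", latin_a)
    != unicodedata.normalize("NFC", cyrillic_a)
)
```

So form C removes the composed/decomposed variance, but as far as I can tell it does not make every visually identical glyph map to one code point.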

_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
