Temlakos,

I am travelling with infrequent email access, so I am only now able to 
reply. First of all, thanks a lot for your terrific efforts! International 
time zones and am/pm are very useful. Calendar models are also great; my only 
concern is that it must be sufficiently clear to the user which calendar model 
is actually used (I can imagine potential confusion between "am" and "AM", for 
example).

One more thing I would like to know is how you store dates internally. 
Currently, each date has a sortkey that simply provides an (approximate) order 
for comparing dates, and a precise string representation that is an easily 
parsed record of the proleptic Gregorian version of the date. Do you intend to 
keep using this format internally (this would not mean that users would 
have to see proleptic Gregorian in the output, of course)? Or will you extend 
the format to also store the calendar model?

There is a problem with the first approach: current dates preserve 
imprecision, e.g. if only a year but no month or day is given, then this is 
faithfully represented in the database (instead of assuming Jan 1st to be 
the implicit date). When converting dates into another calendar model, 
this imprecision cannot be maintained: you need a precise point in time to 
convert. There seem to be three options:

(1) If we want to treat the precision very accurately, then a single year 
would represent an interval (Jan 1 00:00 to Dec 31 23:59). When converting 
between calendars, this interval unfortunately shifts and no longer starts 
and ends at the same time as a year of the target calendar. So if we were to 
represent this interval in proleptic Gregorian, we may need to store both 
boundaries of the interval.
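
To make (1) concrete, here is a minimal PHP sketch (not SMW code; the helper 
names toJdn, jdnToGregorian and julianYearAsGregorianInterval are made up for 
this mail). It assumes the standard integer Julian Day Number formulas, which 
are only valid for dates inside the Julian Period, and shows that a bare 
Julian year needs two Gregorian boundaries:

```php
<?php
// Hypothetical helper: calendar date -> Julian Day Number (JDN).
function toJdn(int $y, int $m, int $d, bool $julian): int
{
    $a  = intdiv(14 - $m, 12);
    $yy = $y + 4800 - $a;
    $mm = $m + 12 * $a - 3;
    $jdn = $d + intdiv(153 * $mm + 2, 5) + 365 * $yy + intdiv($yy, 4);
    return $julian
        ? $jdn - 32083
        : $jdn - intdiv($yy, 100) + intdiv($yy, 400) - 32045;
}

// Hypothetical helper: JDN -> [year, month, day] in proleptic Gregorian.
function jdnToGregorian(int $jdn): array
{
    $f = $jdn + 1401 + intdiv(intdiv(4 * $jdn + 274277, 146097) * 3, 4) - 38;
    $e = 4 * $f + 3;
    $h = 5 * intdiv($e % 1461, 4) + 2;
    $d = intdiv($h % 153, 5) + 1;
    $m = (intdiv($h, 153) + 2) % 12 + 1;
    $y = intdiv($e, 1461) - 4716 + intdiv(14 - $m, 12);
    return [$y, $m, $d];
}

// A year-only Julian date becomes an interval with two Gregorian boundaries.
function julianYearAsGregorianInterval(int $year): array
{
    return [
        jdnToGregorian(toJdn($year, 1, 1, true)),
        jdnToGregorian(toJdn($year, 12, 31, true)),
    ];
}

[$start, $end] = julianYearAsGregorianInterval(1500);
vprintf("%d-%02d-%02d .. ", $start);
vprintf("%d-%02d-%02d\n", $end);
// prints "1500-01-10 .. 1501-01-10"
```

Neither boundary falls on a Gregorian year boundary, so a single "1500" 
cannot describe the interval after conversion.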

(2) However, this precision may not be needed: not giving months or days 
usually just indicates some amount of uncertainty, and is not meant as a 
strict mathematical interval. So it might suffice to do a conversion with a 
precise default date (Jan 1st) and then, after the conversion, reduce the 
result to the same amount of imprecision (e.g. prune the month and day if they 
were not given in the original input). This would not affect the sorting of 
dates, and it might still lead to a fairly natural representation. "Conversion 
drift" is a problem here: when converting the stored (converted) value back to 
the original (input) calendar model, you could end up with a date that is not 
the same as the one that was given originally. This effect could mostly be 
avoided by assuming "central" times for the conversion, e.g. using Jul 1 
instead of Jan 1 as a default.
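
The drift in (2) can be reproduced with a small sketch. This is only an 
illustration under my own assumptions: toJdn, fromJdn and convertYear are 
hypothetical names, and the Julian Day Number formulas are the usual integer 
ones (valid inside the Julian Period). A year-only date is filled in with a 
default month, converted, and pruned back to a year:

```php
<?php
// Hypothetical helper: calendar date -> Julian Day Number (JDN).
function toJdn(int $y, int $m, int $d, bool $julian): int
{
    $a  = intdiv(14 - $m, 12);
    $yy = $y + 4800 - $a;
    $mm = $m + 12 * $a - 3;
    $jdn = $d + intdiv(153 * $mm + 2, 5) + 365 * $yy + intdiv($yy, 4);
    return $julian
        ? $jdn - 32083
        : $jdn - intdiv($yy, 100) + intdiv($yy, 400) - 32045;
}

// Hypothetical helper: JDN -> [year, month, day] in the chosen calendar.
function fromJdn(int $jdn, bool $julian): array
{
    $f = $jdn + 1401;
    if (!$julian) {
        $f += intdiv(intdiv(4 * $jdn + 274277, 146097) * 3, 4) - 38;
    }
    $e = 4 * $f + 3;
    $h = 5 * intdiv($e % 1461, 4) + 2;
    $d = intdiv($h % 153, 5) + 1;
    $m = (intdiv($h, 153) + 2) % 12 + 1;
    $y = intdiv($e, 1461) - 4716 + intdiv(14 - $m, 12);
    return [$y, $m, $d];
}

// Fill in the 1st of a default month, convert to the other calendar,
// then prune the result back to the year.
function convertYear(int $year, bool $fromJulian, int $defaultMonth): int
{
    $jdn = toJdn($year, $defaultMonth, 1, $fromJulian);
    return fromJdn($jdn, !$fromJulian)[0];
}

foreach ([1 => 'Jan 1', 7 => 'Jul 1'] as $month => $label) {
    $stored = convertYear(1500, true, $month);     // Julian -> Gregorian
    $back   = convertYear($stored, false, $month); // Gregorian -> Julian
    echo "$label: stored $stored, read back $back\n";
}
// prints:
// Jan 1: stored 1500, read back 1499
// Jul 1: stored 1500, read back 1500
```

With Jan 1 as the default, the Julian year 1500 comes back as 1499; with 
Jul 1 the round trip is stable in this example.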

(3) Finally, one could make the calendar model part of the DB representation. 
One could still maintain comparability of all dates by using a converted value 
for computing the sortkey. But working with non-Gregorian times in other 
places internally may require further changes in the code (I would have to 
check that).
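
A rough sketch of what (3) could look like, purely as an assumption on my 
side: the class StoredDate, its fields and the function toJdn are hypothetical 
and do not describe the current SMW schema. The record keeps the date as 
entered (including missing parts) together with its calendar model, and only 
the sortkey is computed from a converted value:

```php
<?php
// Hypothetical helper: calendar date -> Julian Day Number (JDN).
function toJdn(int $y, int $m, int $d, bool $julian): int
{
    $a  = intdiv(14 - $m, 12);
    $yy = $y + 4800 - $a;
    $mm = $m + 12 * $a - 3;
    $jdn = $d + intdiv(153 * $mm + 2, 5) + 365 * $yy + intdiv($yy, 4);
    return $julian
        ? $jdn - 32083
        : $jdn - intdiv($yy, 100) + intdiv($yy, 400) - 32045;
}

// Hypothetical record: the date as entered plus its calendar model.
final class StoredDate
{
    public $model; // 'gregorian' or 'julian'
    public $year;
    public $month; // null when not given
    public $day;   // null when not given

    public function __construct(string $model, int $year, ?int $month = null, ?int $day = null)
    {
        $this->model = $model;
        $this->year  = $year;
        $this->month = $month;
        $this->day   = $day;
    }

    // Sortkey from a converted value; missing parts default to 1
    // for ordering only and are never written back to the record.
    public function sortkey(): int
    {
        return toJdn($this->year, $this->month ?? 1, $this->day ?? 1, $this->model === 'julian');
    }
}

$dates = [
    new StoredDate('gregorian', 1600),
    new StoredDate('julian', 1582, 10, 5),
    new StoredDate('gregorian', 1582, 10, 14),
];
usort($dates, function (StoredDate $a, StoredDate $b): int {
    return $a->sortkey() <=> $b->sortkey();
});
foreach ($dates as $d) {
    echo $d->model, ' ', $d->year, "\n";
}
// prints "gregorian 1582", then "julian 1582", then "gregorian 1600"
```

Dates from different calendar models remain comparable through the sortkey, 
while the stored fields keep the original input and its imprecision.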

It seems to me that (3) might actually be the best solution here, although one 
needs to be careful since this probably changes some of the default 
assumptions of the current code. In any case, care must be taken to ensure 
that the existing database entries are still understood properly after 
upgrading SMW.

Greetings from Oxford,

Markus


On Monday, 27 July 2009, Temlakos wrote:
> Everyone:
>
> All right, I have my array, my regex, and an adaptation of a
> filtered-value handler. I'd like everyone to examine this code:
>
> static private $m_tz = array("A" => 1, "ACDT" => 10.5, "ACST" => 9.5,
> "ADT" => -3, "AEDT" => 11,
>     "AEST" => 10, "AKDT" => -8, "AKST" => -9, "AST" => -4, "AWDT" => 9,
> "AWST" => 8,
>     "B" => 2, "BST" => 1, "C" => 3, "CDT" => -5, "CEDT" => 2, "CEST" => 2,
>     "CET" => 1, "CST" => -6, "CXT" => 7, "D" => 4, "E" => 5, "EDT" => -4,
>     "EEDT" => 3, "EEST" => 3, "EET" => 2, "EST" => -5, "F" => 6, "G" => 7,
>     "GMT" => 0, "H" => 8, "HAA" => -3, "HAC" => -5, "HADT" => -9, "HAE"
> => -4,
>     "HAP" => -7, "HAR" => -6, "HAST" => -10, "HAT" => -2.5, "HAY" => -8,
>     "HNA" => -4, "HNC" => -6, "HNE" => -5, "HNP" => -8, "HNR" => -7,
> "HNT" => -3.5,
>     "HNY" => -9, "I" => 9, "IST" => 1, "K" => 10, "L" => 11, "M" => 12,
>     "MDT" => -6, "MESZ" => 2, "MEZ" => 1, "MSD" => 4, "MSK" => 3, "MST"
> => -7,
>     "N" => -1, "NDT" => -2.5, "NFT" => 11.5, "NST" => -3.5, "O" => -2,
> "P" => -3,
>     "PDT" => -7, "PST" => -8, "Q" => -4, "R" => -5, "S" => -6, "T" => -7,
>     "U" => -8, "UTC" => 0, "V" => -9, "W" => -10, "WDT" => 9, "WEDT" => 1,
>     "WEST" => 1, "WET" => 0, "WST" => 8, "X" => -11, "Y" => -12, "Z" => 0);
>
> static private $regexptz =
> "/A(([CEKW]?[DS])T)?|B(ST)?|CXT|[CEW](([DES]|E[DS])T)?|[DFKLOQRSTVXYZ]|" .
> "G(MT)?|H(A[DS]T|[AN][ACEPRTY])?|I(ST)?|M(DT|E(S)?Z|S[DKT])?|N([DFS]T)?|" .
> "P([DS]T)?|U(TC)?/u";
>
> //.....
>
> // Browse the string in advance for time-zone monikers ("EST", "WET",
> // "MESZ", etc.).
> if (preg_match(self::$regexptz, $filteredvalue, $match)) {
>     // Retrieve the offset and add it to the initial time offset value.
>     $this->m_timeoffset = $this->m_timeoffset + (self::$m_tz[$match[0]] / 24);
>     // Delete the tz moniker and the preceding and following characters.
>     $regexp = "/(\040|T){0,1}" . str_replace("+", "\+", $match[0]) .
>         "(\040){0,1}/u";
>     // Value without the tz moniker.
>     $filteredvalue = preg_replace($regexp, '', $filteredvalue);
> }
>
> and tell me whether anything in it will produce results other than what
> I intend.
>
> For now, let us assume that people annotate their dates with the correct
> time-zone monikers and don't misspell anything. (I already know that
> specifying "CWT" instead of "CXT" will likely return an offset of 3
> hours, instead of the intended 7 hours. But it would also leave in place
> an uninterpretable string, which would cause the day/month/year parser
> to throw an error. That kind of unpredictable result encourages the user
> to correct his article code, so I don't mind that at all.)
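
The "CWT" case described above can be checked in isolation. The following is 
only a sketch: parseMoniker is a hypothetical function, the pattern is just 
the C/E/W part of the quoted $regexptz, and the offset table is a two-entry 
excerpt of the quoted $m_tz.

```php
<?php
// Hypothetical helper: returns [offset in hours or null, value without moniker].
function parseMoniker(string $value): array
{
    $pattern = '/CXT|[CEW](([DES]|E[DS])T)?/u'; // excerpt of the quoted regex
    $offsets = ['C' => 3, 'CXT' => 7];          // excerpt of the quoted array
    if (!preg_match($pattern, $value, $match)) {
        return [null, $value];
    }
    // Same stripping idea as in the quoted code: drop the moniker plus one
    // optional preceding space or "T" and one optional following space.
    $strip = "/(\040|T){0,1}" . preg_quote($match[0], '/') . "(\040){0,1}/u";
    return [$offsets[$match[0]] ?? null, preg_replace($strip, '', $value)];
}

foreach (['12:00 CXT', '12:00 CWT'] as $input) {
    [$offset, $rest] = parseMoniker($input);
    printf("%s -> offset %d, rest '%s'\n", $input, $offset, $rest);
}
// prints:
// 12:00 CXT -> offset 7, rest '12:00'
// 12:00 CWT -> offset 3, rest '12:00WT'
```

The misspelled input matches the single letter "C" and leaves "WT" behind for 
the later date parser to reject.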
>
> Basically, I intend, with this code, to store time offsets based on a
> standard international time zone moniker, if a user chose to use that
> instead of an offset written as hours and minutes. I also wish to strip
> that moniker out of the filtered value before it gets to the
> month/day/year parser. To give you an idea of placement: the
> static-private array and variable declarations go directly below the
> member-variable declarations, and the browser code appears immediately
> before the time-value browser. Of course, a user might put in a time
> zone moniker /and/ a numerical offset, in which case the two offsets
> will add up and produce a "resultant offset" to the annotated time. Thus
> if anyone were to annotate a date and time using:
>
> "5:02 -4:00 pm EDT"
>
> the result would be equivalent to 01:02 UTC, not 21:02 UTC as perhaps
> intended.
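
The arithmetic of this example can be sketched as follows. This is not the 
SMW code path: toUtcClock is a hypothetical helper that simply sums all given 
offsets (in hours) and subtracts them from the local time.

```php
<?php
// Hypothetical helper: local clock time plus any number of offsets -> UTC clock.
function toUtcClock(int $hour, int $minute, float ...$offsetsInHours): string
{
    $local  = $hour * 60 + $minute;
    $offset = (int) round(array_sum($offsetsInHours) * 60);
    // UTC = local time minus the total offset, wrapped into one day.
    $utc = (($local - $offset) % 1440 + 1440) % 1440;
    return sprintf('%02d:%02d', intdiv($utc, 60), $utc % 60);
}

echo toUtcClock(17, 2, -4.0), "\n";       // EDT only: prints "21:02"
echo toUtcClock(17, 2, -4.0, -4.0), "\n"; // "-4:00" and EDT: prints "01:02"
```
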
>
> Notice that this system supports the conventional way of specifying a
> time-zone moniker. Hence it supports the symbol "MSD" and not "Moscow
> Summer Time." Similarly, it supports the letters A-I and K-Z and /not/
> the spelled-out military alphabet words Alpha, Bravo, Charlie, and so on
> up to Zulu. (J for Juliet always refers to local time and does not
> designate any specific time zone.)
>
> Now I'm ready to turn my attention to the Julian-day-based calculation
> system. I also wish to preserve the existing system for storing a
> "compressed" value for dates that fall earlier than the "Julian Period"
> (4713 BC, or 4713 vdZ if you prefer). So long as I can reliably flag an
> input of a year earlier than 4713 BC (and I believe I can), I can use
> that as a warning /not/ to invoke my secondary-calendar calculators. (Of
> course, I don't expect anyone to try to annotate a date earlier than 1
> in any given calendar, except for the range 4713 to 1 BC.)
>
> I appreciate Patrick Nagel's earlier comment about 12-hour v. 24-hour
> clock conventions in Chinese writing. If any of you have any further
> insights, please share them.
>
> Temlakos
>
>
> ------------------------------------------------------------------------------
> Let Crystal Reports handle the reporting - Free Crystal Reports 2008
> 30-Day trial. Simplify your report design, integration and deployment - and
> focus on what you do best, core application coding. Discover what's new
> with Crystal Reports now.  http://p.sf.net/sfu/bobj-july
> _______________________________________________
> Semediawiki-devel mailing list
> Semediawiki-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel


-- 
Markus Krötzsch
Semantic MediaWiki    http://semantic-mediawiki.org
http://korrekt.org    mar...@semantic-mediawiki.org

