Temlakos,

I am travelling with infrequent email access, so I am only now able to reply.

First of all, thanks a lot for your terrific efforts! International time zones and am/pm are very useful. Calendar models are also great; my only concern is that it must be sufficiently clear to the user which calendar model is actually used (I can imagine potential confusion between "am" and "AM", for example).
One more thing I would like to know is how you store dates internally. Currently, each date has a sortkey that simply provides an (approximate) order for comparing dates, and a precise string representation that is an easily parsed record of the proleptic Gregorian version of the date. Do you intend to still use this format internally (this would not mean that users would have to see proleptic Gregorian in the output, of course)? Or will you extend the format to also store calendar models?

There is a problem with the first approach: current dates preserve imprecision, e.g. if only a year but no month or day is given, then this is faithfully represented in the database (instead of assuming Jan 1st as the implicit date). Now when converting dates into another calendar model, this imprecision cannot be maintained: you need a precise point in time to convert. There seem to be three options:

(1) If we want to treat the precision very accurately, then a single year would represent an interval (Jan 1 00:00 to Dec 31 23:59). When converting between calendars, this interval unfortunately no longer starts and ends at year boundaries of the target calendar. So if we were to represent this interval in proleptic Gregorian, we might need to store both boundaries of the interval.

(2) However, this precision may not be needed: not giving months or days usually just indicates some amount of uncertainty, and is not meant as a strict mathematical interval. So it might suffice to do a conversion with a precise default date (Jan 1st) and then, after the conversion, reduce the result to the same amount of imprecision (e.g. prune the month and day if they were not given in the original input). This would not affect the sorting of dates, and it might still lead to a fairly natural representation.
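As an illustration only, option (2) could be sketched like this in Python (this is not SMW's PHP code). The function names and the choice of the Julian calendar as the input model are assumptions made for the example; the Julian Day Number formulas are the usual integer ones. The default month is a parameter, so one can check what happens when a stored year is converted back to the input calendar.

```python
def to_jdn(year, month, day, gregorian):
    """Julian Day Number of a date in the (proleptic) Gregorian or Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4
    if gregorian:
        return jdn - y // 100 + y // 400 - 32045
    return jdn - 32083


def from_jdn(jdn, gregorian):
    """Inverse of to_jdn: returns (year, month, day) in the requested calendar."""
    if gregorian:
        a = jdn + 32044
        b = (4 * a + 3) // 146097
        c = a - 146097 * b // 4
    else:
        b = 0
        c = jdn + 32082
    d = (4 * c + 3) // 1461
    e = c - 1461 * d // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return year, month, day


def convert_year(year, from_gregorian, default_month):
    """Convert a year-only date to the other calendar via a default day,
    then prune the result back to a bare year (hypothetical helper)."""
    jdn = to_jdn(year, default_month, 1, from_gregorian)
    converted_year, _month, _day = from_jdn(jdn, not from_gregorian)
    return converted_year


def round_trip(julian_year, default_month):
    """Store a Julian year as proleptic Gregorian, then convert it back."""
    stored = convert_year(julian_year, False, default_month)
    return convert_year(stored, True, default_month)
```

With these definitions, round_trip(1500, default_month=1) gives 1499, while round_trip(1500, default_month=7) gives 1500.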
"Conversion drift" is a problem here: when converting the stored (converted) value back to the original (input) calendar model, you could end up with a date that is not the same as the one that was given originally. This effect could mostly be avoided when assuming "central" times for the conversion, e.g. use Jul 1 instead of Jan 1 as a default. (3) Finally, one could make the calendar model part of the DB representation. One could still maintain comparability of all dates by using a converted value for computing the sortkey. But working with non-Gregorian times in other places internally may require further changes in the code (I would have to check that). It seems to me that (3) might actually be the best solution here, although one needs to be careful since this probably changes some of the default assumptions of the current code. In any case, care must be taken to ensure that the existing database entries are still understood properly after upgrading SMW. Greetings from Oxford, Markus On Montag, 27. Juli 2009, Temlakos wrote: > Everyone: > > All right, I have my array, my regex, and an adaptation of a > filtered-value handler. 
> I'd like everyone to examine this code:
>
> static private $m_tz = array(
>     "A" => 1, "ACDT" => 10.5, "ACST" => 9.5, "ADT" => -3, "AEDT" => 11,
>     "AEST" => 10, "AKDT" => -8, "AKST" => -9, "AST" => -4, "AWDT" => 9,
>     "AWST" => 8, "B" => 2, "BST" => 1, "C" => 3, "CDT" => -5, "CEDT" => 2,
>     "CEST" => 2, "CET" => 1, "CST" => -6, "CXT" => 7, "D" => 4, "E" => 5,
>     "EDT" => -4, "EEDT" => 3, "EEST" => 3, "EET" => 2, "EST" => -5,
>     "F" => 6, "G" => 7, "GMT" => 0, "H" => 8, "HAA" => -3, "HAC" => -5,
>     "HADT" => -9, "HAE" => -4, "HAP" => -7, "HAR" => -6, "HAST" => -10,
>     "HAT" => -2.5, "HAY" => -8, "HNA" => -4, "HNC" => -6, "HNE" => -5,
>     "HNP" => -8, "HNR" => -7, "HNT" => -3.5, "HNY" => -9, "I" => 9,
>     "IST" => 1, "K" => 10, "L" => 11, "M" => 12, "MDT" => -6, "MESZ" => 2,
>     "MEZ" => 1, "MSD" => 4, "MSK" => 3, "MST" => -7, "N" => -1,
>     "NDT" => -2.5, "NFT" => 11.5, "NST" => -3.5, "O" => -2, "P" => -3,
>     "PDT" => -7, "PST" => -8, "Q" => -4, "R" => -5, "S" => -6, "T" => -7,
>     "U" => -8, "UTC" => 0, "V" => -9, "W" => -10, "WDT" => 9, "WEDT" => 1,
>     "WEST" => 1, "WET" => 0, "WST" => 8, "X" => -11, "Y" => -12, "Z" => 0);
>
> // "G(MT)?" rather than "G(MT)", so that the letter zone "G" from the
> // table is matched as well.
> static private $regexptz =
>     "/A(([CEKW]?[DS])T)?|B(ST)?|CXT|[CEW](([DES]|E[DS])T)?|[DFKLOQRSTVXYZ]|" .
>     "G(MT)?|H(A[DS]T|[AN][ACEPRTY])?|I(ST)?|M(DT|E(S)?Z|S[DKT])?|N([DFS]T)?|" .
>     "P([DS]T)?|U(TC)?/u";
>
> //.....
>
> // Browse string in advance for timezone monikers ("EST", "WET", "MESZ",
> // etc.). Static members have to be accessed via self::.
> if (preg_match(self::$regexptz, $filteredvalue, $match)) {
>     // Retrieve the offset and store it as the initial time offset value.
>     $this->m_timeoffset = $this->m_timeoffset + (self::$m_tz[$match[0]] / 24);
>     // Delete the tz moniker and the preceding and following chars.
>     $regexp = "/(\040|T){0,1}" . str_replace("+", "\+", $match[0]) .
>         "(\040){0,1}/u";
>     // Value without the tz moniker.
>     $filteredvalue = preg_replace($regexp, '', $filteredvalue);
> }
>
> and tell me whether anything in it will produce results other than what
> I intend.
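One thing that might be worth checking here, assuming the filtered value can still contain other capitalised words such as month names: the pattern is unanchored, so preg_match returns the leftmost match, which for "1 August 2009 5:02 pm EDT" would be the "A" of "August". Below is a minimal sketch of a stricter lookup, in Python rather than PHP for brevity. The table is an abbreviated copy of the one above, and extract_tz and its "no adjacent letters" rule are hypothetical, not part of SMW. Note that it deliberately behaves differently for misspellings: "CWT" is not matched at all instead of being read as "C".

```python
import re

# Abbreviated copy of the offset table from the quoted code (hours from UTC).
TZ_OFFSETS = {
    "A": 1, "C": 3, "CET": 1, "CEST": 2, "CXT": 7, "E": 5, "EDT": -4,
    "EST": -5, "GMT": 0, "MESZ": 2, "MSD": 4, "P": -3, "UTC": 0, "Z": 0,
}

# Try longer abbreviations first, and only accept a match that is not
# directly preceded or followed by another letter.
_alternatives = "|".join(
    sorted((re.escape(name) for name in TZ_OFFSETS), key=len, reverse=True)
)
TZ_PATTERN = re.compile(r"(?<![A-Za-z])(" + _alternatives + r")(?![A-Za-z])")


def extract_tz(value):
    """Return (offset in hours or None, value with the abbreviation removed)."""
    match = TZ_PATTERN.search(value)
    if match is None:
        return None, value
    offset_hours = TZ_OFFSETS[match.group(1)]
    remaining = value[:match.start()] + " " + value[match.end():]
    # Collapse the whitespace left behind by the removed abbreviation.
    return offset_hours, " ".join(remaining.split())
```

For example, extract_tz("1 August 2009 5:02 pm EDT") returns (-4, "1 August 2009 5:02 pm"). The quoted code divides the offset by 24 before adding it to the stored time offset; the sketch leaves it in hours.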
>
> For now, let us assume that people annotate their dates with the correct
> time-zone monikers and don't misspell anything. (I already know that
> specifying "CWT" instead of "CXT" will likely return an offset of 3
> hours, instead of the intended 7 hours. But it would also leave in place
> an uninterpretable string, which would cause the day/month/year parser
> to throw an error. That kind of unpredictable result encourages the user
> to correct his article code, so I don't mind that at all.)
>
> Basically, I intend, with this code, to store time offsets based on a
> standard international time zone moniker, if a user chooses to use that
> instead of an offset written as hours and minutes. I also wish to strip
> that moniker out of the filtered value before it gets to the
> month/day/year parser. To give you an idea of placement: the
> static-private array and variable declarations go directly below the
> member-variable declarations, and the browser code appears immediately
> before the time-value browser. Of course, a user might put in a time
> zone moniker /and/ a numerical offset, in which case the two offsets
> will add up and produce a "resultant offset" to the annotated time. Thus
> if anyone were to annotate a date and time using:
>
> "5:02 -4:00 pm EDT"
>
> the result would be equivalent to 01:02 UTC (on the following day), not
> 21:02 UTC as perhaps intended.
>
> Notice that this system supports the conventional way of specifying a
> time-zone moniker. Hence it supports the symbol "MSD" and not "Moscow
> Summer Time." Similarly, it supports the letters A-I and K-Z and /not/
> the spelled-out military alphabet words Alpha, Bravo, Charlie, and so on
> up to Zulu. (J for Juliet always refers to local time and does not
> designate any specific time zone.)
>
> Now I'm ready to turn my attention to the Julian-day-based calculation
> system.
> I also wish to preserve the existing system for storing a
> "compressed" value for dates that fall earlier than the "Julian Period"
> (4713 BC, or 4713 vdZ if you prefer). So long as I can reliably flag an
> input of a year earlier than 4713 BC (and I believe I can), I can use
> that as a warning /not/ to invoke my secondary-calendar calculators. (Of
> course, I don't expect anyone to try to annotate a date earlier than 1
> in any given calendar, except for the range 4713 to 1 BC.)
>
> I appreciate Patrick Nagel's earlier comment about 12-hour v. 24-hour
> clock conventions in Chinese writing. If any of you have any further
> insights, please share them.
>
> Temlakos
>
> ------------------------------------------------------------------------------
> Let Crystal Reports handle the reporting - Free Crystal Reports 2008
> 30-Day trial. Simplify your report design, integration and deployment - and
> focus on what you do best, core application coding. Discover what's new
> with Crystal Reports now. http://p.sf.net/sfu/bobj-july
> _______________________________________________
> Semediawiki-devel mailing list
> Semediawiki-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel

--
Markus Krötzsch
Semantic MediaWiki    http://semantic-mediawiki.org
http://korrekt.org    mar...@semantic-mediawiki.org