On Thu, Mar 6, 2008 at 4:38 PM, Markus Krötzsch
<[EMAIL PROTECTED]> wrote:
> First of all: thanks a lot. Comments inline.
>
>
>
>  On Thursday, 6 March 2008, Louis Gerbarg wrote:
>  > I was getting errors trying to store dates in the mid 2290s...
>
>  I thought my schedule was packed with meetings, and you are already planning
>  until the 2290s. What kind of site is that?

I have been experimenting with annotating information about some
science fiction series. Some of that takes place far in the future.
Nothing is on a public facing server, and I have no idea if I will
actually do anything useful with this, but it is more about me working
through some issues with categorization and templating properties than
this particular data set.

>  > here is
>  > a patch that remedies this. It was tested under php 5.2.5 on Mac OS X
>  > 10.5.2 (32 bit). The patch should run correctly on php >= 5.1.3,
>  > though I suspect it will not actually give people the extended ranges
>  > unless you are on more recent releases due to fixes in the DateTime
>  > class. If I did everything correctly it should still accept all the
>  > same input formats it previously did, as well as store everything in
>  > the database the same way, so no data conversion should be necessary,
>  > but I did not test that extensively.
>
>  How does that work? Can the DB handle 64bit large numbers? Is there a
>  performance hit associated to that?

The existing date implementation stores each date in the DB as an XSD
formatted string plus a double precision float, so there is no
particular efficiency issue one way or the other. Altering it to save
the value simply as an integer (32 or 64 bit) would probably be a bit
of a performance win, but extending the range should not negatively
impact what is currently being done. Since the numeric value in the DB
is a double, it cannot actually store a full 64 bit range, but it can
safely store values well past the range of a 32 bit integer (a double
represents integers exactly up to 2^53). Depending on exactly how you
are using the numeric representation vs. the XSD representation, there
could be an interesting failure mode when the numeric value runs out
of precision but the XSD value is still within 64 bits (100+ million
years).
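
To put rough numbers on that limit, here is a minimal sketch. It is
written in Python rather than PHP purely for illustration, and the
constant names are made up for this example; nothing here is taken
from SMW or from the patch.

```python
# An IEEE 754 double has a 53 bit significand, so every integer up to
# 2**53 is stored exactly; beyond that, neighbouring integers collide.
MAX_EXACT_INT = 2**53

# Mean Gregorian year in seconds (365.2425 days); an approximation used
# only to convert the limit into a human-scale figure.
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60

# Exact at the boundary...
exact_at_limit = float(MAX_EXACT_INT) == MAX_EXACT_INT

# ...but the next integer rounds back down to the same double.
collides_past_limit = float(MAX_EXACT_INT + 1) == float(MAX_EXACT_INT)

# How many years of whole-second timestamps fit on each side of the
# epoch before precision is lost.
years_of_exact_seconds = MAX_EXACT_INT / SECONDS_PER_YEAR

print(exact_at_limit, collides_past_limit)
print(f"{years_of_exact_seconds:.3e} years")
```

Under these assumptions the exact range works out to a few hundred
million years of whole seconds in either direction, well short of what
a 64 bit integer can count.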

Strictly speaking, whether or not using a double field like that is
clean is sort of moot, because anyone running a 64 bit build of php
5.2.6-snap is going to end up with strtotime returning 64 bit values,
which should cause the same values as I am generating to end up in the
DB, even without my code changes. If it is a problem it is going to
need to be fixed regardless of whether the patch is applied.
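
As a hedged illustration of why the 2290s need more than 32 bits, the
sketch below (Python again, standing in for PHP; the helper
unix_seconds and the sample date are hypothetical) compares such a
timestamp against the signed 32 bit limit and checks that it survives
a round trip through a double.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1


def unix_seconds(moment):
    """Whole seconds since the Unix epoch, using exact integer arithmetic."""
    return (moment - EPOCH) // timedelta(seconds=1)


# The last instant a signed 32 bit timestamp can represent.
last_32bit_moment = EPOCH + timedelta(seconds=INT32_MAX)

# An arbitrary sample date in the 2290s.
far_future = unix_seconds(datetime(2295, 1, 1, tzinfo=timezone.utc))

fits_in_32_bits = far_future <= INT32_MAX
fits_in_64_bits = far_future <= INT64_MAX

# Storing the value in a double column and reading it back is lossless
# as long as it stays inside the exact integer range of a double.
survives_double = int(float(far_future)) == far_future

print(last_32bit_moment.isoformat())
print(fits_in_32_bits, fits_in_64_bits, survives_double)
```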

>
>  I now have already two patches for extending data ranges: the other one is the
>  Historical Date datatype by Terry A. Hurlbut. I wonder whether this extended
>  datatype could also use your method (AFAIK the historical dates use days
>  instead of seconds internally, right?). I also consider switching to a format
>  that separates parts of dates in different DB fields, so that SMW can
>  distinguish dates without exact day or day time from those dates which just
>  happen to be at a year's 1st of Jan 00:00 ... this would probably also solve
>  the 64bit issue since the components would fit into 32bit.

When you say parts of dates I presume you mean month, day, year, etc.
So long as the parser keeps working with the same wikitext I am not
partial to what I did. I just needed something that worked, and I was
more than happy to share it in case it was useful. I don't really need
anything more than an extension of the current time tracking mechanism
for what I am currently interested in, though the problem you mention
is certainly interesting. The one issue I can foresee with the
representation you are describing is that the boundaries people find
interesting are different. Imagine tracking comic books. Some are
published quarterly, monthly, biweekly, etc. Ideally you want some way
to distinguish between "Winter," "January," and "Week 1," all of which
start at the same time.

I think a more general way to solve it would be to make a new data
type, DateRange, stored as two 64 bit integers. It would effectively
allow you to diminish the precision of a particular date by separating
the two points. Having said that, I don't need that functionality, and
doing it right seems like a lot of work.
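
A minimal sketch of what such a type might look like, assuming a
half-open interval of whole seconds since the Unix epoch. It is in
Python for illustration only; the class layout, the helper utc_seconds,
and the choice of January to March as the "Winter" quarter are all
assumptions made for this example, not anything that exists in SMW.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def utc_seconds(year, month, day):
    """Whole seconds since the epoch for midnight UTC on the given day."""
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return (moment - EPOCH) // timedelta(seconds=1)


@dataclass(frozen=True)
class DateRange:
    start: int  # inclusive, seconds since the epoch
    end: int    # exclusive, seconds since the epoch

    def __post_init__(self):
        # Both endpoints must fit in a signed 64 bit integer and the
        # range must be non-empty.
        if not (INT64_MIN <= self.start < self.end <= INT64_MAX):
            raise ValueError("need INT64_MIN <= start < end <= INT64_MAX")

    @property
    def width_seconds(self):
        """The wider the range, the less precise the date."""
        return self.end - self.start

    def contains(self, other):
        return self.start <= other.start and other.end <= self.end


# Three periods that begin at the same instant but differ in precision.
week_1 = DateRange(utc_seconds(2008, 1, 1), utc_seconds(2008, 1, 8))
january = DateRange(utc_seconds(2008, 1, 1), utc_seconds(2008, 2, 1))
winter = DateRange(utc_seconds(2008, 1, 1), utc_seconds(2008, 4, 1))

print(week_1.start == january.start == winter.start)
print(week_1 != january, winter.contains(january))
```

Because the end point is stored as well, the three values compare as
different even though they share a start, which a single timestamp
cannot express.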

Louis

_______________________________________________
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
