Re: Representing ambiguity in datetime?

2005-05-18 Thread Andrew Dalke
Ron Adam wrote:
 This is a very common problem in genealogy research as well as other 
 sciences that deal with history, such as geology, geography, and archeology.
  ..
 So it seems using 0's for the missing day or month may be how to do it.

Except of course humans like to make things more complicated than that.
Some journals are published quarterly so an edition might be Jan-Mar.
Some countries refer to week numbers, so an event might be in week 12.

I offer no suggestions as to how to handle these cases.
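
If such periods are stored as an earliest/latest pair of dates, as suggested elsewhere in this thread, a quarter or a week number could be expanded roughly as follows. This is only a sketch: quarter_range and iso_week_range are hypothetical helper names, quarters are assumed to be calendar quarters (Jan-Mar is quarter 1), and "week 12" is assumed to mean an ISO 8601 week, which will not match every country's convention.

```python
import calendar
import datetime

def quarter_range(year, quarter):
    """Return (first day, last day) of a calendar quarter, quarter in 1..4."""
    first_month = 3 * (quarter - 1) + 1
    last_month = first_month + 2
    # monthrange() returns (weekday of day 1, number of days in the month).
    last_day = calendar.monthrange(year, last_month)[1]
    return (datetime.date(year, first_month, 1),
            datetime.date(year, last_month, last_day))

def iso_week_range(year, week):
    """Return (Monday, Sunday) of an ISO 8601 week. Needs Python 3.8+."""
    monday = datetime.date.fromisocalendar(year, week, 1)
    return (monday, monday + datetime.timedelta(days=6))

print(quarter_range(2005, 1))
print(iso_week_range(2005, 12))
```

Both helpers return ordinary date objects, so date arithmetic keeps working on either end of the range.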

Andrew

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Representing ambiguity in datetime?

2005-05-18 Thread Dan Christensen
Ron Adam writes:

 So it seems using 0's for the missing day or month may be how to do it.

This doesn't allow more specific amounts of ambiguity.  I suggest
either a pair of dates, which represent the earliest and latest that
the event could have been (and are equal if there is no ambiguity),
or a date plus a number of days of uncertainty, i.e. 
21 June 2005 +- 5 days.
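
A minimal sketch of how the two representations relate, assuming whole-day precision. The names from_uncertainty and to_uncertainty are made up for this illustration and are not part of the datetime module.

```python
import datetime

def from_uncertainty(center, days):
    """Convert 'center +- days' into an (earliest, latest) pair."""
    delta = datetime.timedelta(days=days)
    return (center - delta, center + delta)

def to_uncertainty(earliest, latest):
    """Convert an (earliest, latest) pair into (center, +- days).

    For an odd span the center is rounded down and the half-width is
    rounded up, so the resulting range still covers the original pair.
    """
    span = (latest - earliest).days
    center = earliest + datetime.timedelta(days=span // 2)
    return (center, (span + 1) // 2)

earliest, latest = from_uncertainty(datetime.date(2005, 6, 21), 5)
print(earliest, latest)
print(to_uncertainty(earliest, latest))
```

An exact date is the degenerate case: the pair is equal, or the uncertainty is 0 days.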

Dan


Representing ambiguity in datetime?

2005-05-17 Thread Terry Hancock
What do you do when a date or time is
incompletely specified?  ISTM, that as it is, there is no
formal way to store this --- you have to guess, and there's
no way to indicate that the guess is different from solid
information.  As a result, I have sometimes had to abandon
datetime, even though it seemed like the logical choice for
representing data.

E.g. I might have information like "this paper was published
in May 1997."  There's no way to write that with datetime,
is there?  Even if I just use the date object instead of 
datetime, I still have to actually specify something like 
May 1, 1997 --- fabricating data, which is frequently
undesirable (later on, I might find information saying that
it was actually published May 23, 1997 and I might want
to update the earlier one, or simply evaluate them as 
equal since they are, to within the precision given --- 
for example, I might be trying to decide that two database
entries are really duplicate references to the same paper).

I know that this is somewhat theoretically stated, but I 
have run into concrete problems along the lines of
the above.

I'd say this is analogous to how you might use None
rather than 0 to represent an integer if you don't know
its value (rather than knowing that it is zero).  ISTM, you
ought to be able to specify a date as, e.g.:

d = datetime.date(2005, 5, None)

I realize there might be some complexity with deciding
how to handle datestamp math, but as this situation
occurs frequently in real life, it seems like it shouldn't
be avoided.
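
For reference, a small sketch of what the standard library does with a call like the one proposed above. The helper name try_date is hypothetical and exists only for this illustration.

```python
import datetime

def try_date(year, month, day):
    """Return a date, or the name of the exception that date() raised."""
    try:
        return datetime.date(year, month, day)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__

# None is not an integer, so the constructor rejects it.
print(try_date(2005, 5, None))
# Zero is an integer but not a valid day of the month.
print(try_date(2005, 5, 0))
# The only accepted form requires choosing a specific day.
print(try_date(2005, 5, 1))
```

So neither None nor 0 can stand in for an unknown day with a plain date object; any vagueness has to be recorded outside it.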

How do other people deal with this kind of problem?

Cheers,
Terry

--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks  http://www.anansispaceworks.com



Re: Representing ambiguity in datetime?

2005-05-17 Thread Steve Holden
Terry Hancock wrote:
 What do you do when a date or time is
 incompletely specified?
 [...]
 How do other people deal with this kind of problem?
 
It's not a problem I've had to deal with, but it seems that the simplest 
way to handle it would be to store the value as a datetime and a format. 
That way you can explicitly use 1 or 0 as a default (as appropriate) to 
create the datetime and then ensure it's represented only to the 
required degree of precision by using the supplied format.
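
A rough sketch of that idea, under some assumptions: FormattedDate and same_to_shared_precision are invented names, the format is an ordinary strftime string, and the shorter of two formats is assumed to be the coarser one (true for "%Y", "%Y-%m", "%Y-%m-%d", but not for arbitrary formats).

```python
import datetime
from dataclasses import dataclass

@dataclass(frozen=True)
class FormattedDate:
    value: datetime.date  # unknown parts hold a filler such as day 1
    fmt: str              # strftime format showing only the known parts

    def __str__(self):
        return self.value.strftime(self.fmt)

def same_to_shared_precision(a, b):
    """Compare two values using the coarser (here: shorter) format."""
    fmt = min(a.fmt, b.fmt, key=len)
    return a.value.strftime(fmt) == b.value.strftime(fmt)

vague = FormattedDate(datetime.date(1997, 5, 1), "%Y-%m")
exact = FormattedDate(datetime.date(1997, 5, 23), "%Y-%m-%d")

print(vague)  # only year and month are shown; the filler day stays hidden
print(same_to_shared_precision(vague, exact))
```

The filler day is still stored, so any code that reads value directly has to consult fmt before trusting it.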

Don't know whether this will work for you, just a thought.

regards
  Steve
-- 
Steve Holden        +1 703 861 4237  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
Python Web Programming  http://pydish.holdenweb.com/



Re: Representing ambiguity in datetime?

2005-05-17 Thread John Machin
On Tue, 17 May 2005 17:38:30 -0500, Terry Hancock wrote:

 What do you do when a date or time is
 incompletely specified?
 [...]
 How do other people deal with this kind of problem?

Mostly, badly :-(

Real-life example: due to war-time disruption etc, in some countries
it is common enough to find that the date of birth of someone born in
the 1940s is not known precisely. E.g. on the Hong Kong identity card,
it is possible to find only the year and month of birth, and sometimes
even only the year. Depending on the purpose, legislation and
convention will take the first day of the vague period or the last day
when a calculation is required. Badly == entering into a database the
exact date that was used for the purpose du jour, with no indication
that the source was vague. Consequently a person can have DOB recorded
as 1945-01-01 on one database and 1945-12-31 on another.

Suggested approach in Python (sketch): Don't try to get the datetime
module to solve the problem. Define a fuzzydate class. Internal
representation: I'd suggest earliest possible date and latest possible
date. That way you have valid date instances for doing date
arithmetic. May have different constructors depending on how the
incoming vagueness is specified. 
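
One possible shape for such a class, as a sketch only. The class name FuzzyDate, the constructor names and the may_equal method are illustrative choices, not an existing library.

```python
import calendar
import datetime

class FuzzyDate:
    """A date known only to lie between earliest and latest, inclusive."""

    def __init__(self, earliest, latest):
        if earliest > latest:
            raise ValueError("earliest must not be after latest")
        self.earliest = earliest
        self.latest = latest

    @classmethod
    def from_year(cls, year):
        return cls(datetime.date(year, 1, 1), datetime.date(year, 12, 31))

    @classmethod
    def from_year_month(cls, year, month):
        # monthrange() returns (weekday of day 1, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        return cls(datetime.date(year, month, 1),
                   datetime.date(year, month, last_day))

    @classmethod
    def from_date(cls, date):
        return cls(date, date)

    def is_exact(self):
        return self.earliest == self.latest

    def may_equal(self, other):
        """True if the ranges overlap, i.e. both could be the same day."""
        return (self.earliest <= other.latest
                and other.earliest <= self.latest)

    def __repr__(self):
        return "FuzzyDate(%s, %s)" % (self.earliest, self.latest)

born = FuzzyDate.from_year(1945)
print(born)
print(born.may_equal(FuzzyDate.from_date(datetime.date(1945, 12, 31))))
```

With this representation the 1945-01-01 and 1945-12-31 records from the example are both consistent with the same fuzzy value, and the choice of first or last day can be made per calculation instead of being baked into the stored data.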

HTH,
John


Re: Representing ambiguity in datetime?

2005-05-17 Thread Ron Adam
John Machin wrote:

 On Tue, 17 May 2005 17:38:30 -0500, Terry Hancock wrote:

 What do you do when a date or time is
 incompletely specified?
 [...]
 How do other people deal with this kind of problem?
 
 
 Mostly, badly :-(
 [...]
 Suggested approach in Python (sketch): Don't try to get the datetime
 module to solve the problem. Define a fuzzydate class. Internal
 representation: I'd suggest earliest possible date and latest possible
 date.
 [...]


This is a very common problem in genealogy research as well as other 
sciences that deal with history, such as geology, geography, and archeology.

I agree that some standard way of dealing with fuzzy dates would be a 
good thing.  I think looking at how others do it would be the way to 
start...

A Google search found the following reference buried in a long reference 
page on MySQL.

http://www.dreamlink.net/mysql/manual_Functions.html

 The reason the ranges for the month and day specifiers begin
with zero is that MySQL allows incomplete dates such as
'2004-00-00' to be stored as of MySQL 3.23.


So it seems using 0's for the missing day or month may be how to do it.
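
A sketch of reading and writing that zero convention from Python, since datetime.date itself will not accept a zero month or day. parse_partial and format_partial are hypothetical names, and the 'YYYY-MM-DD' layout is assumed from the example in the quoted manual text.

```python
def parse_partial(text):
    """Parse 'YYYY-MM-DD', treating a zero month or day as unknown."""
    year, month, day = (int(part) for part in text.split("-"))
    # 0 is falsy, so "x or None" maps a zero component to None.
    return (year, month or None, day or None)

def format_partial(year, month=None, day=None):
    """Inverse of parse_partial: write unknown parts back out as zeros."""
    return "%04d-%02d-%02d" % (year, month or 0, day or 0)

print(parse_partial("2004-00-00"))
print(parse_partial("1997-05-00"))
print(format_partial(1997, 5))
```

The zeros then exist only in the stored string; inside the program the unknown parts are None and cannot be mistaken for real values.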

Cheers,
_Ron




Re: Representing ambiguity in datetime?

2005-05-17 Thread Ivan Van Laningham
Hi All--

Ron Adam wrote:
 
 John Machin wrote:
 
  On Tue, 17 May 2005 17:38:30 -0500, Terry Hancock wrote:
 
 
 What do you do when a date or time is
 incompletely specified? 

  The reason the ranges for the month and day specifiers begin
 with zero is that MySQL allows incomplete dates such as
 '2004-00-00' to be stored as of MySQL 3.23.
 
 So it seems using 0's for the missing day or month may be how to do it.
 

This is somewhat the approach I took in order to allow users to specify
an incomplete Mayan date in order to list possibilities.  But instead of
0 (which is a valid entry in most Mayan date components), I used None. 
The web version can be found at

http://www.pauahtun.org/Calendar/tools.html (the Search for Matching
Dates button)

The paper describing the incomplete Mayan date tool is at: 
http://www.pauahtun.org/python_vuh.html
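
The same idea can be sketched for ordinary Gregorian dates. This is not the Mayan tool described above, only an illustration of None as a wildcard; the function name candidates is made up for the example.

```python
import calendar
import datetime

def candidates(year, month=None, day=None):
    """Yield every valid date in year that matches the known components.

    None means "unknown" and matches any value.
    """
    months = range(1, 13) if month is None else [month]
    for m in months:
        days_in_month = calendar.monthrange(year, m)[1]
        days = range(1, days_in_month + 1) if day is None else [day]
        for d in days:
            # Skip combinations that do not exist, such as 30 February.
            if d <= days_in_month:
                yield datetime.date(year, m, d)

may_1997 = list(candidates(1997, 5))
print(len(may_1997), may_1997[0], may_1997[-1])
```

Because None is never a valid component, it stays usable as the wildcard even in a calendar where 0 is a legitimate value.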

Metta,
Ivan
--
Ivan Van Laningham
God N Locomotive Works
http://www.andi-holmes.com/
http://www.foretec.com/python/workshops/1998-11/proceedings.html
Army Signal Corps:  Cu Chi, Class of '70
Author:  Teach Yourself Python in 24 Hours


Re: Representing ambiguity in datetime?

2005-05-17 Thread Peter Hansen
Terry Hancock wrote:
 What do you do when a date or time is
 incompletely specified?  

Doesn't the answer to this pretty much entirely depend on how you are 
going to make use of the information?  What are your use cases?