On 21.03.2011 07:40, Piyush Newe wrote:
On Thu, Mar 17, 2011 at 7:56 PM, Tom Lane<t...@sss.pgh.pa.us>  wrote:

Robert Haas<robertmh...@gmail.com>  writes:
On Thu, Mar 17, 2011 at 9:46 AM, Alvaro Herrera
<alvhe...@commandprompt.com>  wrote:
Keep in mind that the datetime stuff was abandoned by the maintainer
some years ago with quite some rough edges.  Some of it has been fixed,
but a lot of bugs remain.  Looks like this is one of those places and it
seems appropriate to spend some time fixing it.  Since it would involve
a behavior change, it should only go to 9.2, of course.

I wouldn't object to fixing the problem with # of digits>  # of Ys in
9.1, if the fix is simple and clear-cut.  I think we are still
accepting patches to make minor tweaks, like the tab-completion patch
I committed yesterday.  It also doesn't bother me tremendously if we
push it off, but I don't think that anyone's going to be too sad if
TO_DATE('01-jan-2010',  'DD-MON-YYY') starts returning something more
sensible than 3010-01-01.

Agreed, it's certainly not too late for bug fixes in 9.1.  I agree
that this isn't something we would want to tweak in released branches,
but 9.1 isn't there yet.


I feel the patch for the same would be easier and was attached in the
initial mail of this mail thread. For your ready reference, I am attaching
the same patch here again.


Having said that, it's not entirely clear to me what sane behavior is
here.  Personally I would expect that an n-Ys format spec would consume
at most n digits from the input.  Otherwise how are you going to use
to_date to pick apart strings that don't have any separators?

Yeah, seems reasonable.

 So
I think the problem is actually upstream of the behavior complained of
here.  However, what we should first do is see what Oracle does in such
cases, because the main driving factor for these functions is Oracle
compatibility not what might seem sane in a vacuum.

Following is the extended chart which is comparing the behavior of Oracle,
PG&  EDBAS.

*Data Format Oracle PostgreSQL EDBAS*

TO_DATE('01-jan-1',  'DD-MON-Y') 01-JAN-2011 01-JAN-2001 01-JAN-2001
TO_DATE('01-jan-1',  'DD-MON-YY') 01-JAN-2001 01-JAN-2001 01-JAN-2001
TO_DATE('01-jan-1',  'DD-MON-YYY') 01-JAN-2001 01-JAN-2001 01-JAN-2001
TO_DATE('01-jan-1',  'DD-MON-YYYY') 01-JAN-0001 01-JAN-0001 01-JAN-0001

In this case, all the cases are in sync except the 1st one. I didn't
understand why Oracle is interpreting year '1' as '2011'.

It makes sense to me. Year "1", when dat format is "Y", means the year closest to current date that ends with 1. Or maybe the year that ends with 1 in the current decade. This is analoguous to how two-digit years are interpreted (except that we've hardcoded that the "current date" to compare against is year 2000 - an assumption that will start to bite us some time before year 2100).

*Data Format Oracle PostgreSQL EDBAS*

TO_DATE('01-jan-10',  'DD-MON-Y') Error 01-JAN-2010 Error
TO_DATE('01-jan-10',  'DD-MON-YY') 01-JAN-2010 01-JAN-2010 01-JAN-2010
TO_DATE('01-jan-10',  'DD-MON-YYY') 01-JAN-2010 01-JAN-2010 01-JAN-2010
TO_DATE('01-jan-10',  'DD-MON-YYYY') 01-JAN-0010 01-JAN-0010 01-JAN-0010

In this case, it seems in last 3 cases PostgreSQL is behaving correctly.
Oracle is throwing error in 1st case since the Format ('Y') is lesser than
the actual value ('10'). But PostgreSQL is ignoring this case and throwing
whatever is input. The output is might not be the same was user is
expecting.

*Data Format Oracle PostgreSQL EDBAS*

TO_DATE('01-jan-067',  'DD-MON-Y') Error 01-JAN-2067 Error
TO_DATE('01-jan-111',  'DD-MON-YY') 01-JAN-0111 01-JAN-2011 Error
TO_DATE('01-jan-678',  'DD-MON-YYY') 01-JAN-2678 01-JAN-1678 01-JAN-2678
TO_DATE('01-jan-001',  'DD-MON-YYYY') 01-JAN-0001 01-JAN-0001 01-JAN-0001

In this case, just last case was correct in PG. Rest other cases are not in
sync with Oracle, rather the output is vague.

*Data Format Oracle PostgreSQL EDBAS*

TO_DATE('01-jan-2010',  'DD-MON-Y') Error 01-JAN-4010 Error
TO_DATE('01-jan-2010',  'DD-MON-YY') 01-JAN-2010 01-JAN-3910 Error
TO_DATE('01-jan-2010',  'DD-MON-YYY') Error 01-JAN-3010 Error
TO_DATE('01-jan-2010',  'DD-MON-YYYY') 01-JAN-2010 01-JAN-2010 01-JAN-2010

In this case, PG is giving wrong output in first 3 cases. Those need to get
rectified. Oracle is throwing error in 1st and 3rd case and the reason is,
the format is lesser than the actual value. It seems this rule is not
applicable for 2nd case in Oracle.

Yeah, quite inconsistent :-(.

These results are not in favor of the idea that a format with n Ys always consumess up to n digits from the input. With that rule, to_date('01-jan-2010', 'DD-MON-YY') would return "01-JAN-2020", which isn't what Oracle does and seems quite surprising too.

So ignoring the cases where Oracle throws an error but PostgreSQL doesn't, there's four cases where the results differ:

> *Data Format Oracle PostgreSQL EDBAS*
> TO_DATE('01-jan-1',  'DD-MON-Y') 01-JAN-2011 01-JAN-2001 01-JAN-2001
> TO_DATE('01-jan-111',  'DD-MON-YY') 01-JAN-0111 01-JAN-2011 Error
> TO_DATE('01-jan-678',  'DD-MON-YYY') 01-JAN-2678 01-JAN-1678 01-JAN-2678
> TO_DATE('01-jan-2010',  'DD-MON-YY') 01-JAN-2010 01-JAN-3910 Error

IMHO our current behavior in 2nd and 4th case is so bizarre that we should change them to match Oracle. I think we should fix the 1st too, the notion that a single-digit year means something between 2000-2009 seems pretty useless (granted, using a single digit for year is brain-dead to begin with).

The 3rd one is debatable. The range for three-digit years is currently 1100-2099, which is enough range for many applications. But should we change it for the sake of matching Oracle's behavior? Not that anyone uses YYY in practice, but still.

BTW, whatever behavior we choose, this needs to be documented. I don't see anything in the docs on how Y, YY or YYY are expanded.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to