Hi,

I was randomly testing some date related stuff on PG & observed that the
outputs were wrong.

e.g.
postgres=# SELECT TO_DATE('01-jan-2010',  'DD-MON-YY');
  to_date
------------
 3910-01-01  <--------- Look at this
(1 row)

postgres=# SELECT TO_DATE('01-jan-2010',  'DD-MON-YYYY');
  to_date
------------
 2010-01-01
(1 row)

User can provide the year in any format i.e. 1-digit to 4-digit, in this
case the internal representation of century should be logically correct.

Considering this situation, I am drawing the table below & logging my
observation here. This might help us to demonstrate where we are lacking in
PG.

*Data Format                 PostgreSQL EDBAS*
*TO_DATE('01-jan-1',  'DD-MON-Y')         2001-01-01 01-JAN-2001*
*TO_DATE('01-jan-1',  'DD-MON-YY')         2001-01-01 01-JAN-2001*
*TO_DATE('01-jan-1',  'DD-MON-YYY')         2001-01-01 01-JAN-2001*
*TO_DATE('01-jan-1',  'DD-MON-YYYY') 0001-01-01 01-JAN-0001*

In this case, all the cases seems correct. Also the YEAR part in the output
is seems logical.

*Data Format                 PostgreSQL EDBAS*
*TO_DATE('01-jan-10',  'DD-MON-Y')         2010-01-01 Error *
*TO_DATE('01-jan-10',  'DD-MON-YY')         2010-01-01 01-JAN-2010*
*TO_DATE('01-jan-10',  'DD-MON-YYY') 2010-01-01 01-JAN-2010*
*TO_DATE('01-jan-10',  'DD-MON-YYYY') 0010-01-01 01-JAN-0010*

In this case, it seems in last 3 cases PG is behaving correctly. Whereas in
1st case the output is not correct since the Format ('Y') is lesser than the
actual input ('10'). But PG is ignoring this condition and throwing whatever
is input. The output year is might not be the year, what user is expecting.
Hence PG should throw an error.

*Data Format                 PostgreSQL EDBAS*
*TO_DATE('01-jan-067',  'DD-MON-Y')         2067-01-01 Error*
*TO_DATE('01-jan-111',  'DD-MON-YY')         2011-01-01 Error*
*TO_DATE('01-jan-678',  'DD-MON-YYY') 1678-01-01 01-JAN-2678*
*TO_DATE('01-jan-001',  'DD-MON-YYYY') 0001-01-01 01-JAN-0001*

In this case, only last case seems correct in PG. Rest other cases are might
not be logical, rather the output is vague. In PG, I haven't seen any
document which is saying something like this, if year is 111...999 then the
century would be 2000 and 001...099 then then century would be 1000.
However, the 1st and 2nd case should throw an error since the output format
('Y' & 'YY') are really not matching with the Input ('067' & '111'),
respectively.

*Data Format                 PostgreSQL EDBAS*
*TO_DATE('01-jan-2010',  'DD-MON-Y')         4010-01-01 Error*
*TO_DATE('01-jan-2010',  'DD-MON-YY') 3910-01-01 Error *
*TO_DATE('01-jan-2010',  'DD-MON-YYY') 3010-01-01 Error *
*TO_DATE('01-jan-2010',  'DD-MON-YYYY') 2010-01-01 01-JAN-2010*

In this case, PG is giving wrong output in first 3 cases. Those need to get
rectified. Again it should throw error in these cases, because Output Format
is not matching with Input Data. The exception here is the 2nd case, where
century is well-defined.

After observing the all above cases, the summary would be, if the output
format is lesser than the actual input value, then it should throw an error.
Considering this thumb rule, we can fix the wrong outputs in PG. I have made
necessary changes to the code & attaching the patch with this email.

In the patch, I have written one centralize function which will decide the
century depends upon the given Input format.

Thoughts ? Any other ideas on this ?

-- 
Piyush S Newe
Principal Engineer
EnterpriseDB
office: +91 20 3058 9500
www.enterprisedb.com

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are not
the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index aba1145..ad42126 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -989,6 +989,7 @@ static DCHCacheEntry *DCH_cache_getnew(char *str);
 static NUMCacheEntry *NUM_cache_search(char *str);
 static NUMCacheEntry *NUM_cache_getnew(char *str);
 static void NUM_cache_remove(NUMCacheEntry *ent);
+static int DecideCentury(int *in, int len);
 
 
 /* ----------
@@ -2733,21 +2734,14 @@ DCH_from_char(FormatNode *node, char *in, TmFromChar *out)
 			case DCH_IYYY:
 				from_char_parse_int(&out->year, &s, n);
 				out->yysz = 4;
+				DecideCentury(&out->year, out->yysz);
 				s += SKIP_THth(n->suffix);
 				break;
 			case DCH_YYY:
 			case DCH_IYY:
 				from_char_parse_int(&out->year, &s, n);
 				out->yysz = 3;
-
-				/*
-				 * 3-digit year: '100' ... '999' = 1100 ... 1999 '000' ...
-				 * '099' = 2000 ... 2099
-				 */
-				if (out->year >= 100)
-					out->year += 1000;
-				else
-					out->year += 2000;
+				DecideCentury(&out->year, out->yysz);
 				s += SKIP_THth(n->suffix);
 				break;
 			case DCH_YY:
@@ -2755,14 +2749,7 @@ DCH_from_char(FormatNode *node, char *in, TmFromChar *out)
 				from_char_parse_int(&out->year, &s, n);
 				out->yysz = 2;
 
-				/*
-				 * 2-digit year: '00' ... '69'	= 2000 ... 2069 '70' ... '99'
-				 * = 1970 ... 1999
-				 */
-				if (out->year < 70)
-					out->year += 2000;
-				else
-					out->year += 1900;
+				DecideCentury(&out->year, out->yysz);
 				s += SKIP_THth(n->suffix);
 				break;
 			case DCH_Y:
@@ -2770,10 +2757,7 @@ DCH_from_char(FormatNode *node, char *in, TmFromChar *out)
 				from_char_parse_int(&out->year, &s, n);
 				out->yysz = 1;
 
-				/*
-				 * 1-digit year: always +2000
-				 */
-				out->year += 2000;
+				DecideCentury(&out->year, out->yysz);
 				s += SKIP_THth(n->suffix);
 				break;
 			case DCH_RM:
@@ -5161,3 +5145,63 @@ float8_to_char(PG_FUNCTION_ARGS)
 	NUM_TOCHAR_finish;
 	PG_RETURN_TEXT_P(result);
 }
+
+/*
+ * In case of different Inputs & different Year formats, engine itself
+ * has to decide the century for the given year. This function will take
+ * care of the input & accordingly append the century their if not provided.
+ *
+ *   Input 	=	Transformed output
+ * 0 ... 9 		= 2000 ... 2009
+ * 00 ... 69 	= 2000 ... 2069
+ * 70 ... 99 	= 1900 ... 1999
+ * 000 ... 999 	= 2000 ... 2999
+ *
+ * In case input is greater than 9999 it throws an error. The basic rule of
+ * the function is " If the format is lesser than the actual value, then it
+ * should throw an error. Only in case of 'YY', we need not throw an error,
+ * unless the year is later than 9999".
+ */
+
+static int DecideCentury(int *in, int len)
+{
+	if (len == 1)
+	{
+		if (*in > 0 && *in < 9)
+			*in += 2000;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_DATETIME_FORMAT),
+					 errmsg("date format picture ends before converting entire input string %d ", len)));
+	}
+	else if (len == 2)
+	{
+		if (*in > 0 && *in < 69)
+			*in += 2000;
+		else if (*in > 70 && *in < 99)
+			*in += 1900;
+		else if (*in > 9999)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_DATETIME_FORMAT),
+					 errmsg("date format picture ends before converting entire input string %d ", len)));
+
+	}
+	else if (len == 3)
+	{
+		if (*in > 0 && *in < 999)
+			*in += 2000;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_DATETIME_FORMAT),
+					 errmsg("date format picture ends before converting entire input string %d ", len)));
+	}
+	else if (len == 4)
+	{
+		if (*in > 9999)
+			ereport(ERROR,
+				(errcode(ERRCODE_INVALID_DATETIME_FORMAT),
+				 errmsg("date format picture ends before converting entire input string %d ", len)));
+	}
+
+	return 0;
+}
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to