Hi,

I recently once more noticed that timestamptz_out is really, really
slow. To quantify that, I created a bunch of similar sized tables:

CREATE TABLE tbl_timestamp AS SELECT NOW() FROM generate_series(1, 100000) a, 
generate_series(1, 100) b;
CREATE TABLE tbl_int8 AS SELECT 1::bigint FROM generate_series(1,  100000) a, 
generate_series(1, 100) b;
CREATE TABLE tbl_bytea AS SELECT '   '::bytea FROM generate_series(1, 100000) 
a, generate_series(1, 100) b;

These all end up being 346MB large.

COPY tbl_bytea TO '/dev/null';
Time: 1173.484 ms
COPY tbl_int8 TO '/dev/null';
Time: 1030.756 ms
COPY tbl_timestamp TO '/dev/null';
Time: 6598.030

(all best of three)

Yes, timestamp outputs more data as a whole, but being 5 times as slow
is still pretty darn insane. To make sure that's not the cause, here's
another table:

CREATE TABLE tbl_timestamptext AS SELECT NOW()::text FROM generate_series(1, 
100000) a, generate_series(1, 100) b;
COPY tbl_timestamptext TO '/dev/null';
Time: 1449.554 ms

So it's really just the timestamp code.


Profiling it shows:
  Overhead  Command         Shared Object     Symbol
  -   38.33%  postgres_stock  libc-2.19.so      [.] vfprintf
     - 97.92% vfprintf
          _IO_vsprintf
        - sprintf
           + 70.25% EncodeDateTime
           + 29.75% AppendSeconds.constprop.10
     + 1.11% _IO_default_xsputn
  -    8.22%  postgres_stock  libc-2.19.so      [.] _IO_default_xsputn
     - 99.43% _IO_default_xsputn
        - 90.09% vfprintf
             _IO_vsprintf
           - sprintf
              + 74.15% EncodeDateTime
              + 25.85% AppendSeconds.constprop.10
        + 9.72% _IO_padn
     + 0.57% vfprintf
  +   7.76%  postgres_stock  postgres_stock    [.] CopyOneRowTo   

So nearly all the time is spent somewhere inside the sprintf calls. Not
nice.

The only thing I could come up to make the sprintfs cheaper was to
combine them into one and remove some of the width specifiers that
aren't needed. That doesn't buy us very much.

I then proceeded to replace the sprintf call with hand-rolled
conversions. And wow, the benefit is far bigger than I'd assumed:
postgres[7236][1]=# COPY tbl_timestamp TO '/dev/null';
Time: 2430.521 ms

So, by hand-rolling the ISO conversion in EncodeDateTime() we got a
~250% performance improvement. I'd say that's worthwhile.

The attached patch shows what I did. While there's some polishing
possible, as a whole, it's pretty ugly. But I think timestamp data is so
common that it's worth the effort.

Does anybody have a fundamentally nicer idea than the attached to
improvide this?

Greetings,

Andres Freund
>From 1d7b6110f8864ee00c1fe4f54d8937347ade7d80 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 27 Jul 2015 23:09:33 +0200
Subject: [PATCH] WIP: faster timestamp[tz]_out

---
 src/backend/utils/adt/datetime.c        | 108 ++++++++++++++++++++++++++++++++
 src/test/regress/expected/horology.out  |  24 ++++---
 src/test/regress/expected/timestamp.out |  62 +++++++++++-------
 src/test/regress/sql/timestamp.sql      |   1 +
 4 files changed, 164 insertions(+), 31 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 2a44b6e..4c13f01 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3968,7 +3968,115 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 
 	switch (style)
 	{
+#ifdef HAVE_INT64_TIMESTAMP
+		case USE_ISO_DATES:
+			/*
+			 * Fastpath for most common format and range. Not using sprintf
+			 * provides significant performance benefits, and timestamp data
+			 * is pretty common. All sane use cases dealing with large amounts
+			 * of data use iso timestamps, so we can focus on optimizing
+			 * those, and keep the rest of the code maintainable.
+			 */
+			if (tm->tm_year > 0 && tm->tm_year < 10000)
+			{
+				int year  = (tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1);
+
+				*str++ = (year / 1000) + '0';
+				*str++ = (year / 100) % 10 + '0';
+				*str++ = (year / 10) % 10 + '0';
+				*str++ = year % 10 + '0';
+				*str++ = '-';
+				*str++ = (tm->tm_mon / 10) + '0';
+				*str++ = tm->tm_mon % 10 + '0';
+				*str++ = '-';
+				*str++ = (tm->tm_mday / 10) + '0';
+				*str++ = tm->tm_mday % 10 + '0';
+				*str++ = ' ';
+				*str++ = (tm->tm_hour / 10) + '0';
+				*str++ = tm->tm_hour % 10 + '0';
+				*str++ = ':';
+				*str++ = (tm->tm_min / 10) + '0';
+				*str++ = tm->tm_min % 10 + '0';
+				*str++ = ':';
+				*str++ = (tm->tm_sec / 10) + '0';
+				*str++ = tm->tm_sec % 10 + '0';
+
+				/*
+				 * Yes, this is darned ugly and would look nicer in a loop,
+				 * but some versions of gcc can't convert the divisions into
+				 * more efficient instructions unless manually unrolled.
+				 */
+				if (fsec != 0)
+				{
+					int fseca = abs(fsec);
+
+					*str++ = '.';
+
+					if (fseca % 1000000 != 0)
+					{
+						*str++ = (fseca / 100000) + '0';
+
+						if (fseca % 100000 != 0)
+						{
+							*str++ = ((fseca / 10000) % 10) + '0';
+
+							if (fseca % 10000 != 0)
+							{
+								*str++ = ((fseca / 1000) % 10) + '0';
+
+								if (fseca % 1000 != 0)
+								{
+									*str++ = ((fseca / 100) % 10) + '0';
+
+									if (fseca % 100 != 0)
+									{
+										*str++ = ((fseca / 10) % 10) + '0';
+
+										if (fseca % 10 != 0)
+										{
+											*str++ = (fseca % 10) + '0';
+										}
+									}
+								}
+							}
+						}
+					}
+				}
+
+				if (print_tz)
+				{
+					int			hour, min, sec;
+
+					sec = abs(tz);
+					min = sec / SECS_PER_MINUTE;
+					sec -= min * SECS_PER_MINUTE;
+					hour = min / MINS_PER_HOUR;
+					min -= hour * MINS_PER_HOUR;
+
+					*str++ = (tz <= 0 ? '+' : '-');
+
+					*str++ = (hour / 10) + '0';
+					*str++ = hour % 10 + '0';
+
+					if (min != 0)
+					{
+						*str++ = (min / 10) + '0';
+						*str++ = min % 10 + '0';
+					}
+
+					if (sec != 0)
+					{
+						*str++ = (sec / 10) + '0';
+						*str++ = sec % 10 + '0';
+					}
+				}
+				*str++ = 0;
+				break;
+			}
+			/* fall through for stuff not handled in the fast path */
+#else
 		case USE_ISO_DATES:
+#endif
 		case USE_XSD_DATES:
 			/* Compatible with ISO-8601 date formats */
 
diff --git a/src/test/regress/expected/horology.out b/src/test/regress/expected/horology.out
index 1fe02be..67b469b 100644
--- a/src/test/regress/expected/horology.out
+++ b/src/test/regress/expected/horology.out
@@ -531,7 +531,8 @@ SELECT '' AS "64", d1 + interval '1 year' AS one_year FROM TIMESTAMP_TBL;
     | Mon Jan 01 17:32:01 2001
     | Mon Dec 31 17:32:01 2001
     | Tue Jan 01 17:32:01 2002
-(65 rows)
+    | Tue Jan 01 17:32:01 12002
+(66 rows)
 
 SELECT '' AS "64", d1 - interval '1 year' AS one_year FROM TIMESTAMP_TBL;
  64 |          one_year           
@@ -601,7 +602,8 @@ SELECT '' AS "64", d1 - interval '1 year' AS one_year FROM TIMESTAMP_TBL;
     | Fri Jan 01 17:32:01 1999
     | Fri Dec 31 17:32:01 1999
     | Sat Jan 01 17:32:01 2000
-(65 rows)
+    | Sat Jan 01 17:32:01 12000
+(66 rows)
 
 SELECT timestamp with time zone '1996-03-01' - interval '1 second' AS "Feb 29";
             Feb 29            
@@ -2290,7 +2292,8 @@ SELECT '' AS "64", d1 AS us_postgres FROM TIMESTAMP_TBL;
     | Sat Jan 01 17:32:01 2000
     | Sun Dec 31 17:32:01 2000
     | Mon Jan 01 17:32:01 2001
-(65 rows)
+    | Mon Jan 01 17:32:01 12001
+(66 rows)
 
 SELECT '' AS seven, f1 AS us_postgres FROM ABSTIME_TBL;
  seven |         us_postgres          
@@ -2373,7 +2376,8 @@ SELECT '' AS "64", d1 AS us_iso FROM TIMESTAMP_TBL;
     | 2000-01-01 17:32:01
     | 2000-12-31 17:32:01
     | 2001-01-01 17:32:01
-(65 rows)
+    | 12001-01-01 17:32:01
+(66 rows)
 
 SELECT '' AS seven, f1 AS us_iso FROM ABSTIME_TBL;
  seven |         us_iso         
@@ -2462,7 +2466,8 @@ SELECT '' AS "64", d1 AS us_sql FROM TIMESTAMP_TBL;
     | 01/01/2000 17:32:01
     | 12/31/2000 17:32:01
     | 01/01/2001 17:32:01
-(65 rows)
+    | 01/01/12001 17:32:01
+(66 rows)
 
 SELECT '' AS seven, f1 AS us_sql FROM ABSTIME_TBL;
  seven |         us_sql          
@@ -2558,8 +2563,9 @@ SELECT '' AS "65", d1 AS european_postgres FROM TIMESTAMP_TBL;
     | Sat 01 Jan 17:32:01 2000
     | Sun 31 Dec 17:32:01 2000
     | Mon 01 Jan 17:32:01 2001
+    | Mon 01 Jan 17:32:01 12001
     | Thu 13 Jun 00:00:00 1957
-(66 rows)
+(67 rows)
 
 SELECT '' AS seven, f1 AS european_postgres FROM ABSTIME_TBL;
  seven |      european_postgres       
@@ -2648,8 +2654,9 @@ SELECT '' AS "65", d1 AS european_iso FROM TIMESTAMP_TBL;
     | 2000-01-01 17:32:01
     | 2000-12-31 17:32:01
     | 2001-01-01 17:32:01
+    | 12001-01-01 17:32:01
     | 1957-06-13 00:00:00
-(66 rows)
+(67 rows)
 
 SELECT '' AS seven, f1 AS european_iso FROM ABSTIME_TBL;
  seven |      european_iso      
@@ -2738,8 +2745,9 @@ SELECT '' AS "65", d1 AS european_sql FROM TIMESTAMP_TBL;
     | 01/01/2000 17:32:01
     | 31/12/2000 17:32:01
     | 01/01/2001 17:32:01
+    | 01/01/12001 17:32:01
     | 13/06/1957 00:00:00
-(66 rows)
+(67 rows)
 
 SELECT '' AS seven, f1 AS european_sql FROM ABSTIME_TBL;
  seven |      european_sql       
diff --git a/src/test/regress/expected/timestamp.out b/src/test/regress/expected/timestamp.out
index a092fc2..463484f 100644
--- a/src/test/regress/expected/timestamp.out
+++ b/src/test/regress/expected/timestamp.out
@@ -171,6 +171,7 @@ INSERT INTO TIMESTAMP_TBL VALUES ('Dec 31 17:32:01 1999');
 INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 2000');
 INSERT INTO TIMESTAMP_TBL VALUES ('Dec 31 17:32:01 2000');
 INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 2001');
+INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 12001');
 -- Currently unsupported syntax and ranges
 INSERT INTO TIMESTAMP_TBL VALUES ('Feb 16 17:32:01 -0097');
 ERROR:  time zone displacement out of range: "Feb 16 17:32:01 -0097"
@@ -248,7 +249,8 @@ SELECT '' AS "64", d1 FROM TIMESTAMP_TBL;
     | Sat Jan 01 17:32:01 2000
     | Sun Dec 31 17:32:01 2000
     | Mon Jan 01 17:32:01 2001
-(65 rows)
+    | Mon Jan 01 17:32:01 12001
+(66 rows)
 
 -- Demonstrate functions and operators
 SELECT '' AS "48", d1 FROM TIMESTAMP_TBL
@@ -304,7 +306,8 @@ SELECT '' AS "48", d1 FROM TIMESTAMP_TBL
     | Sat Jan 01 17:32:01 2000
     | Sun Dec 31 17:32:01 2000
     | Mon Jan 01 17:32:01 2001
-(49 rows)
+    | Mon Jan 01 17:32:01 12001
+(50 rows)
 
 SELECT '' AS "15", d1 FROM TIMESTAMP_TBL
    WHERE d1 < timestamp without time zone '1997-01-02';
@@ -402,7 +405,8 @@ SELECT '' AS "63", d1 FROM TIMESTAMP_TBL
     | Sat Jan 01 17:32:01 2000
     | Sun Dec 31 17:32:01 2000
     | Mon Jan 01 17:32:01 2001
-(64 rows)
+    | Mon Jan 01 17:32:01 12001
+(65 rows)
 
 SELECT '' AS "16", d1 FROM TIMESTAMP_TBL
    WHERE d1 <= timestamp without time zone '1997-01-02';
@@ -480,7 +484,8 @@ SELECT '' AS "49", d1 FROM TIMESTAMP_TBL
     | Sat Jan 01 17:32:01 2000
     | Sun Dec 31 17:32:01 2000
     | Mon Jan 01 17:32:01 2001
-(50 rows)
+    | Mon Jan 01 17:32:01 12001
+(51 rows)
 
 SELECT '' AS "54", d1 - timestamp without time zone '1997-01-02' AS diff
    FROM TIMESTAMP_TBL WHERE d1 BETWEEN '1902-01-01' AND '2038-01-01';
@@ -873,7 +878,8 @@ SELECT '' AS to_char_1, to_char(d1, 'DAY Day day DY Dy dy MONTH Month month RM M
            | SATURDAY  Saturday  saturday  SAT Sat sat JANUARY   January   january   I    JAN Jan jan
            | SUNDAY    Sunday    sunday    SUN Sun sun DECEMBER  December  december  XII  DEC Dec dec
            | MONDAY    Monday    monday    MON Mon mon JANUARY   January   january   I    JAN Jan jan
-(65 rows)
+           | MONDAY    Monday    monday    MON Mon mon JANUARY   January   january   I    JAN Jan jan
+(66 rows)
 
 SELECT '' AS to_char_2, to_char(d1, 'FMDAY FMDay FMday FMMONTH FMMonth FMmonth FMRM')
    FROM TIMESTAMP_TBL;
@@ -944,12 +950,13 @@ SELECT '' AS to_char_2, to_char(d1, 'FMDAY FMDay FMday FMMONTH FMMonth FMmonth F
            | SATURDAY Saturday saturday JANUARY January january I
            | SUNDAY Sunday sunday DECEMBER December december XII
            | MONDAY Monday monday JANUARY January january I
-(65 rows)
+           | MONDAY Monday monday JANUARY January january I
+(66 rows)
 
 SELECT '' AS to_char_3, to_char(d1, 'Y,YYY YYYY YYY YY Y CC Q MM WW DDD DD D J')
    FROM TIMESTAMP_TBL;
- to_char_3 |                     to_char                     
------------+-------------------------------------------------
+ to_char_3 |                      to_char                       
+-----------+----------------------------------------------------
            | 
            | 
            | 1,970 1970 970 70 0 20 1 01 01 001 01 5 2440588
@@ -1015,7 +1022,8 @@ SELECT '' AS to_char_3, to_char(d1, 'Y,YYY YYYY YYY YY Y CC Q MM WW DDD DD D J')
            | 2,000 2000 000 00 0 20 1 01 01 001 01 7 2451545
            | 2,000 2000 000 00 0 20 4 12 53 366 31 1 2451910
            | 2,001 2001 001 01 1 21 1 01 01 001 01 2 2451911
-(65 rows)
+           | 12,001 12001 001 01 1 121 1 01 01 001 01 2 6104336
+(66 rows)
 
 SELECT '' AS to_char_4, to_char(d1, 'FMY,YYY FMYYYY FMYYY FMYY FMY FMCC FMQ FMMM FMWW FMDDD FMDD FMD FMJ')
    FROM TIMESTAMP_TBL;
@@ -1086,7 +1094,8 @@ SELECT '' AS to_char_4, to_char(d1, 'FMY,YYY FMYYYY FMYYY FMYY FMY FMCC FMQ FMMM
            | 2,000 2000 0 0 0 20 1 1 1 1 1 7 2451545
            | 2,000 2000 0 0 0 20 4 12 53 366 31 1 2451910
            | 2,001 2001 1 1 1 21 1 1 1 1 1 2 2451911
-(65 rows)
+           | 12,001 12001 1 1 1 121 1 1 1 1 1 2 6104336
+(66 rows)
 
 SELECT '' AS to_char_5, to_char(d1, 'HH HH12 HH24 MI SS SSSS')
    FROM TIMESTAMP_TBL;
@@ -1157,7 +1166,8 @@ SELECT '' AS to_char_5, to_char(d1, 'HH HH12 HH24 MI SS SSSS')
            | 05 05 17 32 01 63121
            | 05 05 17 32 01 63121
            | 05 05 17 32 01 63121
-(65 rows)
+           | 05 05 17 32 01 63121
+(66 rows)
 
 SELECT '' AS to_char_6, to_char(d1, E'"HH:MI:SS is" HH:MI:SS "\\"text between quote marks\\""')
    FROM TIMESTAMP_TBL;
@@ -1228,7 +1238,8 @@ SELECT '' AS to_char_6, to_char(d1, E'"HH:MI:SS is" HH:MI:SS "\\"text between qu
            | HH:MI:SS is 05:32:01 "text between quote marks"
            | HH:MI:SS is 05:32:01 "text between quote marks"
            | HH:MI:SS is 05:32:01 "text between quote marks"
-(65 rows)
+           | HH:MI:SS is 05:32:01 "text between quote marks"
+(66 rows)
 
 SELECT '' AS to_char_7, to_char(d1, 'HH24--text--MI--text--SS')
    FROM TIMESTAMP_TBL;
@@ -1299,12 +1310,13 @@ SELECT '' AS to_char_7, to_char(d1, 'HH24--text--MI--text--SS')
            | 17--text--32--text--01
            | 17--text--32--text--01
            | 17--text--32--text--01
-(65 rows)
+           | 17--text--32--text--01
+(66 rows)
 
 SELECT '' AS to_char_8, to_char(d1, 'YYYYTH YYYYth Jth')
    FROM TIMESTAMP_TBL;
- to_char_8 |         to_char         
------------+-------------------------
+ to_char_8 |          to_char          
+-----------+---------------------------
            | 
            | 
            | 1970TH 1970th 2440588th
@@ -1370,12 +1382,13 @@ SELECT '' AS to_char_8, to_char(d1, 'YYYYTH YYYYth Jth')
            | 2000TH 2000th 2451545th
            | 2000TH 2000th 2451910th
            | 2001ST 2001st 2451911th
-(65 rows)
+           | 12001ST 12001st 6104336th
+(66 rows)
 
 SELECT '' AS to_char_9, to_char(d1, 'YYYY A.D. YYYY a.d. YYYY bc HH:MI:SS P.M. HH:MI:SS p.m. HH:MI:SS pm')
    FROM TIMESTAMP_TBL;
- to_char_9 |                               to_char                               
------------+---------------------------------------------------------------------
+ to_char_9 |                                to_char                                 
+-----------+------------------------------------------------------------------------
            | 
            | 
            | 1970 A.D. 1970 a.d. 1970 ad 12:00:00 A.M. 12:00:00 a.m. 12:00:00 am
@@ -1441,12 +1454,13 @@ SELECT '' AS to_char_9, to_char(d1, 'YYYY A.D. YYYY a.d. YYYY bc HH:MI:SS P.M. H
            | 2000 A.D. 2000 a.d. 2000 ad 05:32:01 P.M. 05:32:01 p.m. 05:32:01 pm
            | 2000 A.D. 2000 a.d. 2000 ad 05:32:01 P.M. 05:32:01 p.m. 05:32:01 pm
            | 2001 A.D. 2001 a.d. 2001 ad 05:32:01 P.M. 05:32:01 p.m. 05:32:01 pm
-(65 rows)
+           | 12001 A.D. 12001 a.d. 12001 ad 05:32:01 P.M. 05:32:01 p.m. 05:32:01 pm
+(66 rows)
 
 SELECT '' AS to_char_10, to_char(d1, 'IYYY IYY IY I IW IDDD ID')
    FROM TIMESTAMP_TBL;
- to_char_10 |        to_char         
-------------+------------------------
+ to_char_10 |         to_char         
+------------+-------------------------
             | 
             | 
             | 1970 970 70 0 01 004 4
@@ -1512,7 +1526,8 @@ SELECT '' AS to_char_10, to_char(d1, 'IYYY IYY IY I IW IDDD ID')
             | 1999 999 99 9 52 363 6
             | 2000 000 00 0 52 364 7
             | 2001 001 01 1 01 001 1
-(65 rows)
+            | 12001 001 01 1 01 001 1
+(66 rows)
 
 SELECT '' AS to_char_11, to_char(d1, 'FMIYYY FMIYY FMIY FMI FMIW FMIDDD FMID')
    FROM TIMESTAMP_TBL;
@@ -1583,7 +1598,8 @@ SELECT '' AS to_char_11, to_char(d1, 'FMIYYY FMIYY FMIY FMI FMIW FMIDDD FMID')
             | 1999 999 99 9 52 363 6
             | 2000 0 0 0 52 364 7
             | 2001 1 1 1 1 1 1
-(65 rows)
+            | 12001 1 1 1 1 1 1
+(66 rows)
 
 -- timestamp numeric fields constructor
 SELECT make_timestamp(2014,12,28,6,30,45.887);
diff --git a/src/test/regress/sql/timestamp.sql b/src/test/regress/sql/timestamp.sql
index b22cd48..b548c70 100644
--- a/src/test/regress/sql/timestamp.sql
+++ b/src/test/regress/sql/timestamp.sql
@@ -136,6 +136,7 @@ INSERT INTO TIMESTAMP_TBL VALUES ('Dec 31 17:32:01 1999');
 INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 2000');
 INSERT INTO TIMESTAMP_TBL VALUES ('Dec 31 17:32:01 2000');
 INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 2001');
+INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 12001');
 
 -- Currently unsupported syntax and ranges
 INSERT INTO TIMESTAMP_TBL VALUES ('Feb 16 17:32:01 -0097');
-- 
2.4.0.rc2.1.g3d6bc9a

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to