Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-23 Thread Peter Eisentraut
On sön, 2009-11-22 at 00:23 -0500, Tom Lane wrote:
 Roger Leigh rle...@codelibre.net writes:
  Attached is an updated patch with a couple of tweaks to ensure output
  is formatted and spaced correctly when border=0, which was off in the
  last patch.
 
 Applied wih minor editorialization.  Notably, I renamed the
 backwards-compatible option from ascii-old to old-ascii,
 because the original submission failed to preserve the documented
 behavior that the options could be abbreviated to one letter.

What is the plan behind keeping the old format?  Are we going to remove
it after one release if no one complains, or are we seriously expecting
that someone has code that actually parses this?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-23 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 What is the plan behind keeping the old format?  Are we going to remove
 it after one release if no one complains, or are we seriously expecting
 that someone has code that actually parses this?

Plan?  Do we need a plan?  The extra support consists of about two lines
of code and a small table, so there wouldn't be much harm in leaving
it there forever.

On the other hand it seems pretty likely that no one would care after a
release or two.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-21 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes:
 Attached is an updated patch with a couple of tweaks to ensure output
 is formatted and spaced correctly when border=0, which was off in the
 last patch.

Applied wih minor editorialization.  Notably, I renamed the
backwards-compatible option from ascii-old to old-ascii,
because the original submission failed to preserve the documented
behavior that the options could be abbreviated to one letter.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-15 Thread Roger Leigh
On Sun, Nov 15, 2009 at 12:50:14AM +, Roger Leigh wrote:
 On Sat, Nov 14, 2009 at 01:31:29PM -0500, Tom Lane wrote:
  Roger Leigh rle...@codelibre.net writes:
   The side effect from this change is that some of the testsuite
   expected data will need updating due to the extra pad spaces
  
  No, we are *not* doing that.  Somebody made a change to the print.c
  logic last year that started adding harmless white space to the
  last column, and it was a complete disaster for tracking whether
  anything important had changed in regression test output.  Please
  undo that part of your patch.
 
 No problem, done as requested.  I've attached an updated patch that
 takes care to exactly match the trailing whitespace the existing
 psql outputs.  This fixes most of the changes between observed and
 expected test results.

Attached is an updated patch with a couple of tweaks to ensure output
is formatted and spaced correctly when border=0, which was off in the
last patch.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 7f03802..4b3fe71 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1765,18 +1765,40 @@ lo_import 152801
   listitem
   para
   Sets the border line drawing style to one
-  of literalascii/literal or literalunicode/literal.
-  Unique abbreviations are allowed.  (That would mean one
-  letter is enough.)
+  of literalascii/literal, literalascii-old/literal
+  or literalunicode/literal.  Unique abbreviations are
+  allowed.  (That would mean one letter is enough.)
   /para
 
   para
-  quoteASCII/quote uses plain acronymASCII/acronym characters.
+  quoteASCII/quote uses plain acronymASCII/acronym
+  characters.  Newlines in data are shown using
+  a literal+/literal symbol in the right-hand margin,
+  while wrapped data uses a literal./literal symbol in the
+  right-hand margin of a wrapped line, and in the left-hand
+  margin of the following continuation line.
   /para
 
   para
+  quoteASCII-old/quote uses plain acronymASCII/acronym
+  characters, using the formatting style used
+  for productnamePostgreSQL/productname 8.4 and earlier.
+  Newlines in data are shown using a literal:/literal
+  symbol in place of the left-hand column separator, while
+  wrapped data uses a literal;/literal symbol.  Newlines
+  in column headings are indicated by a literal+/literal
+  symbol in the left-hand margin of additional lines.
+	  /para
+
+  para
   quoteUnicode/quote uses Unicode box-drawing characters.
-  /para
+	  Newlines in data are shown using a carriage return symbol
+	  (literal#8629;/literal) in the right-hand margin.
+	  Wrapped data uses an ellipsis symbol
+	  (literal#8230;/literal) in the right-hand margin of a
+	  wrapped line, and in the left-hand margin of the following
+	  continuation line.
+	  /para
 
   para
   When the selected output format is one that draws lines or boxes
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 190a8d3..544a677 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1795,11 +1795,13 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			;
 		else if (pg_strncasecmp(ascii, value, vallen) == 0)
 			popt-topt.line_style = pg_asciiformat;
+		else if (pg_strncasecmp(ascii-old, value, vallen) == 0)
+			popt-topt.line_style = pg_asciiformat_old;
 		else if (pg_strncasecmp(unicode, value, vallen) == 0)
 			popt-topt.line_style = pg_utf8format;
 		else
 		{
-			psql_error(\\pset: allowed line styles are ascii, unicode\n);
+			psql_error(\\pset: allowed line styles are ascii, ascii-old, unicode\n);
 			return false;
 		}
 
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 026e043..5d1c8d4 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -45,9 +45,9 @@ static char *grouping;
 static char *thousands_sep;
 
 /* Line style control structures */
-const printTextFormat pg_asciiformat =
+const printTextFormat pg_asciiformat_old =
 {
-	ascii,
+	ascii-old,
 	{
 		{ -, +, +, + },
 		{ -, +, +, + },
@@ -56,7 +56,36 @@ const printTextFormat pg_asciiformat =
 	},
 	:,
 	;,
-	 
+	 ,
+	+,
+	 ,
+	 ,
+	 ,
+	 ,
+	 ,
+	false
+};
+
+/* Line style control structures */
+const printTextFormat pg_asciiformat =
+{
+	ascii,
+	{
+		{ -, +, +, + },
+		{ -, +, +, + },
+		{ -, +, +, + },
+		{ ,  |, |, | }
+	},
+	|,
+	|,
+	|,
+	 ,
+	+,
+	 ,
+	+,
+	.,
+	.,
+	true
 };
 
 const printTextFormat pg_utf8format =
@@ -72,12 +101,23 @@ const 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-14 Thread Roger Leigh
On Mon, Nov 09, 2009 at 05:40:54PM -0500, Bruce Momjian wrote:
 Tom Lane wrote:
  Greg Stark gsst...@mit.edu writes:
   While i agree this looks nicer I wonder what it does to things like
   excel/gnumeric/ooffice auto-recognizing table layouts and importing
   files. I'm not sure our old format was so great for this so maybe this
   is actually an improvement I'm asking for.
  
  Yeah.  We can do what we like with the UTF8 format but I'm considerably
  more worried about the aspect of making random changes to the
  plain-ASCII output.  On the other hand, we changed that just a release
  or so ago (to put in the multiline output in the first place) and
  I didn't hear complaints about it that time.
 
 Sorry for the delayed reply:
 
 The line continuation characters were chosen in 8.4 for clarity --- if
 you have found something clearer for 8.5, let's make the improvement.  I
 think clarity is more important in this area than consistency with the
 previous psql output format.

The attached patch is proposed for the upcoming commitfest, and
hopefully takes into account the comments made in this thread.
To summarise the changes:

- it makes the handling of newlines and wrapped lines consistent
  between column header and data lines.
- it includes additional logic such that both the old and new
  styles are representable using the format structures, so we
  can retain backward compatibility if so desired (it's easy to
  remove if not).
- an ascii-old linestyle is added which is identical to the old
  style for those who need guaranteed reproducibility of output,
  but this is not the default.
- The Unicode format uses ↵ in the right-hand margin to indicate
  newlines.  Wrapped lines use … in the right-hand margin before,
  and left-hand margin after, a break (so you can visually follow
  the wrap).
- The ASCII format is the same but uses + and . in place of
  carriage return and ellipsis symbols.
- All the above is documented in the SGML documentation, including
  the old style, which I always found confusing.

For comparison, I've included a transcript of all three linestyles
with a test script (also attached).

Any changes to the symbols used and/or their placement are trivially
made by just altering the format in print.c.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 7f03802..4600303 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1765,18 +1765,40 @@ lo_import 152801
   listitem
   para
   Sets the border line drawing style to one
-  of literalascii/literal or literalunicode/literal.
-  Unique abbreviations are allowed.  (That would mean one
-  letter is enough.)
+  of literalascii/literal, literalascii-old/literal
+  or literalunicode/literal.  Unique abbreviations are
+  allowed.  (That would mean one letter is enough.)
   /para
 
   para
-  quoteASCII/quote uses plain acronymASCII/acronym characters.
+  quoteASCII/quote uses plain acronymASCII/acronym
+  characters.  Newlines in data are shown using
+  a literal+/literal symbol in the right-hand margin.
+  Wrapped data uses a literal./literal symbol in the
+  right-hand margin of a wrapped line, and in the left-hand
+  margin of the following continuation line.
   /para
 
   para
+  quoteASCII-old/quote uses plain acronymASCII/acronym
+  characters, using the formatting style used
+  for productnamePostgreSQL/productname 8.4 and earlier.
+  Newlines in data are shown using a literal:/literal
+  symbol in place of the left-hand column separator, while
+  wrapped data uses a literal;/literal symbol.  Newlines
+  in column headings are indicated by a literal+/literal
+  symbol in the left-hand margin of additional lines.
+	  /para
+
+  para
   quoteUnicode/quote uses Unicode box-drawing characters.
-  /para
+	  Newlines in data are shown using a carriage return symbol
+	  (literal#8629;/literal) in the right-hand margin.
+	  Wrapped data uses an ellipsis symbol
+	  (literal#8230;/literal) in the right-hand margin of a
+	  wrapped line, and in the left-hand margin of the following
+	  continuation line.
+	  /para
 
   para
   When the selected output format is one that draws lines or boxes
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 190a8d3..544a677 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1795,11 +1795,13 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			;
 		else if (pg_strncasecmp(ascii, 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-14 Thread Roger Leigh
On Sat, Nov 14, 2009 at 05:40:24PM +, Roger Leigh wrote:
 On Mon, Nov 09, 2009 at 05:40:54PM -0500, Bruce Momjian wrote:
  Tom Lane wrote:
   Greg Stark gsst...@mit.edu writes:
While i agree this looks nicer I wonder what it does to things like
excel/gnumeric/ooffice auto-recognizing table layouts and importing
files. I'm not sure our old format was so great for this so maybe this
is actually an improvement I'm asking for.
   
   Yeah.  We can do what we like with the UTF8 format but I'm considerably
   more worried about the aspect of making random changes to the
   plain-ASCII output.  On the other hand, we changed that just a release
   or so ago (to put in the multiline output in the first place) and
   I didn't hear complaints about it that time.
  
  Sorry for the delayed reply:
  
  The line continuation characters were chosen in 8.4 for clarity --- if
  you have found something clearer for 8.5, let's make the improvement.  I
  think clarity is more important in this area than consistency with the
  previous psql output format.
 
 The attached patch is proposed for the upcoming commitfest, and
 hopefully takes into account the comments made in this thread.

One note: because it's not possible to know in advance (without
making things much more complex) whether or not a line will wrap
or continue, the code currently makes sure we fully pad output
up to the right margin of the table (finalspaces).  This is in
some cases unnecessary, but is needed when border=1 or else the
continuation/wrap symbols don't end up in the margin and will
look like they are part of the data.

The side effect from this change is that some of the testsuite
expected data will need updating due to the extra pad spaces
now added to the output (triggers, dependency, tsearch,
foreign_data, prepare, with).  If the actual output format is
OK with people then I'll update the patch to include the test
data updates as well.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.


signature.asc
Description: Digital signature


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-14 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes:
 The side effect from this change is that some of the testsuite
 expected data will need updating due to the extra pad spaces

No, we are *not* doing that.  Somebody made a change to the print.c
logic last year that started adding harmless white space to the
last column, and it was a complete disaster for tracking whether
anything important had changed in regression test output.  Please
undo that part of your patch.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-14 Thread Roger Leigh
On Sat, Nov 14, 2009 at 01:31:29PM -0500, Tom Lane wrote:
 Roger Leigh rle...@codelibre.net writes:
  The side effect from this change is that some of the testsuite
  expected data will need updating due to the extra pad spaces
 
 No, we are *not* doing that.  Somebody made a change to the print.c
 logic last year that started adding harmless white space to the
 last column, and it was a complete disaster for tracking whether
 anything important had changed in regression test output.  Please
 undo that part of your patch.

No problem, done as requested.  I've attached an updated patch that
takes care to exactly match the trailing whitespace the existing
psql outputs.  This fixes most of the changes between observed and
expected test results.

Due to the fact that this patch does alter the output for newlines
and wrapped lines (being its intent), the patch does alter expected
testcase output for these specific cases.  Because the old
wrap/newline code put : and ; in place of | between columns,
this meant that it never worked for the first column of data, which
included single column result sets.  This necessitated some changes
to the expected results to reflect this change (which now makes the
output uniform for all columns, irrespective of position).


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 7f03802..4b3fe71 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1765,18 +1765,40 @@ lo_import 152801
   listitem
   para
   Sets the border line drawing style to one
-  of literalascii/literal or literalunicode/literal.
-  Unique abbreviations are allowed.  (That would mean one
-  letter is enough.)
+  of literalascii/literal, literalascii-old/literal
+  or literalunicode/literal.  Unique abbreviations are
+  allowed.  (That would mean one letter is enough.)
   /para
 
   para
-  quoteASCII/quote uses plain acronymASCII/acronym characters.
+  quoteASCII/quote uses plain acronymASCII/acronym
+  characters.  Newlines in data are shown using
+  a literal+/literal symbol in the right-hand margin,
+  while wrapped data uses a literal./literal symbol in the
+  right-hand margin of a wrapped line, and in the left-hand
+  margin of the following continuation line.
   /para
 
   para
+  quoteASCII-old/quote uses plain acronymASCII/acronym
+  characters, using the formatting style used
+  for productnamePostgreSQL/productname 8.4 and earlier.
+  Newlines in data are shown using a literal:/literal
+  symbol in place of the left-hand column separator, while
+  wrapped data uses a literal;/literal symbol.  Newlines
+  in column headings are indicated by a literal+/literal
+  symbol in the left-hand margin of additional lines.
+	  /para
+
+  para
   quoteUnicode/quote uses Unicode box-drawing characters.
-  /para
+	  Newlines in data are shown using a carriage return symbol
+	  (literal#8629;/literal) in the right-hand margin.
+	  Wrapped data uses an ellipsis symbol
+	  (literal#8230;/literal) in the right-hand margin of a
+	  wrapped line, and in the left-hand margin of the following
+	  continuation line.
+	  /para
 
   para
   When the selected output format is one that draws lines or boxes
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 190a8d3..544a677 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1795,11 +1795,13 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			;
 		else if (pg_strncasecmp(ascii, value, vallen) == 0)
 			popt-topt.line_style = pg_asciiformat;
+		else if (pg_strncasecmp(ascii-old, value, vallen) == 0)
+			popt-topt.line_style = pg_asciiformat_old;
 		else if (pg_strncasecmp(unicode, value, vallen) == 0)
 			popt-topt.line_style = pg_utf8format;
 		else
 		{
-			psql_error(\\pset: allowed line styles are ascii, unicode\n);
+			psql_error(\\pset: allowed line styles are ascii, ascii-old, unicode\n);
 			return false;
 		}
 
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 026e043..d382458 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -45,9 +45,9 @@ static char *grouping;
 static char *thousands_sep;
 
 /* Line style control structures */
-const printTextFormat pg_asciiformat =
+const printTextFormat pg_asciiformat_old =
 {
-	ascii,
+	ascii-old,
 	{
 		{ -, +, +, + },
 		{ -, +, +, + },
@@ -56,7 +56,36 @@ const printTextFormat pg_asciiformat =
 	},
 	:,
 	;,
-	 
+	 ,
+	+,
+	 ,
+	 ,
+	 ,
+	 ,
+	 ,
+	false
+};
+
+/* Line 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-09 Thread Bruce Momjian
Tom Lane wrote:
 Greg Stark gsst...@mit.edu writes:
  While i agree this looks nicer I wonder what it does to things like
  excel/gnumeric/ooffice auto-recognizing table layouts and importing
  files. I'm not sure our old format was so great for this so maybe this
  is actually an improvement I'm asking for.
 
 Yeah.  We can do what we like with the UTF8 format but I'm considerably
 more worried about the aspect of making random changes to the
 plain-ASCII output.  On the other hand, we changed that just a release
 or so ago (to put in the multiline output in the first place) and
 I didn't hear complaints about it that time.

Sorry for the delayed reply:

The line continuation characters were chosen in 8.4 for clarity --- if
you have found something clearer for 8.5, let's make the improvement.  I
think clarity is more important in this area than consistency with the
previous psql output format.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-11-09 Thread Alvaro Herrera
Andrew Dunstan wrote:

 You can set the field separator to ',' but you can't do a
   \pset format csv
 and get CSV with correct quoting, escaping etc AFAICS.  It'll
 still break on line wrapping if wrapping is enabled, and with
 newlines in the data.
 
 If that would be a useful addition, I can add it.
 
 It's done by the backend, not by psql, but it has psql support - see \copy

I agree with Roger that we should really have a CSV option in \pset format.
COPY as CVS is certainly very useful, but it's a different use case.
(You can't \copy a \-command for example).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Roger Leigh
On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote:
 Greg Stark gsst...@mit.edu writes:
  While i agree this looks nicer I wonder what it does to things like
  excel/gnumeric/ooffice auto-recognizing table layouts and importing
  files. I'm not sure our old format was so great for this so maybe this
  is actually an improvement I'm asking for.
 
 Yeah.  We can do what we like with the UTF8 format but I'm considerably
 more worried about the aspect of making random changes to the
 plain-ASCII output.  On the other hand, we changed that just a release
 or so ago (to put in the multiline output in the first place) and
 I didn't hear complaints about it that time.

The attached updated patch makes sure that the ASCII display remains
the same (bar trailing whitespace used for padding).

I'm tempted to add an additional ascii format such as ascii-clean
which cleans up the inconsistencies in the formatting as for the
unicode format, while the existing ascii format would remain the
default for backwards compatibility.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 026e043..3f61293 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -56,7 +56,14 @@ const printTextFormat pg_asciiformat =
 	},
 	:,
 	;,
-	 
+	 ,
+	+,
+	 ,
+	 ,
+	 ,
+	 ,
+	 ,
+	false
 };
 
 const printTextFormat pg_utf8format =
@@ -72,12 +79,23 @@ const printTextFormat pg_utf8format =
 		/* N/A, │, │, │ */
 		{ , \342\224\202, \342\224\202, \342\224\202 }
 	},
-	/* ╎ */
-	\342\225\216,
-	/* ┊ */
-	\342\224\212,
-	/* ╷ */
-	\342\225\267
+	/* │ */
+	\342\224\202,
+	/* │ */
+	\342\224\202,
+	/* │ */
+	\342\224\202,
+	 ,
+	/* ↵ */
+	\342\206\265,
+	 ,
+	/* ↵ */
+	\342\206\265,
+	/* … */
+	\342\200\246,
+	/* … */
+	\342\200\246,
+	true
 };
 
 
@@ -479,6 +497,8 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 	int			output_columns = 0;		/* Width of interactive console */
 	bool		is_pager = false;
 
+	printTextLineWrap *wrap;
+
 	if (cancel_pressed)
 		return;
 
@@ -499,6 +519,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 		format_buf = pg_local_calloc(col_count, sizeof(*format_buf));
 		header_done = pg_local_calloc(col_count, sizeof(*header_done));
 		bytes_output = pg_local_calloc(col_count, sizeof(*bytes_output));
+		wrap = pg_local_calloc(col_count, sizeof(*wrap));
 	}
 	else
 	{
@@ -513,6 +534,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 		format_buf = NULL;
 		header_done = NULL;
 		bytes_output = NULL;
+		wrap = NULL;
 	}
 
 	/* scan all column headers, find maximum width and max max_nl_lines */
@@ -575,7 +597,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 
 	/* adjust the total display width based on border style */
 	if (opt_border == 0)
-		width_total = col_count - 1;
+		width_total = col_count;
 	else if (opt_border == 1)
 		width_total = col_count * 3 - 1;
 	else
@@ -770,16 +792,16 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 			while (more_col_wrapping)
 			{
 if (opt_border == 2)
-	fprintf(fout, %s%c, dformat-leftvrule,
-			curr_nl_line ? '+' : ' ');
-else if (opt_border == 1)
-	fputc(curr_nl_line ? '+' : ' ', fout);
+	fputs(dformat-leftvrule, fout);
 
 for (i = 0; i  cont-ncolumns; i++)
 {
 	struct lineptr *this_line = col_lineptrs[i] + curr_nl_line;
 	unsigned int nbspace;
 
+	if (opt_border != 0 || (format-cont_right_border == false  i  0))
+		fputs(curr_nl_line ? format-header_cont_left :  , fout);
+
 	if (!header_done[i])
 	{
 		nbspace = width_wrap[i] - this_line-width;
@@ -796,21 +818,17 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 	}
 	else
 		fprintf(fout, %*s, width_wrap[i], );
-	if (i  col_count - 1)
-	{
-		if (opt_border == 0)
-			fputc(curr_nl_line ? '+' : ' ', fout);
-		else
-			fprintf(fout,  %s%c, dformat-midvrule,
-	curr_nl_line ? '+' : ' ');
-	}
+
+	if (opt_border != 0 || format-cont_right_border == true)
+		fputs(!header_done[i] ? format-header_cont_right :  , fout);
+
+	if (opt_border != 0  i  col_count - 1)
+		fputs(dformat-midvrule, fout);
 }
 curr_nl_line++;
 
 if (opt_border == 2)
-	fprintf(fout,  %s, dformat-rightvrule);
-else if (opt_border == 1)
-	fputc(' ', fout);
+	fputs(dformat-rightvrule, fout);
 fputc('\n', fout);
 			}
 
@@ -861,19 +879,28 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 
 			/* left border */
 			if (opt_border == 2)
-fprintf(fout, %s , dformat-leftvrule);
-			else if (opt_border == 1)
-fputc(' ', fout);
+fputs(dformat-leftvrule, fout);
 
 			/* for each column */
 			for (j = 0; j  col_count; j++)
 			{
 		

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Greg Stark
On Mon, Oct 26, 2009 at 11:43 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Mon, 2009-10-26 at 10:12 -0700, Greg Stark wrote:
 While i agree this looks nicer I wonder what it does to things like
 excel/gnumeric/ooffice auto-recognizing table layouts and importing
 files. I'm not sure our old format was so great for this so maybe this
 is actually an improvement I'm asking for. But as long as we're
 changing the format... It would at at least be good to test the
 behaviour

 What exactly are you referring to here?

run something like this:

$ psql
stark= \o /tmp/s
stark= select generate_series(1,10),generate_series(1,5);
$ gnumeric /tmp/s
$ ooffice /tmp/s
$ kspread /tmp/s

With the 8.4 formatting gnumeric automatically guesses that | is the
separator and formats the speadsheet quite reasonably. Open Office
gets confused and opens the word processor, but if you do insert
sheet from file and manually deselect the space and semicolon
delimiters and put | as an other delimiter then it looks like it
should work. I don't have kspread handy.

Does gnumeric still autorecognize the new formats? Do the newline
indicators in 8.4 mess up gnumeric? Are the new ones better or worse?

This hasn't been a top priority in the past and the ReST discussion
seemed to end up concluding that we shouldn't bother if we can't make
it perfect. I'm not sure I agree with that, but in any case I think as
long as we're changing the format we may as well check to see what the
status is.



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Roger Leigh
On Sat, Oct 31, 2009 at 05:11:10AM -0700, Greg Stark wrote:
 On Mon, Oct 26, 2009 at 11:43 PM, Peter Eisentraut pete...@gmx.net wrote:
  On Mon, 2009-10-26 at 10:12 -0700, Greg Stark wrote:
  While i agree this looks nicer I wonder what it does to things like
  excel/gnumeric/ooffice auto-recognizing table layouts and importing
  files. I'm not sure our old format was so great for this so maybe this
  is actually an improvement I'm asking for. But as long as we're
  changing the format... It would at at least be good to test the
  behaviour
 
  What exactly are you referring to here?
 
 run something like this:
 
 $ psql
 stark= \o /tmp/s
 stark= select generate_series(1,10),generate_series(1,5);
 $ gnumeric /tmp/s
 $ ooffice /tmp/s
 $ kspread /tmp/s
 
 With the 8.4 formatting gnumeric automatically guesses that | is the
 separator and formats the speadsheet quite reasonably. Open Office
 gets confused and opens the word processor, but if you do insert
 sheet from file and manually deselect the space and semicolon
 delimiters and put | as an other delimiter then it looks like it
 should work. I don't have kspread handy.
 
 Does gnumeric still autorecognize the new formats? Do the newline
 indicators in 8.4 mess up gnumeric? Are the new ones better or worse?
 
 This hasn't been a top priority in the past and the ReST discussion
 seemed to end up concluding that we shouldn't bother if we can't make
 it perfect. I'm not sure I agree with that, but in any case I think as
 long as we're changing the format we may as well check to see what the
 status is.

Surely if people want a machine-readable output format, they should
either

1) use libpq or one of its bindings, or
2) use a dedicated machine-readable output format such as CSV, which
   is /designed/ for spreadsheet import.

The standard psql output formats (aligned, unaligned) are for
human-readable output and the others (latex, html, troff-ms) are
marked up for the respective tools.  None of these are really
useful for other programs to parse.

Wouldn't it be much simpler all around to add a csv output format
in addition to the above for this purpose?  Spreadsheets can read
it in with no trouble at all.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.


signature.asc
Description: Digital signature


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Andrew Dunstan



Roger Leigh wrote:


Wouldn't it be much simpler all around to add a csv output format
in addition to the above for this purpose?  Spreadsheets can read
it in with no trouble at all.



  


We've had CSV output since version 8.0.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Roger Leigh
On Sat, Oct 31, 2009 at 12:25:22PM -0400, Andrew Dunstan wrote:


 Roger Leigh wrote:

 Wouldn't it be much simpler all around to add a csv output format
 in addition to the above for this purpose?  Spreadsheets can read
 it in with no trouble at all.

 We've had CSV output since version 8.0.

Really?  The only references I see are in tab-complete.c relating to
COPY.

You can set the field separator to ',' but you can't do a
  \pset format csv
and get CSV with correct quoting, escaping etc AFAICS.  It'll
still break on line wrapping if wrapping is enabled, and with
newlines in the data.

If that would be a useful addition, I can add it.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.


signature.asc
Description: Digital signature


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-31 Thread Andrew Dunstan



Roger Leigh wrote:

On Sat, Oct 31, 2009 at 12:25:22PM -0400, Andrew Dunstan wrote:
  

Roger Leigh wrote:


Wouldn't it be much simpler all around to add a csv output format
in addition to the above for this purpose?  Spreadsheets can read
it in with no trouble at all.
  

We've had CSV output since version 8.0.



Really?  The only references I see are in tab-complete.c relating to
COPY.

You can set the field separator to ',' but you can't do a
  \pset format csv
and get CSV with correct quoting, escaping etc AFAICS.  It'll
still break on line wrapping if wrapping is enabled, and with
newlines in the data.

If that would be a useful addition, I can add it.



  


It's done by the backend, not by psql, but it has psql support - see \copy

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-27 Thread Peter Eisentraut
On Mon, 2009-10-26 at 10:12 -0700, Greg Stark wrote:
 While i agree this looks nicer I wonder what it does to things like
 excel/gnumeric/ooffice auto-recognizing table layouts and importing
 files. I'm not sure our old format was so great for this so maybe this
 is actually an improvement I'm asking for. But as long as we're
 changing the format... It would at at least be good to test the
 behaviour

What exactly are you referring to here?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-27 Thread Roger Leigh
On Mon, Oct 26, 2009 at 11:33:40PM +, Roger Leigh wrote:
 On Mon, Oct 26, 2009 at 07:19:24PM -0400, Tom Lane wrote:
  Roger Leigh rle...@codelibre.net writes:
   On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote:
   Yeah.  We can do what we like with the UTF8 format but I'm considerably
   more worried about the aspect of making random changes to the
   plain-ASCII output.
  
   I checked (using strace)
   gnumeric (via libgda and gnome-database-properties)
   openoffice (oobase)
  
  Even if that were the entire universe of programs we cared about,
  whether their internal ODBC logic goes through psql isn't really
  the point here.  What I'm worried about is somebody piping the
  text output of psql into another program.
  
   On a related note, there's something odd with the pager code.
  
  Hm, what platform are you testing that on?
 
 Debian GNU/Linux (unstable)
 linux2.6.30
 eglibc   2.10.1
 libreadline6 6.0.5
 libncurses5  5.7
 gcc  4.3.4
 
 This is the trace of the broken write:
 
 16206 write(1,   Name   \342\224\202  Owner   \342\224..., 102) = 
 102
 16206 write(1, 
 \342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342
 \224..., 256) = 256
 16206 write(1, 
 \224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\n, 18) 
 = 18

Further tracing showed this to be a bug in the util-linux version of
more which had a static 256 byte line buffer.  The above was a red
herring--it's writing to a pipe.  I've sent a patch to fix this by
increasing the buffer size.


-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.


signature.asc
Description: Digital signature


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Peter Eisentraut
On sön, 2009-10-25 at 23:48 +, Roger Leigh wrote:
 Just for reference, this is what the output looks like (abridged)
 using the attached patch.  Should display fine if your mail client handles
 UTF-8 messages correctly:
 
 rleigh=# \l
  List of databases
   Name   │  Owner   │ Encoding │  Collation  │Ctype│   Access 
 privileges
 ─┼──┼──┼─┼─┼───
  merkelpb│ rleigh   │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │
  postgres│ postgres │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │
  projectb│ rleigh   │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │
  rleigh  │ rleigh   │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │
 […]
  template0   │ postgres │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │ 
 =c/postgres  ↵
  │  │  │ │ │ 
 postgres=CTc/postgres
  template1   │ postgres │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │ 
 =c/postgres  ↵
  │  │  │ │ │ 
 postgres=CTc/postgres
 […]
 (17 rows)

That's pretty much what I had in mind.  Cool.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Greg Stark
2009/10/25 Roger Leigh rle...@codelibre.net:
 rleigh=# \l
                                     List of databases
      Name       │  Owner   │ Encoding │  Collation  │    Ctype    │   Access 
 privileges
 ─┼──┼──┼─┼─┼───
  merkelpb        │ rleigh   │ UTF8     │ en_GB.UTF-8 │ en_GB.UTF-8 │
  postgres        │ postgres │ UTF8     │ en_GB.UTF-8 │ en_GB.UTF-8 │
  projectb        │ rleigh   │ UTF8     │ en_GB.UTF-8 │ en_GB.UTF-8 │
  rleigh          │ rleigh   │ UTF8     │ en_GB.UTF-8 │ en_GB.UTF-8 │
 […]
  template0       │ postgres │ UTF8     │ en_GB.UTF-8 │ en_GB.UTF-8 │ 
 =c/postgres          ↵
                 │          │          │             │             │ 
 postgres=CTc/postgres
  template1       │ postgres │ UTF8     │ en_GB.UTF-8 │ en_GB.UTF-8 │ 
 =c/postgres          ↵
                 │          │          │             │             │ 
 postgres=CTc/postgres
 […]
 (17 rows)



While i agree this looks nicer I wonder what it does to things like
excel/gnumeric/ooffice auto-recognizing table layouts and importing
files. I'm not sure our old format was so great for this so maybe this
is actually an improvement I'm asking for. But as long as we're
changing the format... It would at at least be good to test the
behaviour



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 While i agree this looks nicer I wonder what it does to things like
 excel/gnumeric/ooffice auto-recognizing table layouts and importing
 files. I'm not sure our old format was so great for this so maybe this
 is actually an improvement I'm asking for.

Yeah.  We can do what we like with the UTF8 format but I'm considerably
more worried about the aspect of making random changes to the
plain-ASCII output.  On the other hand, we changed that just a release
or so ago (to put in the multiline output in the first place) and
I didn't hear complaints about it that time.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Roger Leigh
On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote:
 Greg Stark gsst...@mit.edu writes:
  While i agree this looks nicer I wonder what it does to things like
  excel/gnumeric/ooffice auto-recognizing table layouts and importing
  files. I'm not sure our old format was so great for this so maybe this
  is actually an improvement I'm asking for.
 
 Yeah.  We can do what we like with the UTF8 format but I'm considerably
 more worried about the aspect of making random changes to the
 plain-ASCII output.  On the other hand, we changed that just a release
 or so ago (to put in the multiline output in the first place) and
 I didn't hear complaints about it that time.

I checked (using strace)

gnumeric (via libgda and gnome-database-properties)
openoffice (oobase)

Both spreadsheets require a connection to be set up first for them to
use as a handle, so I did that and traced from there.  Neither made
any use of psql; they both appear to use libpq via their respective
database abstraction libs--no forking of any children observed.

Excel is a bit tougher, I bought my first copy last week for other
reasons, but I lack both windows expertise and debugging tools to trace
things, and I also dual boot my computer with the postgres install on
the Linux partition, making connecting to the database rather hard!  I
think someone else is better suited to check this one!


On a related note, there's something odd with the pager code.  The output
of \l with the pager off:

rleigh=# \l
 List of databases
  Name   │  Owner   │ Encoding │  Collation  │Ctype│   Access 
privileges
─┼──┼──┼─┼─┼───
[...]

(header line is 91 characters, 273 bytes)

And with the pager on:

rleigh=# \l
 List of databases
  Name   │  Owner   │ Encoding │  Collation  │Ctype│   Access 
privileges   
─┼──┼──┼─┼─┼─
  
��─
[...]

(longest header line 85 characters, 255 bytes, 256 bytes inc. LF,
remainder on second line)

Note that the pager wasn't required and so wasn't actually invoked, but
the output was corrupted.  A newline was inserted almost at the end of
the line and the continuation lacks a leading \342 which (since these
UTF-8 codes are all three-byte) leads to two bytes which are invalid
UTF-8.  Since this spurious newline got inserted exactly on a 256 byte
boundary, I was wondering if there was some buffer either internal to
psql or in the termios/pty layer that was getting flushed.  It also
lost the first byte of the second line (possibly swapped for the \n).

Another wierdness: it only happens if the terminal width is  85
columns wide, otherwise it just wraps around as one would expect!
AFAICT there are no 255/256 length buffers in the code, and the code
doing the printing is just doing stdio to fout which is either stdout
or a pipe!  Because of this, I can't see how the spurious \n appears
in the middle of a simple loop.  If border=2, you'll see this for all
top mid and bottom ruled lines.

I do see strace showing some termios fiddling, could that be at fault
or is that just readline ncurses initialisation?


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.


signature.asc
Description: Digital signature


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes:
 On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote:
 Yeah.  We can do what we like with the UTF8 format but I'm considerably
 more worried about the aspect of making random changes to the
 plain-ASCII output.

 I checked (using strace)
 gnumeric (via libgda and gnome-database-properties)
 openoffice (oobase)

Even if that were the entire universe of programs we cared about,
whether their internal ODBC logic goes through psql isn't really
the point here.  What I'm worried about is somebody piping the
text output of psql into another program.

 On a related note, there's something odd with the pager code.

Hm, what platform are you testing that on?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-26 Thread Roger Leigh
On Mon, Oct 26, 2009 at 07:19:24PM -0400, Tom Lane wrote:
 Roger Leigh rle...@codelibre.net writes:
  On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote:
  Yeah.  We can do what we like with the UTF8 format but I'm considerably
  more worried about the aspect of making random changes to the
  plain-ASCII output.
 
  I checked (using strace)
  gnumeric (via libgda and gnome-database-properties)
  openoffice (oobase)
 
 Even if that were the entire universe of programs we cared about,
 whether their internal ODBC logic goes through psql isn't really
 the point here.  What I'm worried about is somebody piping the
 text output of psql into another program.
 
  On a related note, there's something odd with the pager code.
 
 Hm, what platform are you testing that on?

Debian GNU/Linux (unstable)
linux2.6.30
eglibc   2.10.1
libreadline6 6.0.5
libncurses5  5.7
gcc  4.3.4

This is the trace of the broken write:

16206 write(1,   Name   \342\224\202  Owner   \342\224..., 102) = 102
16206 write(1, 
\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342
\224..., 256) = 256
16206 write(1, 
\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\n, 18) = 
18

I'll attach the whole thing for reference.  What's clear is that the
first write was *exactly* 256 bytes, which is what was requested,
presumably by libc stdio buffering (which shouldn't by itself be a
problem).  Since we use 3-byte UTF-8 and 256/3 is 85 + 1 remainder,
this is where the wierd 85 char forced newline comes from.  Since it
only happens when the terminal window is 85 chars, that's where I'm
assuming some odd termios influence comes from (though it might just
be the source of the window size and be completely innocent).  The
fact that libc did the two separate writes kind of rules out termios
mangling the output post-write().


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
16203 execve(./psql, [./psql], [/* 43 vars */]) = 0
16203 brk(0)= 0x1dbb000
16203 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x7ff0b8697000
16203 access(/etc/ld.so.nohwcap, F_OK) = -1 ENOENT (No such file or directory)
16203 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x7ff0b8695000
16203 access(/etc/ld.so.preload, R_OK) = -1 ENOENT (No such file or directory)
16203 open(/usr/local/pgsql/lib/tls/x86_64/libpq.so.5, O_RDONLY) = -1 ENOENT 
(No such file or directory)
16203 stat(/usr/local/pgsql/lib/tls/x86_64, 0x7fffeefa1f10) = -1 ENOENT (No 
such file or directory)
16203 open(/usr/local/pgsql/lib/tls/libpq.so.5, O_RDONLY) = -1 ENOENT (No 
such file or directory)
16203 stat(/usr/local/pgsql/lib/tls, 0x7fffeefa1f10) = -1 ENOENT (No such 
file or directory)
16203 open(/usr/local/pgsql/lib/x86_64/libpq.so.5, O_RDONLY) = -1 ENOENT (No 
such file or directory)
16203 stat(/usr/local/pgsql/lib/x86_64, 0x7fffeefa1f10) = -1 ENOENT (No such 
file or directory)
16203 open(/usr/local/pgsql/lib/libpq.so.5, O_RDONLY) = -1 ENOENT (No such 
file or directory)
16203 stat(/usr/local/pgsql/lib, 0x7fffeefa1f10) = -1 ENOENT (No such file or 
directory)
16203 open(/etc/ld.so.cache, O_RDONLY) = 3
16203 fstat(3, {st_mode=S_IFREG|0644, st_size=115844, ...}) = 0
16203 mmap(NULL, 115844, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ff0b8678000
16203 close(3)  = 0
16203 access(/etc/ld.so.nohwcap, F_OK) = -1 ENOENT (No such file or directory)
16203 open(/usr/lib/libpq.so.5, O_RDONLY) = 3
16203 read(3, 
\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\0\1\0\0\0\0\214\0\0\0\0\0\0..., 832) = 
832
16203 fstat(3, {st_mode=S_IFREG|0644, st_size=161848, ...}) = 0
16203 mmap(NULL, 2257336, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) 
= 0x7ff0b845
16203 mprotect(0x7ff0b8475000, 2097152, PROT_NONE) = 0
16203 mmap(0x7ff0b8675000, 12288, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7ff0b8675000
16203 close(3)  = 0
16203 access(/etc/ld.so.nohwcap, F_OK) = -1 ENOENT (No such file or directory)
16203 open(/lib/libreadline.so.5, O_RDONLY) = 3
16203 read(3, 
\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\0\1\0\0\0pMa\3157\0\0\0..., 832) = 832
16203 fstat(3, {st_mode=S_IFREG|0644, st_size=260632, ...}) = 0
16203 mmap(0x37cd60, 2358528, PROT_READ|PROT_EXEC, 
MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x37cd60
16203 mprotect(0x37cd638000, 2093056, PROT_NONE) = 0
16203 mmap(0x37cd837000, 32768, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x37000) = 0x37cd837000
16203 mmap(0x37cd83f000, 3328, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x37cd83f000
16203 close(3)  = 0
16203 access(/etc/ld.so.nohwcap, 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-25 Thread Roger Leigh
On Sat, Oct 24, 2009 at 06:23:24PM +0100, Roger Leigh wrote:
 On Fri, Oct 16, 2009 at 01:38:15PM +0300, Peter Eisentraut wrote:
  I like the new Unicode tables, but the marking of continuation lines
  looks pretty horrible:
  
   List of databases
   Name  │ Owner │ Encoding │ Collation │ Ctype │ Access
  privileges
  ───┼───┼──┼───┼───┼───
   pl_regression │ peter │ LATIN2   │ cs_CZ │ cs_CZ │
   postgres  │ peter │ LATIN2   │ cs_CZ │ cs_CZ │
   template0 │ peter │ LATIN2   │ cs_CZ │ cs_CZ │ =c/peter
 ╷   ╷  ╷   ╷   ╎ peter=CTc/peter
   template1 │ peter │ LATIN2   │ cs_CZ │ cs_CZ │ =c/peter
 ╷   ╷  ╷   ╷   ╎ peter=CTc/peter
  (4 rows)

Just for reference, this is what the output looks like (abridged)
using the attached patch.  Should display fine if your mail client handles
UTF-8 messages correctly:

rleigh=# \l
 List of databases
  Name   │  Owner   │ Encoding │  Collation  │Ctype│   Access 
privileges
─┼──┼──┼─┼─┼───
 merkelpb│ rleigh   │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │
 postgres│ postgres │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │
 projectb│ rleigh   │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │
 rleigh  │ rleigh   │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │
[…]
 template0   │ postgres │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │ 
=c/postgres  ↵
 │  │  │ │ │ 
postgres=CTc/postgres
 template1   │ postgres │ UTF8 │ en_GB.UTF-8 │ en_GB.UTF-8 │ 
=c/postgres  ↵
 │  │  │ │ │ 
postgres=CTc/postgres
[…]
(17 rows)

  This looks more like a rendering mistake than something sensible, and
  also it doesn't actually help the viewer to tell what lines are
  continued, which was the original purpose.
 
 I've worked on a solution to this, and the preliminary patch for this
 is attached.  Note there are additional comments in the patch text.

I've attached an updated version of the patch.  This now
- adds one column extra padding in wrapped mode if border=0 to allow
  correct display of wrap marks on the rightmost column.  Removed
  special cases needed to suppress marks on the rightmost column.
- updated unicode symbols to use more commonly-available symbol
  (ellipsis) rather than hooked arrows.  Also makes wrapping more
  visually distinct cf. newlines.  The CR and ellipsis symbols are
  commonly available in several character sets other than unicode,
  and are available in many common fonts, including all the Windows
  monospaced terminal fonts.  Symbol availability isn't an issue on
  GNU/Linux.
- updated ASCII symbols to use similar-looking symbols to the Unicode
  ones.
- added some more comments to code
- removed redundant enum values

I've also attached a trivial test script to run through psql to test
the wrapping, as well as the output I get for psql from 8.4.1 and
the current mainline (both Unicode and ASCII) for comparison.
Hopefully I'm not the only one that finds the proposed change clearer
and more logical than the existing display!


There's just one tiny display glitch I can see, and that's if you have
mixed wrapping and newlines, you miss the lefthand wrap mark if the line
is the last wrapped line and it ends in a newline.  It might not be
possible to pick that up if we can't discover that from the line state
when printing out the line.  Might possibly require storing the wrap
state so we know what happened on the previous line, though if there's
a slightly cleverer approach to detecting if we're already wrapped
that would be better.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
From 80a956c0d24e53a1b44619774b099364a585cdd0 Mon Sep 17 00:00:00 2001
From: Roger Leigh rle...@debian.org
Date: Sat, 24 Oct 2009 18:11:01 +0100
Subject: [PATCH] psql: Clean up line wrapping for Unicode display
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously, there were two different methods for displaying
wrapping:
1) Columns headers used a + symbol in the left-hand border
   when the column heading contained newlines.
2) Data lines used either a : or ; symbol to represent
   wrapped and new lines, and   to represent empty lines.
   These symbols replaced the | immediately to the left of the
   column.

This patch unifies both of these, by making (1)-like formatting
the behaviour for all new lines in both column headers and data.
It indicates wrapping by using . symbols in the border for

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-25 Thread Roger Leigh
On Sun, Oct 25, 2009 at 11:48:27PM +, Roger Leigh wrote:
 There's just one tiny display glitch I can see, and that's if you have
 mixed wrapping and newlines, you miss the lefthand wrap mark if the line
 is the last wrapped line and it ends in a newline.  It might not be
 possible to pick that up if we can't discover that from the line state
 when printing out the line.  Might possibly require storing the wrap
 state so we know what happened on the previous line, though if there's
 a slightly cleverer approach to detecting if we're already wrapped
 that would be better.

This appears simpler and more robust, so I've gone with this approach,
and attached a new patch which fixes it.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 026e043..36183f2 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -54,9 +54,10 @@ const printTextFormat pg_asciiformat =
 		{ -, +, +, + },
 		{ ,  |, |, | }
 	},
-	:,
-	;,
-	 
+	 ,
+	+,
+	.,
+	.
 };
 
 const printTextFormat pg_utf8format =
@@ -72,12 +73,13 @@ const printTextFormat pg_utf8format =
 		/* N/A, │, │, │ */
 		{ , \342\224\202, \342\224\202, \342\224\202 }
 	},
-	/* ╎ */
-	\342\225\216,
-	/* ┊ */
-	\342\224\212,
-	/* ╷ */
-	\342\225\267
+	 ,
+	/* ↵ */
+	\342\206\265,
+	/* … */
+	\342\200\246,
+	/* … */
+	\342\200\246
 };
 
 
@@ -479,6 +481,8 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 	int			output_columns = 0;		/* Width of interactive console */
 	bool		is_pager = false;
 
+	printTextLineWrap *wrap;
+
 	if (cancel_pressed)
 		return;
 
@@ -499,6 +503,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 		format_buf = pg_local_calloc(col_count, sizeof(*format_buf));
 		header_done = pg_local_calloc(col_count, sizeof(*header_done));
 		bytes_output = pg_local_calloc(col_count, sizeof(*bytes_output));
+		wrap = pg_local_calloc(col_count, sizeof(*wrap));
 	}
 	else
 	{
@@ -513,6 +518,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 		format_buf = NULL;
 		header_done = NULL;
 		bytes_output = NULL;
+		wrap = NULL;
 	}
 
 	/* scan all column headers, find maximum width and max max_nl_lines */
@@ -575,7 +581,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 
 	/* adjust the total display width based on border style */
 	if (opt_border == 0)
-		width_total = col_count - 1;
+		width_total = col_count;
 	else if (opt_border == 1)
 		width_total = col_count * 3 - 1;
 	else
@@ -770,16 +776,16 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 			while (more_col_wrapping)
 			{
 if (opt_border == 2)
-	fprintf(fout, %s%c, dformat-leftvrule,
-			curr_nl_line ? '+' : ' ');
-else if (opt_border == 1)
-	fputc(curr_nl_line ? '+' : ' ', fout);
+	fputs(dformat-leftvrule, fout);
 
 for (i = 0; i  cont-ncolumns; i++)
 {
 	struct lineptr *this_line = col_lineptrs[i] + curr_nl_line;
 	unsigned int nbspace;
 
+	if (opt_border != 0)
+		fputs(curr_nl_line ? format-cont_left :  , fout);
+
 	if (!header_done[i])
 	{
 		nbspace = width_wrap[i] - this_line-width;
@@ -796,21 +802,16 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 	}
 	else
 		fprintf(fout, %*s, width_wrap[i], );
-	if (i  col_count - 1)
-	{
-		if (opt_border == 0)
-			fputc(curr_nl_line ? '+' : ' ', fout);
-		else
-			fprintf(fout,  %s%c, dformat-midvrule,
-	curr_nl_line ? '+' : ' ');
-	}
+
+	fputs(!header_done[i] ? format-cont_right :  , fout);
+
+	if (opt_border != 0  i  col_count - 1)
+		fputs(dformat-midvrule, fout);
 }
 curr_nl_line++;
 
 if (opt_border == 2)
-	fprintf(fout,  %s, dformat-rightvrule);
-else if (opt_border == 1)
-	fputc(' ', fout);
+	fputs(dformat-rightvrule, fout);
 fputc('\n', fout);
 			}
 
@@ -861,19 +862,28 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 
 			/* left border */
 			if (opt_border == 2)
-fprintf(fout, %s , dformat-leftvrule);
-			else if (opt_border == 1)
-fputc(' ', fout);
+fputs(dformat-leftvrule, fout);
 
 			/* for each column */
 			for (j = 0; j  col_count; j++)
 			{
 /* We have a valid array element, so index it */
 struct lineptr *this_line = col_lineptrs[j][curr_nl_line[j]];
-int			bytes_to_output;
-int			chars_to_output = width_wrap[j];
+int		bytes_to_output;
+int		chars_to_output = width_wrap[j];
 bool		finalspaces = (opt_border == 2 || j  col_count - 1);
 
+/* Print left-hand wrap or newline mark */
+if (opt_border != 0)
+{
+	if (wrap[j] == PRINT_LINE_WRAP_WRAP)
+		fputs(format-wrap_left, fout);
+	else if (wrap[j] == PRINT_LINE_WRAP_CONT)
+		

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-24 Thread Roger Leigh
On Fri, Oct 16, 2009 at 01:38:15PM +0300, Peter Eisentraut wrote:
 I like the new Unicode tables, but the marking of continuation lines
 looks pretty horrible:
 
  List of databases
  Name  │ Owner │ Encoding │ Collation │ Ctype │ Access
 privileges
 ───┼───┼──┼───┼───┼───
  pl_regression │ peter │ LATIN2   │ cs_CZ │ cs_CZ │
  postgres  │ peter │ LATIN2   │ cs_CZ │ cs_CZ │
  template0 │ peter │ LATIN2   │ cs_CZ │ cs_CZ │ =c/peter
╷   ╷  ╷   ╷   ╎ peter=CTc/peter
  template1 │ peter │ LATIN2   │ cs_CZ │ cs_CZ │ =c/peter
╷   ╷  ╷   ╷   ╎ peter=CTc/peter
 (4 rows)
 
 This looks more like a rendering mistake than something sensible, and
 also it doesn't actually help the viewer to tell what lines are
 continued, which was the original purpose.

I've worked on a solution to this, and the preliminary patch for this
is attached.  Note there are additional comments in the patch text.

This patch does break backward compatibility with older psql versions,
by using symbols in the column border to mark newlines and wrapped
lines rather than putting ':' and ';' and ' ' symbols in the vertical
table lines.  This makes things somewhat more consistent and readable
but at the expense of not perfectly preserving output.  The ASCII
rules are a little more compatible than the Unicode rules, but both
do break things a little.  If the data lines do not contain either
newlines or wrapped text, then the output will remain unchanged.

Any feedback would be appreciated.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
From 245c6cabe7d74afef976149fd675bd99d711cade Mon Sep 17 00:00:00 2001
From: Roger Leigh rle...@debian.org
Date: Sat, 24 Oct 2009 18:11:01 +0100
Subject: [PATCH] psql: Clean up line wrapping for Unicode display
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously, there were two different methods for displaying
wrapping:
1) Columns headers used a + symbol in the left-hand border
   when the column heading contained newlines.
2) Data lies used either a : or ; symbol to represent
   wrapped and new lines, and   to represent empty lines.
   These symbols replaced the | immediately to the left of the
   column.

This patch unifies both of these, by making (1) the behaviour
for all new lines in both column headers and data.  It indicates
wrapping by using - symbols in the border for continuations.
When using Unicode, these are replaced by ↵ (carriage return)
for newlines and ↩ and ↪ (curly arrows) for continuations.

In the special case of a border size of zero, only the right-hand
border symbols are drawn, since there is no space for any other
output.  The right-most column will not display any symbols for
backward compatibility, but this can be adjusted (I think it would
make the display somewhat more consistent).
---
 src/bin/psql/print.c |  114 -
 src/bin/psql/print.h |   17 ++-
 2 files changed, 79 insertions(+), 52 deletions(-)

diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 026e043..f20eab1 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -54,9 +54,10 @@ const printTextFormat pg_asciiformat =
 		{ -, +, +, + },
 		{ ,  |, |, | }
 	},
-	:,
-	;,
-	 
+	+,
+	 ,
+	-,
+	-
 };
 
 const printTextFormat pg_utf8format =
@@ -72,12 +73,13 @@ const printTextFormat pg_utf8format =
 		/* N/A, │, │, │ */
 		{ , \342\224\202, \342\224\202, \342\224\202 }
 	},
-	/* ╎ */
-	\342\225\216,
-	/* ┊ */
-	\342\224\212,
-	/* ╷ */
-	\342\225\267
+	 ,
+	/* ↵ */
+	\342\206\265,
+	/* ↪ */
+	\342\206\252,
+	/* ↩ */
+	\342\206\251
 };
 
 
@@ -770,16 +772,16 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 			while (more_col_wrapping)
 			{
 if (opt_border == 2)
-	fprintf(fout, %s%c, dformat-leftvrule,
-			curr_nl_line ? '+' : ' ');
-else if (opt_border == 1)
-	fputc(curr_nl_line ? '+' : ' ', fout);
+	fputs(dformat-leftvrule, fout);
 
 for (i = 0; i  cont-ncolumns; i++)
 {
 	struct lineptr *this_line = col_lineptrs[i] + curr_nl_line;
 	unsigned int nbspace;
 
+	if (opt_border != 0)
+		fputs(curr_nl_line ? format-cont_left :  , fout);
+
 	if (!header_done[i])
 	{
 		nbspace = width_wrap[i] - this_line-width;
@@ -796,21 +798,16 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 	}
 	else
 		fprintf(fout, %*s, width_wrap[i], );
-	if (i  col_count - 1)
-	{
-		if (opt_border == 0)
-			fputc(curr_nl_line ? '+' : ' ', fout);
-		else
-			fprintf(fout,  %s%c, dformat-midvrule,
-	curr_nl_line ? '+' : ' 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-16 Thread Peter Eisentraut
(HTML mail to preserve formatting; let's see if it works.)

I like the new Unicode tables, but the marking of continuation lines
looks pretty horrible:

 List of databases
 Name  │ Owner │ Encoding │ Collation │ Ctype │ Access
privileges
───┼───┼──┼───┼───┼───
 pl_regression │ peter │ LATIN2   │ cs_CZ │ cs_CZ │
 postgres  │ peter │ LATIN2   │ cs_CZ │ cs_CZ │
 template0 │ peter │ LATIN2   │ cs_CZ │ cs_CZ │ =c/peter
   ╷   ╷  ╷   ╷   ╎ peter=CTc/peter
 template1 │ peter │ LATIN2   │ cs_CZ │ cs_CZ │ =c/peter
   ╷   ╷  ╷   ╷   ╎ peter=CTc/peter
(4 rows)

This looks more like a rendering mistake than something sensible, and
also it doesn't actually help the viewer to tell what lines are
continued, which was the original purpose.

On a side note, I don't really understand why the access privileges need
to be broken up over multiple lines.  There is plenty of space left on
the line.

Note that the above is close to a default setup, so that is what many
people will see by default from now on.


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-16 Thread Roger Leigh
On Fri, Oct 16, 2009 at 01:38:15PM +0300, Peter Eisentraut wrote:
 (HTML mail to preserve formatting; let's see if it works.)
 
 I like the new Unicode tables, but the marking of continuation lines
 looks pretty horrible:

Yes, I'm not so keen myself.  The ASCII characters used are '|', ':' and
' ' for normal, wrapped and newlines.  Here, we are using vertical lines
with two or three dashes to replace the latter two.

As mentioned earlier in the thread, this patch will be followed up
with a futher patch to unify the folding/newline behaviour between
column header and data lines, which will replace these with normal
vertical lines in all cases, with the addition of characters in the
margin such as as a CR symbol or ellipsis to indicate newlines or
wrapping.  To preserve backward compatibility, the ASCII output will
remain unchanged.

I would have done this in the patch as applied, but was asked to hold
off doing that until the initial work was committed.

 Note that the above is close to a default setup, so that is what many
 people will see by default from now on.

I'll try to get the above proposed change done in the next week or so,
in time for the next CommitFest, if you agree with the general idea.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.


signature.asc
Description: Digital signature


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-14 Thread Roger Leigh
On Tue, Oct 13, 2009 at 05:08:20PM -0400, Tom Lane wrote:
 Roger Leigh rle...@codelibre.net writes:
  The attached updated patch renames all user-visible uses of
  utf8 to unicode.  It also updates the documentation
  regarding locale to psql client character set encoding
  so the docs now match the code exactly.
 
 Applied with light editorialization.  The main non-cosmetic change
 I made was to postpone selection of default line_style until runtime
 (see get_line_style() in the committed patch).  The original coding
 required knowledge of the line_style default rule not only in three
 different places in psql, but in every other place using print.c,
 such as createlang/droplang -l (which dumped core with the patch as
 submitted).  I changed it so that leaving line_style NULL implies
 the default encoding-driven behavior, so that we don't need to touch
 any of the callers.

Thanks.  I agree with your improvement in get_line_style(), it's
cleaner this way; I didn't realise print.c was used by the other
programs, sorry about that.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.


signature.asc
Description: Digital signature


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-13 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes:
 The attached updated patch renames all user-visible uses of
 utf8 to unicode.  It also updates the documentation
 regarding locale to psql client character set encoding
 so the docs now match the code exactly.

Applied with light editorialization.  The main non-cosmetic change
I made was to postpone selection of default line_style until runtime
(see get_line_style() in the committed patch).  The original coding
required knowledge of the line_style default rule not only in three
different places in psql, but in every other place using print.c,
such as createlang/droplang -l (which dumped core with the patch as
submitted).  I changed it so that leaving line_style NULL implies
the default encoding-driven behavior, so that we don't need to touch
any of the callers.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-11 Thread Roger Leigh
On Fri, Oct 09, 2009 at 04:35:46PM -0500, Kevin Grittner wrote:
 Peter Eisentraut pete...@gmx.net wrote:
  
  I think the setting ought be called linestyle unicode (instead of
  utf8), since the same setting would presumably work in case we ever
  implement UTF-16 support on the client side.
  
 Yeah, anytime one gets sloppy with the distinction between a character
 set and a character encoding scheme, one tends to regret it, sooner or
 later.  Here's we're talking about which glyphs to show -- that's
 based on a character set.

The attached updated patch renames all user-visible uses of
utf8 to unicode.  It also updates the documentation
regarding locale to psql client character set encoding
so the docs now match the code exactly.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 85e9375..c3b2c28 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1690,6 +1690,53 @@ lo_import 152801
   /varlistentry
 
   varlistentry
+  termliterallinestyle/literal/term
+  listitem
+  para
+	  Sets the line drawing style of text table output to one
+  of literalascii/literal or literalunicode/literal.
+  Unique abbreviations are allowed.  (That would mean one
+  letter is enough.)  literalunicode/literal will be
+  selected if used by the psql client character set encoding,
+  otherwise literalascii/literal will be used by default.
+  /para
+
+  para
+	  Query result tables are displayed as text for some output
+	  formats (literalunaligned/literal,
+  literalaligned/literal, and literalwrapped/literal
+  formats).  The tables are drawn using characters of the psql
+  client character set encoding.  By default,
+  the acronymASCII/acronym character set will be used,
+  which will display correctly for all encodings.  However, if
+  the user is using an encoding based upon the Universal
+  Character Set (quoteUnicode/quote), for
+  example acronymUTF8/acronym, the Unicode box drawing
+  characters will be used in place of ASCII punctuation in
+  order to display more readable tables.
+  /para
+
+  para
+	  This option is useful for overriding the default line style,
+	  for example to force the use of
+	  only acronymASCII/acronym characters when extended
+	  character sets such as Unicode are inappropriate.  This
+	  might be the case if preserving output compatibility with
+	  older psql versions is important (prior to 8.5.0).
+  /para
+
+  para
+  quoteUnicode/quote use Unicode box drawing characters.
+  /para
+
+  para
+  quoteASCII/quote use plain acronymASCII/acronym characters.
+  /para
+
+  /listitem
+  /varlistentry
+
+  varlistentry
   termliteralcolumns/literal/term
   listitem
   para
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index da57eb4..6f34e68 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -46,6 +46,7 @@
 #include input.h
 #include large_obj.h
 #include mainloop.h
+#include mbprint.h
 #include print.h
 #include psqlscan.h
 #include settings.h
@@ -596,6 +597,14 @@ exec_command(const char *cmd,
 /* save encoding info into psql internal data */
 pset.encoding = PQclientEncoding(pset.db);
 pset.popt.topt.encoding = pset.encoding;
+if (!pset.popt.topt.line_style_set)
+{
+	if (pset.encoding == get_utf8_id())
+		pset.popt.topt.line_style = utf8format;
+	else
+		pset.popt.topt.line_style = asciiformat;
+}
+
 SetVariable(pset.vars, ENCODING,
 			pg_encoding_to_char(pset.encoding));
 			}
@@ -1432,6 +1441,13 @@ SyncVariables(void)
 	/* get stuff from connection */
 	pset.encoding = PQclientEncoding(pset.db);
 	pset.popt.topt.encoding = pset.encoding;
+	if (!pset.popt.topt.line_style_set)
+	{
+		if (pset.encoding == get_utf8_id())
+			pset.popt.topt.line_style = utf8format;
+		else
+			pset.popt.topt.line_style = asciiformat;
+	}
 	pset.sversion = PQserverVersion(pset.db);
 
 	SetVariable(pset.vars, DBNAME, PQdb(pset.db));
@@ -1772,6 +1788,27 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			printf(_(Output format is %s.\n), _align2string(popt-topt.format));
 	}
 
+	/* set table line style */
+	else if (strcmp(param, linestyle) == 0)
+	{
+		if (!value)
+			;
+		else if (pg_strncasecmp(ascii, value, vallen) == 0)
+			popt-topt.line_style = asciiformat;
+		else if (pg_strncasecmp(unicode, value, vallen) == 0)
+			popt-topt.line_style = utf8format;
+		else
+		{
+			psql_error(\\pset: allowed 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-09 Thread Peter Eisentraut
On Tue, 2009-10-06 at 19:35 +0100, Roger Leigh wrote:
 This patch included a bit of code not intended for inclusion
 (setting of client encoding based on locale), which the attached
 (and hopefully final!) revision of the patch excludes.

Well, the documentation still claims that this is dependent on the
locale.  This should be updated to match the code.

I think the setting ought be called linestyle unicode (instead of utf8),
since the same setting would presumably work in case we ever implement
UTF-16 support on the client side.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-09 Thread Kevin Grittner
Peter Eisentraut pete...@gmx.net wrote:
 
 I think the setting ought be called linestyle unicode (instead of
 utf8), since the same setting would presumably work in case we ever
 implement UTF-16 support on the client side.
 
Yeah, anytime one gets sloppy with the distinction between a character
set and a character encoding scheme, one tends to regret it, sooner or
later.  Here's we're talking about which glyphs to show -- that's
based on a character set.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-07 Thread Brad T. Sliger
On Tuesday 06 October 2009 11:35:03 Roger Leigh wrote:
 On Tue, Oct 06, 2009 at 10:44:27AM +0100, Roger Leigh wrote:
  On Mon, Oct 05, 2009 at 04:32:08PM -0400, Tom Lane wrote:
   Roger Leigh rle...@codelibre.net writes:
On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote:
Elsewhere in the psql code, notably in mbprint.c, we make the
decision on whether to apply certain Unicode-aware processing based
on whether the client encoding is UTF8.  The same should be done
here.
   
There is a patch somewhere in the pipeline that would automatically
set the psql client encoding to whatever the locale says, but until
that is done, the client encoding should be the sole setting that
rules what kind of character set processing is done on the client
side.
   
OK, that makes sense to a certain extent.  However, the characters
used to draw the table lines are not really that related to the
client encoding for data sent from the database (IMHO).
  
   Huh?  The data *in* the table is going to be in the client_encoding,
   and psql contains no mechanisms that would translate it to something
   else. Surrounding it with decoration in a different encoding is just a
   recipe for breakage.
 
  Ah, I was under the mistaken assumption that this was iconv()ed or
  otherwise translated for correct display.  In that case, I'll leave
  the patch as is (using the client encoding for table lines).
 
  I've attached an updated copy of the patch (it just removes the
  now unneeded langinfo.h header).

 This patch included a bit of code not intended for inclusion
 (setting of client encoding based on locale), which the attached
 (and hopefully final!) revision of the patch excludes.


 Regards,
 Roger

I looked at psql-utf8-table-9.patch.

It applies, builds and installs.  `gmake check` passes and is not 
affected by values of LANG or LC_ALL in the 
environment.  HTML and man documentation builds and looks good.  The patch 
doesn't introduce new lint.

The psql utility displays UTF8 line art when the client encoding is 
UTF8, and ASCII line art is displayed otherwise.  
This can be overridden with `\pset linestyle ASCII` or `\pset linestyle UTF8`.  
The psql line art is no longer affected by 
the values of LANG or LC_ALL in the environment.

As far as I can tell, everything looks reasonable.

Thanks,

--bts

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-06 Thread Roger Leigh
On Mon, Oct 05, 2009 at 04:32:08PM -0400, Tom Lane wrote:
 Roger Leigh rle...@codelibre.net writes:
  On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote:
  Elsewhere in the psql code, notably in mbprint.c, we make the decision
  on whether to apply certain Unicode-aware processing based on whether
  the client encoding is UTF8.  The same should be done here.
  
  There is a patch somewhere in the pipeline that would automatically set
  the psql client encoding to whatever the locale says, but until that is
  done, the client encoding should be the sole setting that rules what
  kind of character set processing is done on the client side.
 
  OK, that makes sense to a certain extent.  However, the characters
  used to draw the table lines are not really that related to the
  client encoding for data sent from the database (IMHO).
 
 Huh?  The data *in* the table is going to be in the client_encoding, and
 psql contains no mechanisms that would translate it to something else.
 Surrounding it with decoration in a different encoding is just a recipe
 for breakage.

Ah, I was under the mistaken assumption that this was iconv()ed or
otherwise translated for correct display.  In that case, I'll leave
the patch as is (using the client encoding for table lines).

I've attached an updated copy of the patch (it just removes the
now unneeded langinfo.h header).


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 85e9375..cd1c137 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1690,6 +1690,54 @@ lo_import 152801
   /varlistentry
 
   varlistentry
+  termliterallinestyle/literal/term
+  listitem
+  para
+	  Sets the line drawing style of text table output to one
+  of literalascii/literal, or literalutf8/literal.
+  Unique abbreviations are allowed.  (That would mean one
+  letter is enough.)  literalutf8/literal will be selected
+  by default if supported by your locale,
+  otherwise literalascii/literal will be used.
+  /para
+
+  para
+	  Query result tables are displayed as text for some output
+	  formats (literalunaligned/literal,
+  literalaligned/literal, and literalwrapped/literal
+  formats).  The tables are drawn using characters of the
+  user's locale character set.  By
+  default, acronymASCII/acronym characters will be used,
+  which will display correctly in all locales.  However, if
+  the user is using a locale with a acronymUTF-8/acronym
+  character set, the default will be to
+  use acronymUTF-8/acronym box drawing characters in place
+  of ASCII punctuation to display more readable tables.
+  /para
+
+  para
+	  This option is useful for overriding the default line
+	  style, for example to force the use of
+	  only acronymASCII/acronym characters when extended
+	  character sets such as acronymUTF-8/acronym are
+	  inappropriate.  This might be the case if preserving output
+	  compatibility with older psql versions is important (prior
+	  to 8.5.0).
+  /para
+
+  para
+  quoteUTF8/quote use Unicode acronymUTF-8/acronym box
+  drawing characters.
+  /para
+
+  para
+  quoteASCII/quote use plain acronymASCII/acronym characters.
+  /para
+
+  /listitem
+  /varlistentry
+
+  varlistentry
   termliteralcolumns/literal/term
   listitem
   para
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index da57eb4..be2b60b 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -33,6 +33,11 @@
 #include openssl/ssl.h
 #endif
 
+#include locale.h
+#ifdef HAVE_LANGINFO_H
+#include langinfo.h
+#endif
+
 #include portability/instr_time.h
 
 #include libpq-fe.h
@@ -46,6 +51,7 @@
 #include input.h
 #include large_obj.h
 #include mainloop.h
+#include mbprint.h
 #include print.h
 #include psqlscan.h
 #include settings.h
@@ -596,6 +602,14 @@ exec_command(const char *cmd,
 /* save encoding info into psql internal data */
 pset.encoding = PQclientEncoding(pset.db);
 pset.popt.topt.encoding = pset.encoding;
+if (!pset.popt.topt.line_style_set)
+{
+	if (pset.encoding == get_utf8_id())
+		pset.popt.topt.line_style = utf8format;
+	else
+		pset.popt.topt.line_style = asciiformat;
+}
+
 SetVariable(pset.vars, ENCODING,
 			pg_encoding_to_char(pset.encoding));
 			}
@@ -1430,8 +1444,19 @@ void
 SyncVariables(void)
 {
 	/* get stuff from connection */
+#if (defined(HAVE_LANGINFO_H)  defined(CODESET))
+	if (PQsetClientEncoding(pset.db, 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-06 Thread Roger Leigh
On Tue, Oct 06, 2009 at 10:44:27AM +0100, Roger Leigh wrote:
 On Mon, Oct 05, 2009 at 04:32:08PM -0400, Tom Lane wrote:
  Roger Leigh rle...@codelibre.net writes:
   On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote:
   Elsewhere in the psql code, notably in mbprint.c, we make the decision
   on whether to apply certain Unicode-aware processing based on whether
   the client encoding is UTF8.  The same should be done here.
   
   There is a patch somewhere in the pipeline that would automatically set
   the psql client encoding to whatever the locale says, but until that is
   done, the client encoding should be the sole setting that rules what
   kind of character set processing is done on the client side.
  
   OK, that makes sense to a certain extent.  However, the characters
   used to draw the table lines are not really that related to the
   client encoding for data sent from the database (IMHO).
  
  Huh?  The data *in* the table is going to be in the client_encoding, and
  psql contains no mechanisms that would translate it to something else.
  Surrounding it with decoration in a different encoding is just a recipe
  for breakage.
 
 Ah, I was under the mistaken assumption that this was iconv()ed or
 otherwise translated for correct display.  In that case, I'll leave
 the patch as is (using the client encoding for table lines).
 
 I've attached an updated copy of the patch (it just removes the
 now unneeded langinfo.h header).

This patch included a bit of code not intended for inclusion
(setting of client encoding based on locale), which the attached
(and hopefully final!) revision of the patch excludes.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 85e9375..cd1c137 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1690,6 +1690,54 @@ lo_import 152801
   /varlistentry
 
   varlistentry
+  termliterallinestyle/literal/term
+  listitem
+  para
+	  Sets the line drawing style of text table output to one
+  of literalascii/literal, or literalutf8/literal.
+  Unique abbreviations are allowed.  (That would mean one
+  letter is enough.)  literalutf8/literal will be selected
+  by default if supported by your locale,
+  otherwise literalascii/literal will be used.
+  /para
+
+  para
+	  Query result tables are displayed as text for some output
+	  formats (literalunaligned/literal,
+  literalaligned/literal, and literalwrapped/literal
+  formats).  The tables are drawn using characters of the
+  user's locale character set.  By
+  default, acronymASCII/acronym characters will be used,
+  which will display correctly in all locales.  However, if
+  the user is using a locale with a acronymUTF-8/acronym
+  character set, the default will be to
+  use acronymUTF-8/acronym box drawing characters in place
+  of ASCII punctuation to display more readable tables.
+  /para
+
+  para
+	  This option is useful for overriding the default line
+	  style, for example to force the use of
+	  only acronymASCII/acronym characters when extended
+	  character sets such as acronymUTF-8/acronym are
+	  inappropriate.  This might be the case if preserving output
+	  compatibility with older psql versions is important (prior
+	  to 8.5.0).
+  /para
+
+  para
+  quoteUTF8/quote use Unicode acronymUTF-8/acronym box
+  drawing characters.
+  /para
+
+  para
+  quoteASCII/quote use plain acronymASCII/acronym characters.
+  /para
+
+  /listitem
+  /varlistentry
+
+  varlistentry
   termliteralcolumns/literal/term
   listitem
   para
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index da57eb4..223f11c 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -46,6 +46,7 @@
 #include input.h
 #include large_obj.h
 #include mainloop.h
+#include mbprint.h
 #include print.h
 #include psqlscan.h
 #include settings.h
@@ -596,6 +597,14 @@ exec_command(const char *cmd,
 /* save encoding info into psql internal data */
 pset.encoding = PQclientEncoding(pset.db);
 pset.popt.topt.encoding = pset.encoding;
+if (!pset.popt.topt.line_style_set)
+{
+	if (pset.encoding == get_utf8_id())
+		pset.popt.topt.line_style = utf8format;
+	else
+		pset.popt.topt.line_style = asciiformat;
+}
+
 SetVariable(pset.vars, ENCODING,
 			pg_encoding_to_char(pset.encoding));
 			}
@@ -1432,6 +1441,13 @@ SyncVariables(void)
 	/* get stuff from connection */
 	

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-05 Thread Roger Leigh
On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote:
 I have a comment on this bit:
 
  @@ -125,6 +128,17 @@ main(int argc, char *argv[])
   
  /* We rely on unmentioned fields of pset.popt to start out
  0/false/NULL */
  pset.popt.topt.format = PRINT_ALIGNED;
  +
  +   /* Default table style to plain ASCII */
  +   pset.popt.topt.table_style = asciiformat;
  +#if (defined(HAVE_LANGINFO_H)  defined(CODESET))
  +   /* If a UTF-8 locale is available, switch to UTF-8 box drawing
  characters */
  +   if (pg_strcasecmp(nl_langinfo(CODESET), UTF-8) == 0 ||
  +   pg_strcasecmp(nl_langinfo(CODESET), utf8) == 0 ||
  +   pg_strcasecmp(nl_langinfo(CODESET), CP65001) == 0)
  +   pset.popt.topt.table_style = utf8format;
  +#endif
  +
  pset.popt.topt.border = 1;
  pset.popt.topt.pager = 1;
  pset.popt.topt.start_table = true;
 
 Elsewhere in the psql code, notably in mbprint.c, we make the decision
 on whether to apply certain Unicode-aware processing based on whether
 the client encoding is UTF8.  The same should be done here.
 
 There is a patch somewhere in the pipeline that would automatically set
 the psql client encoding to whatever the locale says, but until that is
 done, the client encoding should be the sole setting that rules what
 kind of character set processing is done on the client side.

OK, that makes sense to a certain extent.  However, the characters
used to draw the table lines are not really that related to the
client encoding for data sent from the database (IMHO).

I think that (as you said) making the client encoding the same as the
locale character set the same in the future would clear up this
discrepancy though.  Using the client encoding, there's no guarantee
the client locale/terminal can handle UTF-8 when the client encoding is
UTF-8.

I have attached an updated patch which implements your suggested
behaviour.  It also renames the option to linestyle rather than
tablestyle which I think represents its purpose a bit more clearly.


Thanks,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 85e9375..cd1c137 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1690,6 +1690,54 @@ lo_import 152801
   /varlistentry
 
   varlistentry
+  termliterallinestyle/literal/term
+  listitem
+  para
+	  Sets the line drawing style of text table output to one
+  of literalascii/literal, or literalutf8/literal.
+  Unique abbreviations are allowed.  (That would mean one
+  letter is enough.)  literalutf8/literal will be selected
+  by default if supported by your locale,
+  otherwise literalascii/literal will be used.
+  /para
+
+  para
+	  Query result tables are displayed as text for some output
+	  formats (literalunaligned/literal,
+  literalaligned/literal, and literalwrapped/literal
+  formats).  The tables are drawn using characters of the
+  user's locale character set.  By
+  default, acronymASCII/acronym characters will be used,
+  which will display correctly in all locales.  However, if
+  the user is using a locale with a acronymUTF-8/acronym
+  character set, the default will be to
+  use acronymUTF-8/acronym box drawing characters in place
+  of ASCII punctuation to display more readable tables.
+  /para
+
+  para
+	  This option is useful for overriding the default line
+	  style, for example to force the use of
+	  only acronymASCII/acronym characters when extended
+	  character sets such as acronymUTF-8/acronym are
+	  inappropriate.  This might be the case if preserving output
+	  compatibility with older psql versions is important (prior
+	  to 8.5.0).
+  /para
+
+  para
+  quoteUTF8/quote use Unicode acronymUTF-8/acronym box
+  drawing characters.
+  /para
+
+  para
+  quoteASCII/quote use plain acronymASCII/acronym characters.
+  /para
+
+  /listitem
+  /varlistentry
+
+  varlistentry
   termliteralcolumns/literal/term
   listitem
   para
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index da57eb4..223f11c 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -46,6 +46,7 @@
 #include input.h
 #include large_obj.h
 #include mainloop.h
+#include mbprint.h
 #include print.h
 #include psqlscan.h
 #include settings.h
@@ -596,6 +597,14 @@ exec_command(const char *cmd,
 /* save encoding info into psql internal data */
 pset.encoding = PQclientEncoding(pset.db);
 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-05 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes:
 On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote:
 Elsewhere in the psql code, notably in mbprint.c, we make the decision
 on whether to apply certain Unicode-aware processing based on whether
 the client encoding is UTF8.  The same should be done here.
 
 There is a patch somewhere in the pipeline that would automatically set
 the psql client encoding to whatever the locale says, but until that is
 done, the client encoding should be the sole setting that rules what
 kind of character set processing is done on the client side.

 OK, that makes sense to a certain extent.  However, the characters
 used to draw the table lines are not really that related to the
 client encoding for data sent from the database (IMHO).

Huh?  The data *in* the table is going to be in the client_encoding, and
psql contains no mechanisms that would translate it to something else.
Surrounding it with decoration in a different encoding is just a recipe
for breakage.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-04 Thread Roger Leigh
On Fri, Oct 02, 2009 at 05:34:16PM -0700, Brad T. Sliger wrote:
 On Friday 02 October 2009 04:21:35 Roger Leigh wrote:
  I have attached a patch which implements the feature as a pset
  variable.  This also slightly simplifies some of the patch since
  the table style is passed to functions directly in printTableContent
  rather than separately.  The psql option '-P tablestyle=ascii' is
  passed to psql in pg_regress_main.c which means the testsuite doesn't
  fail any more.  The option is documented in the psql docs, and is
  also tab-completed.  Users can just put '\pset tablestyle ascii' in
  their .psqlrc if they want the old format in a UTF-8 locale.
 
   I looked at psql-utf8-table-5.patch.

Many thanks for taking the time to do this.  I've attached a followup
patch which addresses your point below:

   Lint(1) says there is an extra trailing ',' in src/bin/psql/print.h. in 
 'typedef enum printTextRule'.  The addition to 
 src/bin/psql/command.c could use a comment, like adjacent code.

Fixed.

   'ASCII' and 'UTF8' may need acronym/acronym tags in 
 doc/src/sgml/ref/psql-ref.sgml, like adjacent 
 code.  I'm not sure someone who hasn't seen this patch in action would 
 immediately know what it does from the 
 documentation.  `gmake html` works without the patch, but fails with the 
 patch:

Also fixed.  I also added some additional explanation of the option which
hopefully makes its purpose more obvious.  The acronym tag isn't used for
the itemised option list names, but is used in the descriptive text; I can
also add it there if appropriate.

It's likely that tablestyle could well be named better.  format is
already used, but if there's a more intuitive name that fits better,
I'm happy to change it.
   openjade:ref/psql-ref.sgml:1692:15:E: document type does not allow 
 element TERM here; assuming 
 missing VARLISTENTRY start-tag

Also fixed.

   After the patch, `\pset format wrapped` produces '\pset: unknown 
 option: format'.  I saw this in interactive psql 
 and from .psqlrc.  I think this can be fixed by changing the addition to 
 src/bin/psql/command.c from an 'if' clause to 
 an 'else if' clause.

Oops, yes.  Sorry about that hiccup.  I've also fixed this.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 85e9375..6f1bdc7 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1690,6 +1690,54 @@ lo_import 152801
   /varlistentry
 
   varlistentry
+  termliteraltablestyle/literal/term
+  listitem
+  para
+	  Sets the style of text table output to one
+  of literalascii/literal, or literalutf8/literal.
+  Unique abbreviations are allowed.  (That would mean one
+  letter is enough.)  literalutf8/literal will be selected
+  by default if supported by your locale,
+  otherwise literalascii/literal will be used.
+  /para
+
+  para
+	  Query result tables are displayed as text for some output
+	  formats (literalunaligned/literal,
+  literalaligned/literal, and literalwrapped/literal
+  formats).  The tables are drawn using characters of the
+  user's locale character set.  By
+  default, acronymASCII/acronym characters will be used,
+  which will display correctly in all locales.  However, if
+  the user is using a locale with a acronymUTF-8/acronym
+  character set, the default will be to
+  use acronymUTF-8/acronym box drawing characters in place
+  of ASCII punctuation to display more readable tables.
+  /para
+
+  para
+	  This option is useful for overriding the default table
+	  style, for example to force the use of
+	  only acronymASCII/acronym characters when extended
+	  character sets such as acronymUTF-8/acronym are
+	  inappropriate.  This might be the case if preserving output
+	  compatibility with older psql versions is important (prior
+	  to 8.5.0).
+  /para
+
+  para
+  quoteUTF8/quote use Unicode acronymUTF-8/acronym box
+  drawing characters.
+  /para
+
+  para
+  quoteASCII/quote use plain acronymASCII/acronym characters.
+  /para
+
+  /listitem
+  /varlistentry
+
+  varlistentry
   termliteralcolumns/literal/term
   listitem
   para
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index da57eb4..31631cf 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1772,6 +1772,25 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			printf(_(Output format is %s.\n), 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-04 Thread Peter Eisentraut
I have a comment on this bit:

 @@ -125,6 +128,17 @@ main(int argc, char *argv[])
  
 /* We rely on unmentioned fields of pset.popt to start out
 0/false/NULL */
 pset.popt.topt.format = PRINT_ALIGNED;
 +
 +   /* Default table style to plain ASCII */
 +   pset.popt.topt.table_style = asciiformat;
 +#if (defined(HAVE_LANGINFO_H)  defined(CODESET))
 +   /* If a UTF-8 locale is available, switch to UTF-8 box drawing
 characters */
 +   if (pg_strcasecmp(nl_langinfo(CODESET), UTF-8) == 0 ||
 +   pg_strcasecmp(nl_langinfo(CODESET), utf8) == 0 ||
 +   pg_strcasecmp(nl_langinfo(CODESET), CP65001) == 0)
 +   pset.popt.topt.table_style = utf8format;
 +#endif
 +
 pset.popt.topt.border = 1;
 pset.popt.topt.pager = 1;
 pset.popt.topt.start_table = true;

Elsewhere in the psql code, notably in mbprint.c, we make the decision
on whether to apply certain Unicode-aware processing based on whether
the client encoding is UTF8.  The same should be done here.

There is a patch somewhere in the pipeline that would automatically set
the psql client encoding to whatever the locale says, but until that is
done, the client encoding should be the sole setting that rules what
kind of character set processing is done on the client side.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-02 Thread Roger Leigh
On Wed, Sep 30, 2009 at 06:50:46PM -0400, Tom Lane wrote:
 Roger Leigh rle...@codelibre.net writes:
  On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote:
  Thinking about this some more, ISTM a much better way of approaching 
  it would be to provide a flag for psql to turn off the fancy 
  formatting, and have pg_regress use that flag.
 
  The attached patch implements this feature.  It adds a
  --no-pretty-formatting/-G option to psql (naming isn't my fort�,
  so feel free to change it!).  This is also documented in the
  SGML docs, and help text.  Lastly, this option is used when invoking
  psql by pg_regress, which results in a working testsuite in a UTF-8
  locale.
 
 It would be a good idea to tie this to a psql magic variable (like
 ON_ERROR_STOP) so that it could conveniently be set in ~/.psqlrc.
 I'm not actually sure that we need a dedicated command line switch
 for it, since you could use -v varname instead.

I have attached a patch which implements the feature as a pset
variable.  This also slightly simplifies some of the patch since
the table style is passed to functions directly in printTableContent
rather than separately.  The psql option '-P tablestyle=ascii' is
passed to psql in pg_regress_main.c which means the testsuite doesn't
fail any more.  The option is documented in the psql docs, and is
also tab-completed.  Users can just put '\pset tablestyle ascii' in
their .psqlrc if they want the old format in a UTF-8 locale.

To follow up on the comments about the problems of defaulting to
UTF-8.  There are just two potential problems with defaulting, both of
which are problems with the user's mis-configuration of their system
and (IMHO) not something that postgresql needs to care about.
1) The user lacks a font containing the line-drawing characters.
   It's very rare for a fixed-width terminal font to not contain
   these characters, and the patch as provided sticks to the most
   basic range from the VT100 set which are most commonly provided.
2) The user's terminal emulator is not configured to accept UTF-8
   input.  If you're using a UTF-8 locale, then this is necessary
   to display non-ASCII characters, and is done automatically by
   almost every terminal emulator out there (on Linux, they default
   to using nl_langinfo(CODESET) unless configured otherwise, which
   is a very simple change if required).  On any current GNU/Linux
   system, you have to go out of your way to break the defaults.

The patch currently switches to UTF-8 automatically /when available/.
IMO this is the correct behaviour since it will work for all but the
handful of users who misconfigured their system, and provides an
immediate benefit.  We spent years making UTF-8 work out of the box on
Linux and Unix systems, and it seems a trifle unfair to penalise all
users for the sake of the few who just didn't set up their terminal
emulator correctly (their setup is already broken, since non-ASCII
characters returned by queries are /already/ going to be displayed
incorrectly).


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 85e9375..ad297f8 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1689,6 +1689,28 @@ lo_import 152801
   /listitem
   /varlistentry
 
+  termliteraltablestyle/literal/term
+  listitem
+  para
+  Sets the style of text table output to one
+  of literalascii/literal, or literalutf8/literal.
+  Unique abbreviations are allowed.  (That would mean one
+  letter is enough.)  literalutf8/literal will be selected
+  by default if supported by your locale,
+  otherwise literalascii/literal will be used.
+  /para
+
+  para
+  quoteUTF8/quote use Unicode box drawing characters.
+  /para
+
+  para
+  quoteASCII/quote use plain ASCII characters.
+  /para
+
+  /listitem
+  /varlistentry
+
   varlistentry
   termliteralcolumns/literal/term
   listitem
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index da57eb4..e9b6fe0 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1772,6 +1772,24 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			printf(_(Output format is %s.\n), _align2string(popt-topt.format));
 	}
 
+	if (strcmp(param, tablestyle) == 0)
+	{
+		if (!value)
+			;
+		else if (pg_strncasecmp(ascii, value, vallen) == 0)
+			popt-topt.table_style = asciiformat;
+		else if (pg_strncasecmp(utf8, value, vallen) == 0)
+			popt-topt.table_style = utf8format;
+		else
+		{
+			psql_error(\\pset: allowed table styles are ascii, utf8\n);
+			return 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-10-02 Thread Brad T. Sliger
On Friday 02 October 2009 04:21:35 Roger Leigh wrote:
 On Wed, Sep 30, 2009 at 06:50:46PM -0400, Tom Lane wrote:
  Roger Leigh rle...@codelibre.net writes:
   On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote:
   Thinking about this some more, ISTM a much better way of approaching
   it would be to provide a flag for psql to turn off the fancy
   formatting, and have pg_regress use that flag.
  
   The attached patch implements this feature.  It adds a
   --no-pretty-formatting/-G option to psql (naming isn't my fort�,
   so feel free to change it!).  This is also documented in the
   SGML docs, and help text.  Lastly, this option is used when invoking
   psql by pg_regress, which results in a working testsuite in a UTF-8
   locale.
 
  It would be a good idea to tie this to a psql magic variable (like
  ON_ERROR_STOP) so that it could conveniently be set in ~/.psqlrc.
  I'm not actually sure that we need a dedicated command line switch
  for it, since you could use -v varname instead.

 I have attached a patch which implements the feature as a pset
 variable.  This also slightly simplifies some of the patch since
 the table style is passed to functions directly in printTableContent
 rather than separately.  The psql option '-P tablestyle=ascii' is
 passed to psql in pg_regress_main.c which means the testsuite doesn't
 fail any more.  The option is documented in the psql docs, and is
 also tab-completed.  Users can just put '\pset tablestyle ascii' in
 their .psqlrc if they want the old format in a UTF-8 locale.

 To follow up on the comments about the problems of defaulting to
 UTF-8.  There are just two potential problems with defaulting, both of
 which are problems with the user's mis-configuration of their system
 and (IMHO) not something that postgresql needs to care about.
 1) The user lacks a font containing the line-drawing characters.
It's very rare for a fixed-width terminal font to not contain
these characters, and the patch as provided sticks to the most
basic range from the VT100 set which are most commonly provided.
 2) The user's terminal emulator is not configured to accept UTF-8
input.  If you're using a UTF-8 locale, then this is necessary
to display non-ASCII characters, and is done automatically by
almost every terminal emulator out there (on Linux, they default
to using nl_langinfo(CODESET) unless configured otherwise, which
is a very simple change if required).  On any current GNU/Linux
system, you have to go out of your way to break the defaults.

 The patch currently switches to UTF-8 automatically /when available/.
 IMO this is the correct behaviour since it will work for all but the
 handful of users who misconfigured their system, and provides an
 immediate benefit.  We spent years making UTF-8 work out of the box on
 Linux and Unix systems, and it seems a trifle unfair to penalise all
 users for the sake of the few who just didn't set up their terminal
 emulator correctly (their setup is already broken, since non-ASCII
 characters returned by queries are /already/ going to be displayed
 incorrectly).


 Regards,
 Roger

I looked at psql-utf8-table-5.patch.

Lint(1) says there is an extra trailing ',' in src/bin/psql/print.h. in 
'typedef enum printTextRule'.  The addition to 
src/bin/psql/command.c could use a comment, like adjacent code.

'ASCII' and 'UTF8' may need acronym/acronym tags in 
doc/src/sgml/ref/psql-ref.sgml, like adjacent 
code.  I'm not sure someone who hasn't seen this patch in action would 
immediately know what it does from the 
documentation.  `gmake html` works without the patch, but fails with the patch:

openjade:ref/psql-ref.sgml:1692:15:E: document type does not allow 
element TERM here; assuming 
missing VARLISTENTRY start-tag

After the patch, `\pset format wrapped` produces '\pset: unknown 
option: format'.  I saw this in interactive psql 
and from .psqlrc.  I think this can be fixed by changing the addition to 
src/bin/psql/command.c from an 'if' clause to 
an 'else if' clause.

Otherwise, the patch applied, built and installed.  The `gmake check` 
tests all passed with LANG and/or LC_ALL 
set.  The various tablestyle options seem to work.  The default behavior with 
respect to various LANG and LC_ALL 
values seems reasonable and can be overridden.

Thanks,

--bts

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Roger Leigh
On Tue, Sep 29, 2009 at 04:28:57PM -0400, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote:
  The bigger question is exactly how we expect this stuff to interact with
  pg_regress' --no-locale switch.  We already do clear all these variables
  when --no-locale is specified.  I am wondering just what --locale is
  supposed to do, and whether selectively lobotomizing the LC stuff has
  any real use at all.
 
  We should do the LANG or LC_CTYPE thing only on the client,
  unconditionally.  The --no-locale/--locale options should primarily
  determine what the temporary server uses.
 
 Well, that seems fairly reasonable, but it's going to require some
 refactoring of pg_regress.  The initialize_environment function
 determines what happens in both the client and the temp server.

Two possible approaches to fix the tests are as follows:

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index f2f9603..74cdaa2 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -711,8 +711,7 @@ initialize_environment(void)
 * is actually called.)
 */
unsetenv(LANGUAGE);
-   unsetenv(LC_ALL);
-   putenv(LC_MESSAGES=C);
+   putenv(LC_ALL=C);
 
/*
 * Set multibyte as requested

Here we just force the locale to C.  This does have the disadvantage
that --no-locale is made redundant, and any tests which are dependent
upon locale (if any?) will be run in the C locale.

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index f2f9603..65fb49a 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -712,6 +712,7 @@ initialize_environment(void)
 */
unsetenv(LANGUAGE);
unsetenv(LC_ALL);
+   putenv(LC_CTYPE=C);
putenv(LC_MESSAGES=C);
 
/*

Here we set LC_CTYPE to C in addition to LC_MESSAGES (and for much the
same reasons).  However, if you test on non-C locales to check for
issues with other locale codesets, those tests are all going to be
forced to use ASCII.  Is this a problem in practice?

From the POV of my patch, it's working as designed: if the locale
codeset is UTF-8 it's outputting UTF-8.  But, due to the way the
test machinery is looking at the output, this is breaking the tests.
I'm not sure what I can do with my patch to make it behave differently
that is both compatible with its intended use and not break the tests--
this is really just breaking an assumption in the testing code that
the test output will always be ASCII.

Forcing the LC_CTYPE to C will force ASCII output and work around this
problem with the tests.  Another approach would be to let psql know
it's being run in a test environment with a PG_TEST or some other
environment variable which we can check for and use to turn off UTF-8
output if set.  Would that be better?


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Andrew Dunstan



Roger Leigh wrote:

Here we just force the locale to C.  This does have the disadvantage
that --no-locale is made redundant, and any tests which are dependent
upon locale (if any?) will be run in the C locale.


  


That is not a solution. We have not that long ago gone to some lengths 
to provide for buildfarm testing in various locales. We're not going to 
undo that.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Roger Leigh wrote:
 Here we just force the locale to C.  This does have the disadvantage
 that --no-locale is made redundant, and any tests which are dependent
 upon locale (if any?) will be run in the C locale.

 That is not a solution.

Right.  I think you may have missed the point of what Peter was saying:
it's okay to force the locale to C on the *client* side of the tests.
The trick is to not change the environment that the temp server is
started with.  This will take some redesign inside pg_regress, I think.
Probably initialize_environment needs to be split into two functions,
one that's run before firing off the temp server and one that runs
after.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Roger Leigh
On Tue, Sep 29, 2009 at 04:32:49PM -0400, Tom Lane wrote:
 Roger Leigh rle...@codelibre.net writes:
  C locale means POSIX behavior and nothing but.
 
  Indeed it does.  However, making LC_CTYPE be UTF-8 rather than
  ASCII is both possible and still strictly conforming to the
  letter of the standard.  There would be some collation and
  other restrictions (digit and other character classes would
  be contrained to the ASCII characters compared with other UTF-8
  locales). However, any existing programs using ASCII would continue
  to function without any changes to their behaviour.  The only
  observable change will be that nl_langinfo(CODESET) will return
  UTF-8, and it will be valid for programs to use UTF-8 encoded
  text in formatted print functions, etc..
 
 I really, really don't believe that that meets either the letter or
 the spirit of the C standard, at least not if you are intending to
 silently substitute LC_CTYPE=UTF8 when the program has specified
 C/POSIX locale.  (If this is just a matter of what the default
 LANG environment is, of course you can do anything.)

We have spent some time reading the relevant standards documents
(C, POSIX, SUSv2, SUSv3) and haven't found anything yet that would
preclude this.  While they all specify minimal requirements for
what the C locale character set must provide (and POSIX/SUS are the
most strict, specifying ASCII outright for each 0-127 codepoint),
these are the minimal requirements for the locale, and
implementation-specific extensions to ASCII are allowed, which would
therefore permit UTF-8.  Note that LC_CTYPE=C is not required to
specify ASCII in any standard (though POSIX/SUS require that it
must contain ASCII as a subset of the whole set).

The language in SUSv2 in fact explicitly states that this is
allowed.  In fact, I've seen documentation that some UNIX systems such
as HPUX already do have a UTF-8 C locale as an option.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Andrew Dunstan



Andrew Dunstan wrote:



Roger Leigh wrote:

Here we just force the locale to C.  This does have the disadvantage
that --no-locale is made redundant, and any tests which are dependent
upon locale (if any?) will be run in the C locale.


  


That is not a solution. We have not that long ago gone to some lengths 
to provide for buildfarm testing in various locales. We're not going 
to undo that.





Thinking about this some more, ISTM a much better way of approaching it 
would be to provide a flag for psql to turn off the fancy formatting, 
and have pg_regress use that flag.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes:
 The language in SUSv2 in fact explicitly states that this is
 allowed.  In fact, I've seen documentation that some UNIX systems such
 as HPUX already do have a UTF-8 C locale as an option.

I don't argue with the concept of a C.UTF8 locale --- in fact I think
it sounds pretty useful.  What I think is 100% broken is trying to make
C locale work that way.  C locale is supposed to be the traditional
locale-ignorant, characters-are-bytes behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Thinking about this some more, ISTM a much better way of approaching it 
 would be to provide a flag for psql to turn off the fancy formatting, 
 and have pg_regress use that flag.

Yeah, that's not a bad idea.  There are likely to be other client
programs that won't want this behavioral change either.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Robert Haas
On Wed, Sep 30, 2009 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 Thinking about this some more, ISTM a much better way of approaching it
 would be to provide a flag for psql to turn off the fancy formatting,
 and have pg_regress use that flag.

 Yeah, that's not a bad idea.  There are likely to be other client
 programs that won't want this behavioral change either.

I'm surprised there isn't one already.  I would think that any new
format would default to off.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Andrew Dunstan



Robert Haas wrote:

On Wed, Sep 30, 2009 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  

Andrew Dunstan and...@dunslane.net writes:


Thinking about this some more, ISTM a much better way of approaching it
would be to provide a flag for psql to turn off the fancy formatting,
and have pg_regress use that flag.
  

Yeah, that's not a bad idea.  There are likely to be other client
programs that won't want this behavioral change either.



I'm surprised there isn't one already.  I would think that any new
format would default to off.


  


I haven't looked, but if there is that's so much less work to do ;-)

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Peter Eisentraut
On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote:
 Thinking about this some more, ISTM a much better way of approaching it 
 would be to provide a flag for psql to turn off the fancy formatting, 
 and have pg_regress use that flag.

Well, it might not be a bad idea, but adding a feature just to satisfy
the test suite instead of fixing the test suite doesn't feel satisfying.
Is there another use case?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Peter Eisentraut
On Wed, 2009-09-30 at 12:06 -0400, Robert Haas wrote:
 On Wed, Sep 30, 2009 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Andrew Dunstan and...@dunslane.net writes:
  Thinking about this some more, ISTM a much better way of approaching it
  would be to provide a flag for psql to turn off the fancy formatting,
  and have pg_regress use that flag.
 
  Yeah, that's not a bad idea.  There are likely to be other client
  programs that won't want this behavioral change either.
 
 I'm surprised there isn't one already.  I would think that any new
 format would default to off.

Well, this isn't a new format, just an enhancement of an existing
format.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Andrew Dunstan



Peter Eisentraut wrote:

On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote:
  
Thinking about this some more, ISTM a much better way of approaching it 
would be to provide a flag for psql to turn off the fancy formatting, 
and have pg_regress use that flag.



Well, it might not be a bad idea, but adding a feature just to satisfy
the test suite instead of fixing the test suite doesn't feel satisfying.
Is there another use case?

  


Sure, as Tom noted pg_regress probably won't be the only user. There are 
lots of legacy scripts out there that parse psql output, and it should 
be of use to them.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Alvaro Herrera
Andrew Dunstan escribió:
 
 
 Peter Eisentraut wrote:
 On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote:
 Thinking about this some more, ISTM a much better way of
 approaching it would be to provide a flag for psql to turn off
 the fancy formatting, and have pg_regress use that flag.
 
 Well, it might not be a bad idea, but adding a feature just to satisfy
 the test suite instead of fixing the test suite doesn't feel satisfying.
 Is there another use case?
 
 Sure, as Tom noted pg_regress probably won't be the only user. There
 are lots of legacy scripts out there that parse psql output, and it
 should be of use to them.

All scripts I've seen parsing psql output use unaligned, undecorated
mode.  I have yet to see one messing with the |'s.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Robert Haas
On Wed, Sep 30, 2009 at 1:27 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Wed, 2009-09-30 at 12:06 -0400, Robert Haas wrote:
 On Wed, Sep 30, 2009 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Andrew Dunstan and...@dunslane.net writes:
  Thinking about this some more, ISTM a much better way of approaching it
  would be to provide a flag for psql to turn off the fancy formatting,
  and have pg_regress use that flag.
 
  Yeah, that's not a bad idea.  There are likely to be other client
  programs that won't want this behavioral change either.

 I'm surprised there isn't one already.  I would think that any new
 format would default to off.

 Well, this isn't a new format, just an enhancement of an existing
 format.

I disagree.  This patch is about replacing ASCII art constructed using
| and + and replacing it with box-drawing characters.  I think the
chances that this will not look good in one or more of the terminal
emulators that I use from time to time are excellent, especially if
the approach is at all similar to what pstree does on recent versions
of Fedora.  I'm 100% not interested in tracking down some obscure
encoding/locale issue to make it show up right, and I'm 110% not
interested in helping the people who file bug reports to track down
the analagous issues in their crazy, broken environments.

The way it works right now (and has worked for the last 5 years or
more) is reliable, familiar, and, at least IME, bullet-proof.  I don't
even see a good case for changing the default, let alone not providing
a way to retreat to the old behavior.  Considering that this patch is
nothing more than an exceedingly minor piece of window-dressing,
taking even a small risk of breaking things for anyone who as of today
can log into psql and get work done without pain seems like a bad
trade-off to me.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Roger Leigh
On Wed, Sep 30, 2009 at 01:30:17PM -0400, Andrew Dunstan wrote:


 Peter Eisentraut wrote:
 On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote:
   
 Thinking about this some more, ISTM a much better way of approaching 
 it would be to provide a flag for psql to turn off the fancy 
 formatting, and have pg_regress use that flag.
 

 Well, it might not be a bad idea, but adding a feature just to satisfy
 the test suite instead of fixing the test suite doesn't feel satisfying.
 Is there another use case?

 Sure, as Tom noted pg_regress probably won't be the only user. There are  
 lots of legacy scripts out there that parse psql output, and it should  
 be of use to them.

The attached patch implements this feature.  It adds a
--no-pretty-formatting/-G option to psql (naming isn't my forté,
so feel free to change it!).  This is also documented in the
SGML docs, and help text.  Lastly, this option is used when invoking
psql by pg_regress, which results in a working testsuite in a UTF-8
locale.

Hopefully this is OK!  I'll be happy to make any amendments required.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 85e9375..95f1365 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -192,6 +192,19 @@ PostgreSQL documentation
 /varlistentry
 
 varlistentry
+  termoption-G/option/term
+  termoption--no-pretty-formatting/option/term
+  para
+  Only use ASCII characters to draw tables. Without this option,
+  box drawing characters may be used if avilable which will result
+  in nicer output.  However, if compatibility with the output of
+  older psql versions, or psql used in other locales is required, it
+  may be desirable to disable pretty table formatting in these
+  situations.
+  /para
+/varlistentry
+
+varlistentry
   termoption-h replaceable class=parameterhostname/replaceable//term
   termoption--host replaceable class=parameterhostname/replaceable//term
   listitem
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index d498d58..d596ece 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -121,6 +121,8 @@ usage(void)
 	  set field separator (default: \%s\)\n),
 		   DEFAULT_FIELD_SEP);
 	printf(_(  -H, --html   HTML table output mode\n));
+	printf(_(  -G, --no-pretty-formatting\n
+	   only use ASCII characters to format tables\n));
 	printf(_(  -P, --pset=VAR[=ARG] set printing option VAR to ARG (see \\pset command)\n));
 	printf(_(  -R, --record-separator=STRING\n
 	   set record separator (default: newline)\n));
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 41d508e..e01e512 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -2290,9 +2290,10 @@ printTable(const printTableContent *cont, FILE *fout, FILE *flog)
 	const printTextFormat	*text_format = asciiformat;
 
 #if (defined(HAVE_LANGINFO_H)  defined(CODESET))
-	if (pg_strcasecmp(nl_langinfo(CODESET), UTF-8) == 0 ||
-	pg_strcasecmp(nl_langinfo(CODESET), utf8) == 0 ||
-	pg_strcasecmp(nl_langinfo(CODESET), CP65001) == 0)
+	if ((pg_strcasecmp(nl_langinfo(CODESET), UTF-8) == 0 ||
+	 pg_strcasecmp(nl_langinfo(CODESET), utf8) == 0 ||
+	 pg_strcasecmp(nl_langinfo(CODESET), CP65001) == 0) 
+	cont-opt-use_text_format)
 		text_format = utf8format;
 #endif
 
diff --git a/src/bin/psql/print.h b/src/bin/psql/print.h
index ea1dc97..97772a6 100644
--- a/src/bin/psql/print.h
+++ b/src/bin/psql/print.h
@@ -27,6 +27,7 @@ enum printFormat
 typedef struct printTableOpt
 {
 	enum printFormat format;	/* one of the above */
+	bool		use_text_format; /* use text formats for pretty printing */
 	bool		expanded;		/* expanded/vertical output (if supported by
  * output format) */
 	unsigned short int border;	/* Print a border around the table. 0=none,
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 429e5d9..3fdf904 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -125,6 +125,7 @@ main(int argc, char *argv[])
 
 	/* We rely on unmentioned fields of pset.popt to start out 0/false/NULL */
 	pset.popt.topt.format = PRINT_ALIGNED;
+	pset.popt.topt.use_text_format = 1;
 	pset.popt.topt.border = 1;
 	pset.popt.topt.pager = 1;
 	pset.popt.topt.start_table = true;
@@ -317,6 +318,7 @@ parse_psql_options(int argc, char *argv[], struct adhoc_opts * options)
 		{list, no_argument, NULL, 'l'},
 		{log-file, required_argument, NULL, 'L'},
 		{no-readline, no_argument, NULL, 'n'},
+		{no-pretty-formatting, no_argument, NULL, 'G'},
 		{single-transaction, no_argument, NULL, '1'},
 		{output, required_argument, 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Peter Eisentraut
On Wed, 2009-09-30 at 14:02 -0400, Alvaro Herrera wrote:
 All scripts I've seen parsing psql output use unaligned, undecorated
 mode.  I have yet to see one messing with the |'s.

Plus, we would have broken that with the : continuation lines.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 The way it works right now (and has worked for the last 5 years or
 more) is reliable, familiar, and, at least IME, bullet-proof.  I don't
 even see a good case for changing the default, let alone not providing
 a way to retreat to the old behavior.

This is pretty much my opinion as well.  I'd vote for requiring a switch
to turn it on.  Those who really want the new behavior by default can
set it up in their .psqlrc.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On Wed, 2009-09-30 at 14:02 -0400, Alvaro Herrera wrote:
 All scripts I've seen parsing psql output use unaligned, undecorated
 mode.  I have yet to see one messing with the |'s.

 Plus, we would have broken that with the : continuation lines.

Only for data containing embedded newlines, which is not that common.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes:
 On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote:
 Thinking about this some more, ISTM a much better way of approaching 
 it would be to provide a flag for psql to turn off the fancy 
 formatting, and have pg_regress use that flag.

 The attached patch implements this feature.  It adds a
 --no-pretty-formatting/-G option to psql (naming isn't my forté,
 so feel free to change it!).  This is also documented in the
 SGML docs, and help text.  Lastly, this option is used when invoking
 psql by pg_regress, which results in a working testsuite in a UTF-8
 locale.

It would be a good idea to tie this to a psql magic variable (like
ON_ERROR_STOP) so that it could conveniently be set in ~/.psqlrc.
I'm not actually sure that we need a dedicated command line switch
for it, since you could use -v varname instead.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-30 Thread Peter Eisentraut
On Wed, 2009-09-30 at 18:50 -0400, Tom Lane wrote:
 It would be a good idea to tie this to a psql magic variable (like
 ON_ERROR_STOP) so that it could conveniently be set in ~/.psqlrc.
 I'm not actually sure that we need a dedicated command line switch
 for it, since you could use -v varname instead.

Right.  A better yet place would be inside \pset, since that contains
the other formatting options.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Peter Eisentraut
On Mon, 2009-09-28 at 20:49 -0700, Brad T. Sliger wrote:
 During this review I found that `gmake check` will fail when
 LANG=en_US.UTF-8 in the environment.  In this case 
 the patched psql produces UTF8 line art and the tests expect ASCII
 line art.
 
 pg_regress clears LC_ALL by default, but does not clear LANG
 by default.  Please find attached a patch that 
 causes pg_regress to also clear LANG by default.

It probably doesn't matter much, but I think the proper fix would be to
unset LC_CTYPE.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On Mon, 2009-09-28 at 20:49 -0700, Brad T. Sliger wrote:
 pg_regress clears LC_ALL by default, but does not clear LANG
 by default.  Please find attached a patch that 
 causes pg_regress to also clear LANG by default.

 It probably doesn't matter much, but I think the proper fix would be to
 unset LC_CTYPE.

Wouldn't we have to clear *all* of these variables to be sure?

The bigger question is exactly how we expect this stuff to interact with
pg_regress' --no-locale switch.  We already do clear all these variables
when --no-locale is specified.  I am wondering just what --locale is
supposed to do, and whether selectively lobotomizing the LC stuff has
any real use at all.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Roger Leigh
On Tue, Sep 29, 2009 at 12:01:30PM -0400, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  On Mon, 2009-09-28 at 20:49 -0700, Brad T. Sliger wrote:
  pg_regress clears LC_ALL by default, but does not clear LANG
  by default.  Please find attached a patch that 
  causes pg_regress to also clear LANG by default.
 
  It probably doesn't matter much, but I think the proper fix would be to
  unset LC_CTYPE.
 
 Wouldn't we have to clear *all* of these variables to be sure?
 
 The bigger question is exactly how we expect this stuff to interact with
 pg_regress' --no-locale switch.  We already do clear all these variables
 when --no-locale is specified.  I am wondering just what --locale is
 supposed to do, and whether selectively lobotomizing the LC stuff has
 any real use at all.

While this is currently just hypothetical, the code is also making the
assumption that the C locale is ASCII.  However, there is no guarantee
that this is the case.  ASCII is the minimal requirement, and UTF-8 is
a superset (according to SUS/POSIX).

In Debian, we do have plans to introduce a C.UTF-8 locale, and perhaps
eventually move the C locale to using UTF-8 by default, at which point
the tests would again fail because you would still get the UTF-8
formatting despite all of the LC_ etc. munging.  (At that point, we
would have a C.ASCII for compatibility.)


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes:
 In Debian, we do have plans to introduce a C.UTF-8 locale,

Egad, isn't that a contradiction in terms?  C locale means POSIX
behavior and nothing but.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Roger Leigh
On Tue, Sep 29, 2009 at 01:41:27PM -0400, Tom Lane wrote:
 Roger Leigh rle...@codelibre.net writes:
  In Debian, we do have plans to introduce a C.UTF-8 locale,
 
 Egad, isn't that a contradiction in terms?

Not entirely!

 C locale means POSIX behavior and nothing but.

Indeed it does.  However, making LC_CTYPE be UTF-8 rather than
ASCII is both possible and still strictly conforming to the
letter of the standard.  There would be some collation and
other restrictions (digit and other character classes would
be contrained to the ASCII characters compared with other UTF-8
locales). However, any existing programs using ASCII would continue
to function without any changes to their behaviour.  The only
observable change will be that nl_langinfo(CODESET) will return
UTF-8, and it will be valid for programs to use UTF-8 encoded
text in formatted print functions, etc..

I'd just like to stress that this isn't happening anytime soon!
(Well, C.UTF-8 is possible now, it's just not hardcoded into
libc.so.  The intention is to initially provide C.UTF-8 in
addition to C so that programs requiring UTF-8 support have a
locale guaranteed to be present on the system to meet their
needs.)


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Peter Eisentraut
On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  On Mon, 2009-09-28 at 20:49 -0700, Brad T. Sliger wrote:
  pg_regress clears LC_ALL by default, but does not clear LANG
  by default.  Please find attached a patch that 
  causes pg_regress to also clear LANG by default.
 
  It probably doesn't matter much, but I think the proper fix would be to
  unset LC_CTYPE.

(Actually, I should have said, set LC_CTYPE to C.)

 Wouldn't we have to clear *all* of these variables to be sure?

Not really.  The problem here is that psql checks whether the console
uses UTF8, and that is controlled by the LC_CTYPE variable on Unix-ish
systems.

 The bigger question is exactly how we expect this stuff to interact with
 pg_regress' --no-locale switch.  We already do clear all these variables
 when --no-locale is specified.  I am wondering just what --locale is
 supposed to do, and whether selectively lobotomizing the LC stuff has
 any real use at all.

We should do the LANG or LC_CTYPE thing only on the client,
unconditionally.  The --no-locale/--locale options should primarily
determine what the temporary server uses.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote:
 The bigger question is exactly how we expect this stuff to interact with
 pg_regress' --no-locale switch.  We already do clear all these variables
 when --no-locale is specified.  I am wondering just what --locale is
 supposed to do, and whether selectively lobotomizing the LC stuff has
 any real use at all.

 We should do the LANG or LC_CTYPE thing only on the client,
 unconditionally.  The --no-locale/--locale options should primarily
 determine what the temporary server uses.

Well, that seems fairly reasonable, but it's going to require some
refactoring of pg_regress.  The initialize_environment function
determines what happens in both the client and the temp server.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Tom Lane
Roger Leigh rle...@codelibre.net writes:
 C locale means POSIX behavior and nothing but.

 Indeed it does.  However, making LC_CTYPE be UTF-8 rather than
 ASCII is both possible and still strictly conforming to the
 letter of the standard.  There would be some collation and
 other restrictions (digit and other character classes would
 be contrained to the ASCII characters compared with other UTF-8
 locales). However, any existing programs using ASCII would continue
 to function without any changes to their behaviour.  The only
 observable change will be that nl_langinfo(CODESET) will return
 UTF-8, and it will be valid for programs to use UTF-8 encoded
 text in formatted print functions, etc..

I really, really don't believe that that meets either the letter or
the spirit of the C standard, at least not if you are intending to
silently substitute LC_CTYPE=UTF8 when the program has specified
C/POSIX locale.  (If this is just a matter of what the default
LANG environment is, of course you can do anything.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Robert Haas
On Tue, Sep 29, 2009 at 4:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Eisentraut pete...@gmx.net writes:
 On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote:
 The bigger question is exactly how we expect this stuff to interact with
 pg_regress' --no-locale switch.  We already do clear all these variables
 when --no-locale is specified.  I am wondering just what --locale is
 supposed to do, and whether selectively lobotomizing the LC stuff has
 any real use at all.

 We should do the LANG or LC_CTYPE thing only on the client,
 unconditionally.  The --no-locale/--locale options should primarily
 determine what the temporary server uses.

 Well, that seems fairly reasonable, but it's going to require some
 refactoring of pg_regress.  The initialize_environment function
 determines what happens in both the client and the temp server.

This seems to mean that we can't apply this patch, since failing the
regression tests is not an acceptable behavior.  I think that means
that the patch author needs to either do the necessary pg_regress
refactoring or figure out some other solution that is acceptable to
the community.   Assuming that non-trivial pg_regress refactoring is
required, I think we should mark this patch as Returned with Feedback,
because that should really be a separate patch, and it's obviously far
too late to submit new patches to this CommitFest.

Objections?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Alvaro Herrera
Robert Haas escribió:
 On Tue, Sep 29, 2009 at 4:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Peter Eisentraut pete...@gmx.net writes:
  On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote:
  The bigger question is exactly how we expect this stuff to interact with
  pg_regress' --no-locale switch.  We already do clear all these variables
  when --no-locale is specified.  I am wondering just what --locale is
  supposed to do, and whether selectively lobotomizing the LC stuff has
  any real use at all.
 
  We should do the LANG or LC_CTYPE thing only on the client,
  unconditionally.  The --no-locale/--locale options should primarily
  determine what the temporary server uses.
 
  Well, that seems fairly reasonable, but it's going to require some
  refactoring of pg_regress.  The initialize_environment function
  determines what happens in both the client and the temp server.
 
 This seems to mean that we can't apply this patch, since failing the
 regression tests is not an acceptable behavior.  I think that means
 that the patch author needs to either do the necessary pg_regress
 refactoring or figure out some other solution that is acceptable to
 the community.   Assuming that non-trivial pg_regress refactoring is
 required, I think we should mark this patch as Returned with Feedback,
 because that should really be a separate patch, and it's obviously far
 too late to submit new patches to this CommitFest.
 
 Objections?

Does the patch pass regression tests in normal conditions?  If it does,
I see no reason to reject it.  If it fails in --locale only, and even
then only when the given locale is UTF8, which IIRC it's a seldom-used
case, we can see about fixing that separately.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Robert Haas escribió:
 This seems to mean that we can't apply this patch, since failing the
 regression tests is not an acceptable behavior.

 Does the patch pass regression tests in normal conditions?

If you consider that normal means LANG=C in environment, then yeah.
Else not so much.  In particular, every one of the buildfarm machines
that tests a non-C locale environment can be expected to go red.

I think that the required patch might not be a whole lot larger than the
one Brad already submitted, so I'm willing to allow a couple more days
for the author to investigate the issue.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Andrew Dunstan



Alvaro Herrera wrote:

Does the patch pass regression tests in normal conditions?  If it does,
I see no reason to reject it.  If it fails in --locale only, and even
then only when the given locale is UTF8, which IIRC it's a seldom-used
case, we can see about fixing that separately.

  


Numerous buildfarm machines (including one of mine) run the regression 
suite using UTF8. So it's far from seldom-used.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-29 Thread Alvaro Herrera
Tom Lane escribió:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Robert Haas escribi�:
  This seems to mean that we can't apply this patch, since failing the
  regression tests is not an acceptable behavior.
 
  Does the patch pass regression tests in normal conditions?
 
 If you consider that normal means LANG=C in environment, then yeah.
 Else not so much.  In particular, every one of the buildfarm machines
 that tests a non-C locale environment can be expected to go red.

Huh, you're right, it doesn't work at all.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-28 Thread Brad T. Sliger
On Sunday 27 September 2009 19:03:33 Robert Haas wrote:
 On Sun, Sep 27, 2009 at 9:24 PM, Selena Deckelmann

 selenama...@gmail.com wrote:
  Hi!
 
  On Wed, Sep 23, 2009 at 2:16 AM, Roger Leigh rle...@codelibre.net wrote:
  On Fri, Sep 18, 2009 at 11:30:05AM -0700, Selena Deckelmann wrote:
  Brad says:
 
         The patched code compiles without any additional warnings.
  Lint gripes about a trailing ',' in 'typedef enum printTextRule' in
  print.h.  Other additional lint seem to be false positives.  The
  regression tests pass against the new patch.
 
  I've attached a new patch which tidies up those extra commas, plus
  a patch showing the changes from the previous patch.
 
  Great!  Thank you.
 
  Brad -- can you review this patch? Is it ready for a committer?

 Brad already marked it that way on the CommitFest application, so I
 think that probably means yes.  :-)

 ...Robert

I looked at the new (psql-utf8-table-4.patch) patch.  I think the style 
issues are fixed.

During this review I found that `gmake check` will fail when 
LANG=en_US.UTF-8 in the environment.  In this case 
the patched psql produces UTF8 line art and the tests expect ASCII line art.

pg_regress clears LC_ALL by default, but does not clear LANG by 
default.  Please find attached a patch that 
causes pg_regress to also clear LANG by default.

Thanks,

--bts

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index f2f9603..68ba30b 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -697,11 +697,6 @@ initialize_environment(void)
 		unsetenv(LC_MONETARY);
 		unsetenv(LC_NUMERIC);
 		unsetenv(LC_TIME);
-		unsetenv(LANG);
-		/* On Windows the default locale cannot be English, so force it */
-#if defined(WIN32) || defined(__CYGWIN__)
-		putenv(LANG=en);
-#endif
 	}
 
 	/*
@@ -712,8 +707,14 @@ initialize_environment(void)
 	 */
 	unsetenv(LANGUAGE);
 	unsetenv(LC_ALL);
+	unsetenv(LANG);
+	/* On Windows the default locale cannot be English, so force it */
+#if defined(WIN32) || defined(__CYGWIN__)
+	putenv(LANG=en);
+#endif
 	putenv(LC_MESSAGES=C);
 
+
 	/*
 	 * Set multibyte as requested
 	 */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-27 Thread Selena Deckelmann
Hi!

On Wed, Sep 23, 2009 at 2:16 AM, Roger Leigh rle...@codelibre.net wrote:
 On Fri, Sep 18, 2009 at 11:30:05AM -0700, Selena Deckelmann wrote:
 Brad says:

        The patched code compiles without any additional warnings.
 Lint gripes about a trailing ',' in 'typedef enum printTextRule' in
 print.h.  Other additional lint seem to be false positives.  The
 regression tests pass against the new patch.

 I've attached a new patch which tidies up those extra commas, plus
 a patch showing the changes from the previous patch.

Great!  Thank you.

Brad -- can you review this patch? Is it ready for a committer?

Thanks,
-selena


-- 
http://chesnok.com/daily - me
http://endpoint.com - work

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-27 Thread Robert Haas
On Sun, Sep 27, 2009 at 9:24 PM, Selena Deckelmann
selenama...@gmail.com wrote:
 Hi!

 On Wed, Sep 23, 2009 at 2:16 AM, Roger Leigh rle...@codelibre.net wrote:
 On Fri, Sep 18, 2009 at 11:30:05AM -0700, Selena Deckelmann wrote:
 Brad says:

        The patched code compiles without any additional warnings.
 Lint gripes about a trailing ',' in 'typedef enum printTextRule' in
 print.h.  Other additional lint seem to be false positives.  The
 regression tests pass against the new patch.

 I've attached a new patch which tidies up those extra commas, plus
 a patch showing the changes from the previous patch.

 Great!  Thank you.

 Brad -- can you review this patch? Is it ready for a committer?

Brad already marked it that way on the CommitFest application, so I
think that probably means yes.  :-)

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-23 Thread Roger Leigh
On Fri, Sep 18, 2009 at 11:30:05AM -0700, Selena Deckelmann wrote:
 Brad says:
 
The patched code compiles without any additional warnings.
 Lint gripes about a trailing ',' in 'typedef enum printTextRule' in
 print.h.  Other additional lint seem to be false positives.  The
 regression tests pass against the new patch.

I've attached a new patch which tidies up those extra commas, plus
a patch showing the changes from the previous patch.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 7505cd4..41d508e 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -21,6 +21,9 @@
 #endif
 
 #include locale.h
+#ifdef HAVE_LANGINFO_H
+#include langinfo.h
+#endif
 
 #include catalog/pg_type.h
 #include pqsignal.h
@@ -356,38 +359,75 @@ print_unaligned_vertical(const printTableContent *cont, FILE *fout)
 /* Aligned text		*/
 //
 
+static const printTextFormat asciiformat =
+{
+	{
+		{ -, +, +, + },
+		{ -, +, +, + },
+		{ -, +, +, + },
+		{ ,  |, |, | }
+	},
+	:,
+	;,
+	 
+};
+
+static const struct printTextFormat utf8format =
+{
+	{
+	  /* ─, ┌, ┬, ┐ */
+	  { \342\224\200, \342\224\214, \342\224\254, \342\224\220 },
+	  /* ━, ┝, ┿, ┥ */
+	  { \342\224\201, \342\224\235, \342\224\277, \342\224\245 },
+	  /* ─, └, ┴, ┘ */
+	  { \342\224\200, \342\224\224, \342\224\264, \342\224\230 },
+	  /* N/A, │, │, │ */
+	  { , \342\224\202, \342\224\202, \342\224\202 }
+	},
+	  /* ╎ */
+	  \342\225\216,
+	  /* ┊ */
+	  \342\224\212,
+	  /* ╷ */
+	  \342\225\267
+};
 
 /* draw line */
 static void
 _print_horizontal_line(const unsigned int ncolumns, const unsigned int *widths,
-	   unsigned short border, FILE *fout)
+		   unsigned short border, printTextRule pos,
+		   const printTextFormat *format,
+		   FILE *fout)
 {
 	unsigned int i,
 j;
 
+	const printTextLineFormat *lformat = format-lrule[pos];
+
 	if (border == 1)
-		fputc('-', fout);
+		fputs(lformat-hrule, fout);
 	else if (border == 2)
-		fputs(+-, fout);
+		fprintf(fout, %s%s, lformat-leftvrule, lformat-hrule);
 
 	for (i = 0; i  ncolumns; i++)
 	{
 		for (j = 0; j  widths[i]; j++)
-			fputc('-', fout);
+			fputs(lformat-hrule, fout);
 
 		if (i  ncolumns - 1)
 		{
 			if (border == 0)
 fputc(' ', fout);
 			else
-fputs(-+-, fout);
+fprintf(fout, %s%s%s, lformat-hrule,
+	lformat-midvrule, lformat-hrule);
 		}
 	}
 
 	if (border == 2)
-		fputs(-+, fout);
+		fprintf(fout, %s%s, lformat-hrule, lformat-rightvrule);
 	else if (border == 1)
-		fputc('-', fout);
+		fputs(lformat-hrule, fout);
 
 	fputc('\n', fout);
 }
@@ -397,7 +437,8 @@ _print_horizontal_line(const unsigned int ncolumns, const unsigned int *widths,
  *	Print pretty boxes around cells.
  */
 static void
-print_aligned_text(const printTableContent *cont, FILE *fout)
+print_aligned_text(const printTableContent *cont, const printTextFormat *format,
+		   FILE *fout)
 {
 	bool		opt_tuples_only = cont-opt-tuples_only;
 	bool		opt_numeric_locale = cont-opt-numericLocale;
@@ -431,6 +472,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 	int		   *bytes_output;	/* Bytes output for column value */
 	int			output_columns = 0;		/* Width of interactive console */
 	bool		is_pager = false;
+	const printTextLineFormat	*dformat = format-lrule[PRINT_RULE_DATA];
 
 	if (cancel_pressed)
 		return;
@@ -709,7 +751,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 			int			curr_nl_line;
 
 			if (opt_border == 2)
-_print_horizontal_line(col_count, width_wrap, opt_border, fout);
+_print_horizontal_line(col_count, width_wrap, opt_border, PRINT_RULE_TOP, format, fout);
 
 			for (i = 0; i  col_count; i++)
 pg_wcsformat((unsigned char *) cont-headers[i],
@@ -722,7 +764,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 			while (more_col_wrapping)
 			{
 if (opt_border == 2)
-	fprintf(fout, |%c, curr_nl_line ? '+' : ' ');
+	fprintf(fout, %s%c, dformat-leftvrule, curr_nl_line ? '+' : ' ');
 else if (opt_border == 1)
 	fputc(curr_nl_line ? '+' : ' ', fout);
 
@@ -753,19 +795,20 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 		if (opt_border == 0)
 			fputc(curr_nl_line ? '+' : ' ', fout);
 		else
-			fprintf(fout,  |%c, curr_nl_line ? '+' : ' ');
+			fprintf(fout,  %s%c, dformat-midvrule, curr_nl_line ? '+' : ' ');
 	}
 }
 curr_nl_line++;
 
 if (opt_border == 2)
-	fputs( |, fout);
+	fprintf(fout,  %s,
+		dformat-rightvrule);
 else if (opt_border == 1)
 	fputc(' ', fout);
 fputc('\n', fout);
 			}
 
-			_print_horizontal_line(col_count, width_wrap, opt_border, fout);
+			_print_horizontal_line(col_count, width_wrap, opt_border, PRINT_RULE_MIDDLE, 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-09-18 Thread Selena Deckelmann
Hi!

Below is from Brad Slinger.  I'm just forwarding his message so that
the review will be in the thread for the archives. (Sorry, Brad that I
missed this earlier.. I thought you'd already replied to the list for
some reason.)

Brad says:

Please forgive my top posting.  Below is my patch review.

   The purpose of this patch is to improve psql output
readability.   This is accomplished through table formatting
improvements and line art improvements.

   The patch is a unified diff, rather than a context diff.  The
patch applies from the project root with with -p1 set.  Patch(1)
reports with fuzz 1 when patching src/bin/psql/print.h.  The patch
touches two files:

   src/bin/psql/print.h
   src/bin/psql/print.c

   The patch does not include any documentation or tests.
Comments and style are similar to existing code.

   The patched code compiles without any additional warnings.
Lint gripes about a trailing ',' in 'typedef enum printTextRule' in
print.h.  Other additional lint seem to be false positives.  The
regression tests pass against the new patch.

   The patched code installs and runs.  There were no crashes,
although I didn't try very hard.  The ASCII line art works:

   createdb test
   psql test
   \pset border 2
   \pset format wrapped
   SELECT 'short' AS short, lpad('', 1000, 'long') AS long;

   I can trigger the new line art by setting LC_ALL=en_US.UTF-8 in
the environment.

   As far as I can tell, the patch works as advertised.  The patch
does not change the server/backend.  I don't see any performance
problems.


   Thanks,

   --bts

-- 
http://chesnok.com/daily - me
http://endpoint.com - work

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-27 Thread Roger Leigh
On Tue, Aug 25, 2009 at 10:32:50PM -0400, Alvaro Herrera wrote:
 Roger Leigh escribió:
 
  An updated copy of the patch is attached.
 
 Did you give expanded output a look?  (\x)  I find it a bit weird that
 the first line shows a single-pixel wide line but the subsequent ones
 are thicker.

Yes, it's just due to the fact that the middle lines are using a
thicker line character, while the top and bottom lines are thin.
This can easily be changed to be e.g. all thin.

 BTW I think you should also look at multiline fields,
 
  a │   b  
  
 ━━━┿━━━
  4 │ some text
: and some more
: and then some
 (1 filas)
 
 And wrapped:
 
 alvherre=# select * from foo;
  a │ b
   
 ━━━┿
  5 │ En un lugar de la Mancha, de cuyo nombre no quiero acordarme, no ha 
 mucho 
; tiempo que vivía un hidalgo
 (1 fila)

I initially left these exactly the same as for ASCII (the ':' and ';'
usage).  However, it's quite possible to make it use other characters.
We could use the same lines, or two, three or four dashed lines
('╎' and '╏', or ┆' and '┇' or '┊' and '┋').
There are also additional characters such as half-lines
('╶', '╷', '╹' and '╻').  Is this the kind of this you are referring
to?

The wrapping code also appears slightly broken anyway, since
continuation lines don't get '|' printed for subsequent blank columns
on the same line:

# SELECT * FROM testw;
 a │  b  │ c │ kfduagahkjdfghalksdkfhajsdkl 
━━━┿━┿━━━┿━━
 1 │ This is a test stri │   │  
   ; ng
 2 │ sbuild-createchroot │   │
   ;  installs build-ess
   ; entials so your wor
   ; king
 3 │ sbuild-createchroot │   │
   ;  installs build-ess
   ; entials so your wor
   ; king environment sh
   ; ould already be ok.
   ;  You will need only
   ;  to add a few more
   ; packages in the chr
   ; oot, using e.g.
(3 rows)

(this is unchanged from psql behaviour in CVS.)

I also see in the code that under some circumstances (curr_nl_line), a
'+' is used instead of a space when printing table headers, but I
haven't been able to trigger this yet.  We could also use dashed
horizontal rules here.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.


signature.asc
Description: Digital signature


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-27 Thread Alvaro Herrera
Roger Leigh escribió:
 On Tue, Aug 25, 2009 at 10:32:50PM -0400, Alvaro Herrera wrote:
  Roger Leigh escribió:
  
   An updated copy of the patch is attached.
  
  Did you give expanded output a look?  (\x)  I find it a bit weird that
  the first line shows a single-pixel wide line but the subsequent ones
  are thicker.
 
 Yes, it's just due to the fact that the middle lines are using a
 thicker line character, while the top and bottom lines are thin.
 This can easily be changed to be e.g. all thin.

Yeah, I think expanded output should use the same char for all lines.

  BTW I think you should also look at multiline fields,
  
   a │   b
 
  ━━━┿━━━
   4 │ some text
 : and some more
 : and then some
  (1 filas)
  
  And wrapped:
  
  alvherre=# select * from foo;
   a │ b  
  
  ━━━┿
   5 │ En un lugar de la Mancha, de cuyo nombre no quiero acordarme, no ha 
  mucho 
 ; tiempo que vivía un hidalgo
  (1 fila)
 
 I initially left these exactly the same as for ASCII (the ':' and ';'
 usage).  However, it's quite possible to make it use other characters.
 We could use the same lines, or two, three or four dashed lines
 ('╎' and '╏', or ┆' and '┇' or '┊' and '┋').

This works for me (say ╎ for newline-separated strings and ┊ for wrapped
output).

 The wrapping code also appears slightly broken anyway, since
 continuation lines don't get '|' printed for subsequent blank columns
 on the same line:

Yeah ... maybe there's a point to this, or maybe it's just a bug -- I
don't know.

 I also see in the code that under some circumstances (curr_nl_line), a
 '+' is used instead of a space when printing table headers, but I
 haven't been able to trigger this yet.  We could also use dashed
 horizontal rules here.

Hmm, can't say I know what's this about.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-27 Thread Alvaro Herrera
Alvaro Herrera escribió:

  I initially left these exactly the same as for ASCII (the ':' and ';'
  usage).  However, it's quite possible to make it use other characters.
  We could use the same lines, or two, three or four dashed lines
  ('╎' and '╏', or ┆' and '┇' or '┊' and '┋').
 
 This works for me (say ╎ for newline-separated strings and ┊ for wrapped
 output).

So it'd look like this:

alvherre=# select * from foo;
 a │ b  
━━━┿
 1 │ En un lugar de la Mancha, de cuyo nombre no quiero acordarme, no ha mucho 
   ┊ tiempo que vivía un hidalgo de los de lanza en astillero, adarga antigua, 
   ┊ rocín flaco y galgo corredor. Una olla de algo más vaca que carnero, salpi
   ┊ cón las más noches, duelos y quebrantos los sábados, lantejas los viernes,
   ┊  algún palomino de añadidura los domingos, consumían las tres partes de su
   ┊  hacienda. El resto della concluían sayo de velarte, calzas de velludo par
   ┊ a las fiestas, con sus pantuflos de lo mesmo, y los días de entresemana se
   ┊  honraba con su vellorí de lo más fino.  Tenía en su casa una ama que pasa
   ┊ ba de los cuarenta, y una sobrina que no llegaba a los veinte, y un mozo d
   ┊ e campo y plaza, que así ensillaba el rocín como tomaba la podadera.
 2 │ —«Nunca fuera caballero de
   ╎ damas tan bien servido
   ╎ como fuera don Quijote
   ╎ cuando de su aldea vino:
   ╎ doncellas curaban dél;
   ╎ princesas, del su rocino»
(2 filas)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-27 Thread Roger Leigh
On Thu, Aug 27, 2009 at 10:17:32AM -0400, Alvaro Herrera wrote:
 Alvaro Herrera escribió:
 
   I initially left these exactly the same as for ASCII (the ':' and ';'
   usage).  However, it's quite possible to make it use other characters.
   We could use the same lines, or two, three or four dashed lines
   ('╎' and '╏', or ┆' and '┇' or '┊' and '┋').
  
  This works for me (say ╎ for newline-separated strings and ┊ for wrapped
  output).
 
 So it'd look like this:
 
 alvherre=# select * from foo;
  a │ b
   
 ━━━┿
  1 │ En un lugar de la Mancha, de cuyo nombre no quiero acordarme, no ha 
 mucho 
┊ tiempo que vivía un hidalgo de los de lanza en astillero, adarga 
 antigua, 
┊ rocín flaco y galgo corredor. Una olla de algo más vaca que carnero, 
 salpi
┊ cón las más noches, duelos y quebrantos los sábados, lantejas los 
 viernes,
┊  algún palomino de añadidura los domingos, consumían las tres partes de 
 su
┊  hacienda. El resto della concluían sayo de velarte, calzas de velludo 
 par
┊ a las fiestas, con sus pantuflos de lo mesmo, y los días de entresemana 
 se
┊  honraba con su vellorí de lo más fino.  Tenía en su casa una ama que 
 pasa
┊ ba de los cuarenta, y una sobrina que no llegaba a los veinte, y un mozo 
 d
┊ e campo y plaza, que así ensillaba el rocín como tomaba la podadera.
  2 │ —«Nunca fuera caballero de
╎ damas tan bien servido
╎ como fuera don Quijote
╎ cuando de su aldea vino:
╎ doncellas curaban dél;
╎ princesas, del su rocino»
 (2 filas)

The attached patch adds support for the above output in wrapped mode.
It also uses single dashes (half lines) to represent lines which
contain no data, just vertical padding.  Output is unchanged for
ASCII output.

The last bit I'm missing is how to handle column wrapping for aligned
text output.  This is was introduced on 8th May 2008 by Bryce Nesbitt,
committed by Bruce Momjian (git commit f92f424d).  This patch
introduced the wrapped format, but I have not been able to activate
the codepaths that wrap column names in the table header (adding '+').

The logic in print_aligned_text goes like this:

col_count = cont-ncolumns;
...
more_col_wrapping = col_count;
curr_nl_line = 0;
while (more_col_wrapping)
{
  for (i = 0; i  cont-ncolumns; i++)
if (!(this_line + 1)-ptr)   // Surely only true once at end of columns?
  more_col_wrapping--;
  curr_nl_line++;
}

I'm not convinced looking at the code that more_col_wrapping is ever
true except for the first time through the loop, and it's reached
zero after completion of the inner loop, so all of the code used when
curr_nl_line is  0 is never called.  I'm sure I'm interpreting this
incorrectly, but I can't see under what circumstances I would get the
column name header line wrapped over multiple lines.

It may be that I just haven't got a table with column names of the
correct number and lengths to trigger it?


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 7505cd4..f85e507 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -21,6 +21,9 @@
 #endif
 
 #include locale.h
+#ifdef HAVE_LANGINFO_H
+#include langinfo.h
+#endif
 
 #include catalog/pg_type.h
 #include pqsignal.h
@@ -356,38 +359,75 @@ print_unaligned_vertical(const printTableContent *cont, FILE *fout)
 /* Aligned text		*/
 //
 
+static const printTextFormat asciiformat =
+{
+	{
+		{ -, +, +, + },
+		{ -, +, +, + },
+		{ -, +, +, + },
+		{ ,  |, |, | }
+	},
+	:,
+	;,
+	 
+};
+
+static const struct printTextFormat utf8format =
+{
+	{
+	  /* ─, ┌, ┬, ┐ */
+	  { \342\224\200, \342\224\214, \342\224\254, \342\224\220 },
+	  /* ━, ┝, ┿, ┥ */
+	  { \342\224\201, \342\224\235, \342\224\277, \342\224\245 },
+	  /* ─, └, ┴, ┘ */
+	  { \342\224\200, \342\224\224, \342\224\264, \342\224\230 },
+	  /* N/A, │, │, │ */
+	  { , \342\224\202, \342\224\202, \342\224\202 },
+	},
+	  /* ╎ */
+	  \342\225\216,
+	  /* ┊ */
+	  \342\224\212,
+	  /* ╷ */
+	  \342\225\267
+};
 
 /* draw line */
 static void
 _print_horizontal_line(const unsigned int ncolumns, const unsigned int *widths,
-	   unsigned short border, FILE *fout)
+		   unsigned short border, printTextRule pos,
+		   const printTextFormat *format,
+		   FILE *fout)
 {
 	unsigned int i,
 j;
 
+	const printTextLineFormat *lformat = format-lrule[pos];
+
 	if (border == 1)
-		fputc('-', fout);
+		fputs(lformat-hrule, fout);
 	else if (border == 2)
-		fputs(+-, fout);
+		fprintf(fout, %s%s, lformat-leftvrule, lformat-hrule);
 
 	for (i = 0; i  ncolumns; i++)
 	{
 		for (j = 0; j  widths[i]; j++)
-			

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-25 Thread Roger Leigh
On Sun, Aug 23, 2009 at 06:33:49PM -0400, Alvaro Herrera wrote:
 Roger Leigh escribió:
 
  +#if (defined(HAVE_LANGINFO_H)  defined(CODESET))
  +   if (!strcmp(nl_langinfo(CODESET), UTF-8))
  +   text_format = utf8format;
  +#endif
 
 I think you should also try to match to UTF8, and do it in
 case-insensitive manner (pg_strcasecmp), just like chklocale.c.

An updated copy of the patch is attached.  This has the following
changes:

- matches UTF-8, utf8 and CP65001 case insensitively as for chklocale.c
- uses fprintf in place of repeated fputs/fputc calls
- moved hrule from table to line formatting
- moved vrule from table to separate line format

The latter two allow complete specification of all the characters used
to draw.  You can now have different horizonal rule types for the
different line types (demonstrated by making PRINT_RULE_MIDDLE thick
rather than thin).  The same applies to the vertical rules, but since
these are drawn intermixed with data these are a separate line format
in their own right (PRINT_RULE_DATA).

You can if you like make incredibly ugly tables consisting of all sorts
of thin, thick and double ruled lines running horizontally and
vertically at all edges and positions.  The patch as it is keeps this
subtle!  But, a potential future extension to this could be a \pset
command to customise the table style in addition to border printing,
since the code is now there to do this.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 7505cd4..1740109 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -21,6 +21,9 @@
 #endif
 
 #include locale.h
+#ifdef HAVE_LANGINFO_H
+#include langinfo.h
+#endif
 
 #include catalog/pg_type.h
 #include pqsignal.h
@@ -356,38 +359,66 @@ print_unaligned_vertical(const printTableContent *cont, FILE *fout)
 /* Aligned text		*/
 //
 
+static const printTextFormat asciiformat =
+{
+	{
+		{ -, +, +, + },
+		{ -, +, +, + },
+		{ -, +, +, + },
+		{ ,  |, |, | }
+	}
+};
+
+static const struct printTextFormat utf8format =
+{
+	{
+	  /* ─, ┌, ┬, ┐ */
+	  { \342\224\200, \342\224\214, \342\224\254, \342\224\220 },
+	  /* ━, ┝, ┿, ┥ */
+	  { \342\224\201, \342\224\235, \342\224\277, \342\224\245 },
+	  /* ─, └, ┴, ┘ */
+	  { \342\224\200, \342\224\224, \342\224\264, \342\224\230 },
+	  /* N/A, │, │, │ */
+	  { , \342\224\202, \342\224\202, \342\224\202 }
+	},
+};
 
 /* draw line */
 static void
 _print_horizontal_line(const unsigned int ncolumns, const unsigned int *widths,
-	   unsigned short border, FILE *fout)
+		   unsigned short border, printTextRule pos,
+		   const printTextFormat *format,
+		   FILE *fout)
 {
 	unsigned int i,
 j;
 
+	const printTextLineFormat *lformat = format-lrule[pos];
+
 	if (border == 1)
-		fputc('-', fout);
+		fputs(lformat-hrule, fout);
 	else if (border == 2)
-		fputs(+-, fout);
+		fprintf(fout, %s%s, lformat-leftvrule, lformat-hrule);
 
 	for (i = 0; i  ncolumns; i++)
 	{
 		for (j = 0; j  widths[i]; j++)
-			fputc('-', fout);
+			fputs(lformat-hrule, fout);
 
 		if (i  ncolumns - 1)
 		{
 			if (border == 0)
 fputc(' ', fout);
 			else
-fputs(-+-, fout);
+fprintf(fout, %s%s%s, lformat-hrule,
+	lformat-midvrule, lformat-hrule);
 		}
 	}
 
 	if (border == 2)
-		fputs(-+, fout);
+		fprintf(fout, %s%s, lformat-hrule, lformat-rightvrule);
 	else if (border == 1)
-		fputc('-', fout);
+		fputs(lformat-hrule, fout);
 
 	fputc('\n', fout);
 }
@@ -397,7 +428,8 @@ _print_horizontal_line(const unsigned int ncolumns, const unsigned int *widths,
  *	Print pretty boxes around cells.
  */
 static void
-print_aligned_text(const printTableContent *cont, FILE *fout)
+print_aligned_text(const printTableContent *cont, const printTextFormat *format,
+		   FILE *fout)
 {
 	bool		opt_tuples_only = cont-opt-tuples_only;
 	bool		opt_numeric_locale = cont-opt-numericLocale;
@@ -431,6 +463,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 	int		   *bytes_output;	/* Bytes output for column value */
 	int			output_columns = 0;		/* Width of interactive console */
 	bool		is_pager = false;
+	const printTextLineFormat	*dformat = format-lrule[PRINT_RULE_DATA];
 
 	if (cancel_pressed)
 		return;
@@ -709,7 +742,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 			int			curr_nl_line;
 
 			if (opt_border == 2)
-_print_horizontal_line(col_count, width_wrap, opt_border, fout);
+_print_horizontal_line(col_count, width_wrap, opt_border, PRINT_RULE_TOP, format, fout);
 
 			for (i = 0; i  col_count; i++)
 pg_wcsformat((unsigned char *) cont-headers[i],
@@ -722,7 +755,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 			while (more_col_wrapping)
 			{
 if (opt_border == 2)
-		

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-25 Thread Alvaro Herrera
Roger Leigh escribió:

 An updated copy of the patch is attached.

I give it a try.  It looks very reasonable.  We could argue about the
exact chars to use but that can be changed later.

Did you give expanded output a look?  (\x)  I find it a bit weird that
the first line shows a single-pixel wide line but the subsequent ones
are thicker.

BTW I think you should also look at multiline fields,

 a │   b
   
━━━┿━━━
 4 │ some text
   : and some more
   : and then some
(1 filas)

And wrapped:

alvherre=# select * from foo;
 a │ b  
━━━┿
 5 │ En un lugar de la Mancha, de cuyo nombre no quiero acordarme, no ha mucho 
   ; tiempo que vivía un hidalgo
(1 fila)


-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-23 Thread Roger Leigh
On Sun, Aug 23, 2009 at 01:49:02AM +0100, Roger Leigh wrote:
 On Sat, Aug 22, 2009 at 07:42:10PM -0400, Robert Haas wrote:
  On Sat, Aug 22, 2009 at 2:13 PM, Roger Leighrle...@debian.org wrote:
   Further minor cleanups to tweak column alignment in a corner case,
   and to correct indentation to match the rest of the code.
  
  Please read the guidelines here:
  http://wiki.postgresql.org/wiki/Submitting_a_Patch
  
  I don't think it's very helpful to submit a patch intended to do
  basically one thing split up over 10 threads.  The PostgreSQL style is
  heavyweight commits, and I don't believe that there's any reason to
  suppose that someone would want to commit any one of these 9 pieces
  without the other 8.
 
 OK.  I have attached a single patch which combines the nine patches
 into one.  I did read the patch page above, but it didn't mention

A second patch is attached.  This is identical to the previous one,
except that it's re-diffed against the REL8_4_STABLE branch for
anyone who wishes to use this with 8.4.

Again, apologies for the noise with my first set of patches.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 7505cd4..10faeb3 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -21,6 +21,9 @@
 #endif
 
 #include locale.h
+#ifdef HAVE_LANGINFO_H
+#include langinfo.h
+#endif
 
 #include catalog/pg_type.h
 #include pqsignal.h
@@ -356,38 +359,76 @@ print_unaligned_vertical(const printTableContent *cont, FILE *fout)
 /* Aligned text		*/
 //
 
+static const printTextFormat asciiformat =
+{
+	{
+		{ +, +, + },
+		{ +, +, + },
+		{ +, +, + }
+	},
+	-,
+	|
+};
+
+static const struct printTextFormat utf8format =
+{
+	{
+	  /* ┌, ┬, ┐ */
+	  { \342\224\214, \342\224\254, \342\224\220 },
+	  /* ├, ┼, ┤ */
+	  { \342\224\234, \342\224\274, \342\224\244 },
+	  /* └, ┴, ┘ */
+	  { \342\224\224, \342\224\264, \342\224\230 }
+	},
+	\342\224\200, /* ─ */
+	\342\224\202  /* │ */
+};
 
 /* draw line */
 static void
 _print_horizontal_line(const unsigned int ncolumns, const unsigned int *widths,
-	   unsigned short border, FILE *fout)
+		   unsigned short border, printTextRule pos,
+		   const printTextFormat *format,
+		   FILE *fout)
 {
 	unsigned int i,
 j;
 
+	const printTextLineFormat *lformat = format-lrule[pos];
+
 	if (border == 1)
-		fputc('-', fout);
+		fputs(format-hrule, fout);
 	else if (border == 2)
-		fputs(+-, fout);
+	{
+		fputs(lformat-leftvrule, fout);
+		fputs(format-hrule, fout);
+	}
 
 	for (i = 0; i  ncolumns; i++)
 	{
 		for (j = 0; j  widths[i]; j++)
-			fputc('-', fout);
+			fputs(format-hrule, fout);
 
 		if (i  ncolumns - 1)
 		{
 			if (border == 0)
 fputc(' ', fout);
 			else
-fputs(-+-, fout);
+			{
+fputs(format-hrule, fout);
+fputs(lformat-midvrule, fout);
+fputs(format-hrule, fout);
+			}
 		}
 	}
 
 	if (border == 2)
-		fputs(-+, fout);
+	{
+		fputs(format-hrule, fout);
+		fputs(lformat-rightvrule, fout);
+	}
 	else if (border == 1)
-		fputc('-', fout);
+		fputs(format-hrule, fout);
 
 	fputc('\n', fout);
 }
@@ -397,7 +438,8 @@ _print_horizontal_line(const unsigned int ncolumns, const unsigned int *widths,
  *	Print pretty boxes around cells.
  */
 static void
-print_aligned_text(const printTableContent *cont, FILE *fout)
+print_aligned_text(const printTableContent *cont, const printTextFormat *format,
+		   FILE *fout)
 {
 	bool		opt_tuples_only = cont-opt-tuples_only;
 	bool		opt_numeric_locale = cont-opt-numericLocale;
@@ -709,7 +751,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 			int			curr_nl_line;
 
 			if (opt_border == 2)
-_print_horizontal_line(col_count, width_wrap, opt_border, fout);
+_print_horizontal_line(col_count, width_wrap, opt_border, PRINT_RULE_TOP, format, fout);
 
 			for (i = 0; i  col_count; i++)
 pg_wcsformat((unsigned char *) cont-headers[i],
@@ -722,7 +764,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 			while (more_col_wrapping)
 			{
 if (opt_border == 2)
-	fprintf(fout, |%c, curr_nl_line ? '+' : ' ');
+	fprintf(fout, %s%c, format-vrule, curr_nl_line ? '+' : ' ');
 else if (opt_border == 1)
 	fputc(curr_nl_line ? '+' : ' ', fout);
 
@@ -753,19 +795,22 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 		if (opt_border == 0)
 			fputc(curr_nl_line ? '+' : ' ', fout);
 		else
-			fprintf(fout,  |%c, curr_nl_line ? '+' : ' ');
+			fprintf(fout,  %s%c, format-vrule, curr_nl_line ? '+' : ' ');
 	}
 }
 curr_nl_line++;
 
 if (opt_border == 2)
-	fputs( |, fout);
+{
+	fputc(' ', fout);
+	fputs(format-vrule, fout);
+}
 else if (opt_border == 1)
 	fputc(' ', fout);
 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-23 Thread Alvaro Herrera
Roger Leigh escribió:

 +#if (defined(HAVE_LANGINFO_H)  defined(CODESET))
 + if (!strcmp(nl_langinfo(CODESET), UTF-8))
 + text_format = utf8format;
 +#endif

I think you should also try to match to UTF8, and do it in
case-insensitive manner (pg_strcasecmp), just like chklocale.c.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-22 Thread Robert Haas
On Sat, Aug 22, 2009 at 2:13 PM, Roger Leighrle...@debian.org wrote:
 Further minor cleanups to tweak column alignment in a corner case,
 and to correct indentation to match the rest of the code.

Please read the guidelines here:
http://wiki.postgresql.org/wiki/Submitting_a_Patch

I don't think it's very helpful to submit a patch intended to do
basically one thing split up over 10 threads.  The PostgreSQL style is
heavyweight commits, and I don't believe that there's any reason to
suppose that someone would want to commit any one of these 9 pieces
without the other 8.  So I'd suggest you resend it as a single patch
and then add it here.

https://commitfest.postgresql.org/action/commitfest_view/open

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-22 Thread Roger Leigh
On Sat, Aug 22, 2009 at 07:42:10PM -0400, Robert Haas wrote:
 On Sat, Aug 22, 2009 at 2:13 PM, Roger Leighrle...@debian.org wrote:
  Further minor cleanups to tweak column alignment in a corner case,
  and to correct indentation to match the rest of the code.
 
 Please read the guidelines here:
 http://wiki.postgresql.org/wiki/Submitting_a_Patch
 
 I don't think it's very helpful to submit a patch intended to do
 basically one thing split up over 10 threads.  The PostgreSQL style is
 heavyweight commits, and I don't believe that there's any reason to
 suppose that someone would want to commit any one of these 9 pieces
 without the other 8.

OK.  I have attached a single patch which combines the nine patches
into one.  I did read the patch page above, but it didn't mention
anything on patch size, so I split it into incremental logical
changes in order to make it easily reviewable.  6-9 can be viewed
as a whole, since 7-9 are minor fixes to 6.

The other information requested on that page:

project: postgresql
patch name: psql-utf8-table-1.patch
purpose: adds support for Unicode UTF-8 box drawing to the text
  table drawing functions print_aligned_text and
  print_aligned_vertical.
for: discussion or application.  Testing shows identical formatting
  to current code.
branch: HEAD
compiles: yes
platform-specific: POSIX, but contains appropriate preprocessor
  conditionals for portability.  Not tested on non-POSIX platform,
  but conditional taken from elsewhere (port/chklocale.c).  Only
  two lines of the patch are platform-specific, the rest is
  plain portable C.
regression tests: No.  I'm not aware of any psql regression tests
  testing output formatting.
  Manually confirmed output is unchanged with border=0/1/2 and
  expanded=0/1.
use: Use a locale with LC_CTYPE CODESET=UTF-8.
performance: Not tested, but should be minimal effect.  There's
  an additional pointer dereference during string formatting
  in some places.  Temporary copies of data are used in some
  functions e.g. of format-lrule[] for convenience and to potentially
  save on the number of dereferences, though any optimising compiler
  should make this cost zero (all data and pointers are const).
comments:

psql: Abstract table formatting characters used for different line types.

printTextLineFormat describes the characters used to draw vertical lines across
a horizontal rule at the left side, middle and right hand side. These are
included in the formatting for an entire table (printTextFormat).  The
printTextRule enum is used as an offset into the printTextFormat line rules
(lrule), allowing specification of line styles for the top, middle and bottom
horizontal lines in a table.  The other printTextFormat members, hrule and
vrule define the formatting needed to draw horizontal and vertical rules.
Add table formats for ASCII and UTF-8.  Default to using the ASCII table.
However, if a UTF-8 locale codeset is in use, switch to the UTF-8 table.
print_aligned_text and print_aligned_vertical, and their helper fuctions pass
the table formatting and (where applicable) line style information to allow
correct printing of table lines.
Convert print_aligned_text, and its helper function, to use table formatting in
place of hardcoded ASCII characters.  Convert print_aligned_vertical to use
table formatting in place of hardcoded ASCII characters, and add helper
function print_aligned_vertical_line to format individual lines.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 7505cd4..10faeb3 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -21,6 +21,9 @@
 #endif
 
 #include locale.h
+#ifdef HAVE_LANGINFO_H
+#include langinfo.h
+#endif
 
 #include catalog/pg_type.h
 #include pqsignal.h
@@ -356,38 +359,76 @@ print_unaligned_vertical(const printTableContent *cont, FILE *fout)
 /* Aligned text		*/
 //
 
+static const printTextFormat asciiformat =
+{
+	{
+		{ +, +, + },
+		{ +, +, + },
+		{ +, +, + }
+	},
+	-,
+	|
+};
+
+static const struct printTextFormat utf8format =
+{
+	{
+	  /* ┌, ┬, ┐ */
+	  { \342\224\214, \342\224\254, \342\224\220 },
+	  /* ├, ┼, ┤ */
+	  { \342\224\234, \342\224\274, \342\224\244 },
+	  /* └, ┴, ┘ */
+	  { \342\224\224, \342\224\264, \342\224\230 }
+	},
+	\342\224\200, /* ─ */
+	\342\224\202  /* │ */
+};
 
 /* draw line */
 static void
 _print_horizontal_line(const unsigned int ncolumns, const unsigned int *widths,
-	   unsigned short border, FILE *fout)
+		   unsigned short border, printTextRule pos,
+		   const printTextFormat *format,
+		   FILE *fout)
 {
 	unsigned int i,
 j;
 
+	const printTextLineFormat *lformat = format-lrule[pos];
+
 	if (border == 1)
-		fputc('-', fout);
+		fputs(format-hrule, fout);
 	else 

Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-22 Thread Greg Stark
On Sun, Aug 23, 2009 at 1:49 AM, Roger Leighrle...@codelibre.net wrote:
 Please read the guidelines here:
 http://wiki.postgresql.org/wiki/Submitting_a_Patch

 I don't think it's very helpful to submit a patch intended to do
 basically one thing split up over 10 threads.  The PostgreSQL style is
 heavyweight commits, and I don't believe that there's any reason to
 suppose that someone would want to commit any one of these 9 pieces
 without the other 8.

 OK.  I have attached a single patch which combines the nine patches
 into one.  I did read the patch page above, but it didn't mention
 anything on patch size, so I split it into incremental logical
 changes in order to make it easily reviewable.  6-9 can be viewed
 as a whole, since 7-9 are minor fixes to 6.

I don't really have an opinion between one big patch versus multiple
smaller patches. That will come down to the code and whether the
separate patches are easier to read. It is sometimes hard to review a
patch when its separated from the code which needs the new
functionality though.

What I do think is getting my inbox suddenly blatted with a screenful
of patches from one author on one topicis pretty annoying. At the very
least please send all the patches as attachments to a single email. I
find it hard to understand how the linux-kernel list functions if
everyone's patches are all mixed up together and you can't save a
message and have all the related code saved but have to go back to the
list to find emails with similar names.

But the more fundamental failure here is that this email should have
been sent first. You dumped a whole pile of changes on us without ever
asking what we thought of the idea which is not how it's done. It's
actually not as bad as it looked because it's actually not as many
changes as it looked like at first, but still.

Now I happen to like the idea of using UTF8 border characters (I
hacked emacs-w3 to use the terminal font in X to do something similar
bitd so maybe I'm biased by nostalgia though). And the changes are
actually pretty simple despite looking bigger than they really are.
They're actually only a few dozen lines.

I wonder if there's a UTF8 tab character or monospaced space character
or something which would make these tables display better in my mail
client. Currently they're completely messed up because the display
engine doesn't recognize they should be monospaced whereas the usual
output does get recognized. As a result the padding all gets collapsed
and everything is misaligned.

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode UTF-8 table formatting for psql text output

2009-08-22 Thread Andrew Dunstan



Greg Stark wrote:

I don't really have an opinion between one big patch versus multiple
smaller patches. That will come down to the code and whether the
separate patches are easier to read. It is sometimes hard to review a
patch when its separated from the code which needs the new
functionality though.


  


I have an opinion. These patches involved multiple changes to the same 
few files.


Please, don't do that.

In general one patch per feature is what makes life for reviewers and 
committers easier. For big features, sometimes we ask people to break it 
up into several patches, but this wasn't a big feature at all, and there 
is not the slightest need for more than one patch.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers