Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-09 Thread Daniel Verite
Alvaro Herrera wrote:

> I pushed it.

That's awesome, thanks! Also thanks to Pavel who reviewed and helped
continuously when iterating on this feature, and all others who
participed in this thread.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-08 Thread Alvaro Herrera
Daniel Verite wrote:

> ISTM that this could be avoided by erroring out for lack of an
> explicit 3rd column as argument. IOW, we wouldn't assume
> that "no column specified" means "show all columns". 
> 
> About simply ripping out the possibility of having multiple
> columns into cells, it's more radical but if that part turns out to
> be more confusing than useful, I don't have a problem
> with removing it.

Okay, I've ripped that out since I wasn't comfortable with the general
idea.  Once you have two data values for the same cell, the new code
raises an error, indicating the corresponding vertical and horizontal
header values; that way it's easy to spot where the problem is.

I also removed the FETCH_COUNT bits; it didn't make a lot of sense to
me.  Like \gexec, the query is executed to completion when in
\crosstabview regardless of FETCH_COUNT.

> The other case of stringing multiple contents into the same cell
> is when different tuples carry (row,column) duplicates.
> I'm not inclined to disallow that case, I think it would go too far
> in guessing what the user expects.
> My expectation for a viewer is that it displays the results as far as
> possible, whatever they are. 

The reason I made this case throw an error is that we can tweak the
behavior later.  I think separating them with newlines is too cute and
will be unusable when you have values that have embedded newlines; you
can imitate that behavior with string_agg(val, E'\n') as I've done in
the regression tests.  One option for improving it would be to have it
add another record, but that requires shifting the values of all cells
by the number of columns (you can see that if you change the border
options, or in HTML output etc).  We can do that later.

> Also, showing such contents in vertically-growing cells as it
> does now allows the user to spot these easily in the grid when
> they happen to be outliers. I'm seeing it as useful in that case.

It's useful, no doubt.

I pushed it.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-08 Thread Alvaro Herrera
Robert Haas wrote:

> This seems like it might be converging on some sort of consensus, but
> I'm wondering if we shouldn't push it to 9.7, instead of rushing
> decisions that we will later have trouble changing on
> backward-compatibility grounds.

My intention is to commit this afternoon in the next couple of hours,
and only the most basic case is going to be supported, and the rest of
the cases (concatenation of several fields and several rows, etc) are
just going to throw errors; that way, it will be easy to add more
features later as they are agreed upon.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-08 Thread Robert Haas
On Fri, Apr 8, 2016 at 7:23 AM, Daniel Verite  wrote:
> Alvaro Herrera wrote:
>
>> I wonder if the business of appending values of multiple columns
>> separated with spaces is doing us any good.  Why not require that
>> there's a single column in the cell?  If the user wants to put things
>> together, they can use format() or just || the fields together.  What
>> benefit is there to the ' '?  When I ran my first test queries over
>> pg_class I was surprised about this behavior:
>> alvherre=# select * from pg_class
>> alvherre=# \crosstabview relnatts relkind
>
> ISTM that this could be avoided by erroring out for lack of an
> explicit 3rd column as argument. IOW, we wouldn't assume
> that "no column specified" means "show all columns".
>
> About simply ripping out the possibility of having multiple
> columns into cells, it's more radical but if that part turns out to
> be more confusing than useful, I don't have a problem
> with removing it.
>
> The other case of stringing multiple contents into the same cell
> is when different tuples carry (row,column) duplicates.
> I'm not inclined to disallow that case, I think it would go too far
> in guessing what the user expects.
> My expectation for a viewer is that it displays the results as far as
> possible, whatever they are.
> Also, showing such contents in vertically-growing cells as it
> does now allows the user to spot these easily in the grid when
> they happen to be outliers. I'm seeing it as useful in that case.

This seems like it might be converging on some sort of consensus, but
I'm wondering if we shouldn't push it to 9.7, instead of rushing
decisions that we will later have trouble changing on
backward-compatibility grounds.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-08 Thread Daniel Verite
Alvaro Herrera wrote:

> I wonder if the business of appending values of multiple columns
> separated with spaces is doing us any good.  Why not require that
> there's a single column in the cell?  If the user wants to put things
> together, they can use format() or just || the fields together.  What
> benefit is there to the ' '?  When I ran my first test queries over
> pg_class I was surprised about this behavior:
> alvherre=# select * from pg_class
> alvherre=# \crosstabview relnatts relkind

ISTM that this could be avoided by erroring out for lack of an
explicit 3rd column as argument. IOW, we wouldn't assume
that "no column specified" means "show all columns". 

About simply ripping out the possibility of having multiple
columns into cells, it's more radical but if that part turns out to
be more confusing than useful, I don't have a problem
with removing it.

The other case of stringing multiple contents into the same cell
is when different tuples carry (row,column) duplicates.
I'm not inclined to disallow that case, I think it would go too far
in guessing what the user expects.
My expectation for a viewer is that it displays the results as far as
possible, whatever they are. 
Also, showing such contents in vertically-growing cells as it
does now allows the user to spot these easily in the grid when
they happen to be outliers. I'm seeing it as useful in that case.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-08 Thread Daniel Verite
Alvaro Herrera wrote:

> I messed with that code some more, as it looked unnecessarily
> complicated; please see attached and verify that it still behaves
> sanely.  This needs those regression tests you promised.  I tested a few
> cases and it seems good to me.

I've fixed a couple things over v16:
- avoid passing every cell through psprintf, which happened due
  to cont.cells being pre-initialized to empty strings.
- adjusted the loop freeing allocated_cells

and added the regression tests.

Attached is the diff over v16, tested with make check and valgrind.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/psql/crosstabview.c b/src/bin/psql/crosstabview.c
index 0d70e47..a20296e 100644
--- a/src/bin/psql/crosstabview.c
+++ b/src/bin/psql/crosstabview.c
@@ -360,7 +360,6 @@ printCrosstab(const PGresult *results,
 	printQueryOpt popt = pset.popt;
 	printTableContent cont;
 	int			i,
-j,
 rn;
 	char		col_align;
 	int		   *horiz_map;
@@ -414,9 +413,6 @@ printCrosstab(const PGresult *results,
 		cont.cells[k * (num_columns + 1)] = piv_rows[i].name ?
 			piv_rows[i].name :
 			(popt.nullPrint ? popt.nullPrint : "");
-		/* Initialize all cells inside the grid to an empty value */
-		for (j = 0; j < num_columns; j++)
-			cont.cells[k * (num_columns + 1) + j + 1] = "";
 	}
 	cont.cellsadded = num_rows * (num_columns + 1);
 
@@ -506,14 +502,10 @@ printCrosstab(const PGresult *results,
 		 * first column of each row, separate with a newline
 		 * instead.
 		 */
-		if (allocated_cells[idx] != NULL)
-			new_content = psprintf("%s%s%s",
-   allocated_cells[idx],
-   i == 0 ? "\n" : " ",
-   content);
-		else
-			new_content = psprintf("%s", content);
-
+		new_content = psprintf("%s%s%s",
+			   cont.cells[idx],
+			   i == 0 ? "\n" : " ",
+			   content);
 		cont.cells[idx] = new_content;
 		if (allocated_cells[idx] != NULL)
 			pg_free(allocated_cells[idx]);
@@ -528,10 +520,20 @@ printCrosstab(const PGresult *results,
 		}
 	}
 
+	/*
+	 * The non-initialized cells must be set to an empty string for the print
+	 * functions
+	 */
+	for (i = 0; i < cont.cellsadded; i++)
+	{
+		if (cont.cells[i] == NULL)
+			cont.cells[i] = "";
+	}
+
 	printTable(, pset.queryFout, false, pset.logfile);
 	printTableCleanup();
 
-	for (i = 0; i < num_rows * num_columns; i++)
+	for (i = 0; i < (num_rows + 1) * (num_columns + 1); i++)
 	{
 		if (allocated_cells[i] != NULL)
 			pg_free(allocated_cells[i]);
diff --git a/src/test/regress/expected/psql_crosstabview.out b/src/test/regress/expected/psql_crosstabview.out
new file mode 100644
index 000..df3824a
--- /dev/null
+++ b/src/test/regress/expected/psql_crosstabview.out
@@ -0,0 +1,158 @@
+--
+-- tests for \crosstabview
+--
+CREATE VIEW vct_data as 
+select * from ( values
+   ('v1','h2','foo', 3, '2015-04-01'::date),
+   ('v2','h1','bar', 3, '2015-01-02'),
+   ('v1','h0','baz', NULL, '2015-07-12'),
+   ('v0','h4','qux', 4, '2015-07-15'),
+   ('v0','h4','dbl', -3, '2014-12-15'),
+   ('v0',NULL,'qux', 5, '2014-03-15')
+ ) as l(v,h,c,i,d);
+-- 2 columns with implicit 'X' as 3rd column
+select v,i from vct_data order by 1,2 \crosstabview v i
+ v  | -3 | 4 | 5 | 3 |   
+++---+---+---+---
+ v0 | X  | X | X |   | 
+ v1 ||   |   | X | X
+ v2 ||   |   | X | 
+(3 rows)
+
+-- basic usage with 3 columns
+select v, extract(year from d),count(*) from vct_data
+ group by 1, 2 order by 1,2
+ \crosstabview
+ v  | 2014 | 2015 
++--+--
+ v0 |2 |1
+ v1 |  |2
+ v2 |  |1
+(3 rows)
+
+-- ordered months in horizontal header, enclosed column name
+select v, to_char(d,'Mon') as "month name", extract(month from d) as num,
+ count(*) from vct_data  group by 1,2,3 order by 1
+ \crosstabview v "month name":num 4
+ v  | Jan | Mar | Apr | Jul | Dec 
++-+-+-+-+-
+ v0 | |   1 | |   1 |   1
+ v1 | | |   1 |   1 |
+ v2 |   1 | | | |
+(3 rows)
+
+-- combine contents vertically into the same cell (V/H duplicates)
+select v,h,c from vct_data order by 1,2,3
+ \crosstabview 1 2 3
+ v  | h4  | | h0  | h2  | h1  
++-+-+-+-+-
+ v0 | dbl+| qux | | | 
+| qux | | | | 
+ v1 | | | baz | foo | 
+ v2 | | | | | bar
+(3 rows)
+
+-- horizontal ASC order from window function
+select v,h,c, row_number() over(order by h) as r from vct_data order by 1,3,2
+ \crosstabview v h:r c
+ v  | h0  | h1  | h2  | h4  | 
++-+-+-+-+-
+ v0 | | | | dbl+| qux
+| | | | qux | 
+ v1 | baz | | foo | | 
+ v2 | | bar | | | 
+(3 rows)
+
+-- horizontal DESC order from window function
+select v,h,c, row_number() over(order by h DESC) as r from vct_data order by 1,3,2
+ \crosstabview v h:r c
+ v  | | h4  | 

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread Alvaro Herrera
Daniel Verite wrote:

> > * In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')"
> > block (line 497 in the attached), can't we do the same thing by using
> > psprintf?
> 
> In that block, we can't pass a cell contents as a valist and be done with
> that cell, because duplicates of (col value,row value) may happen
> at any iteration of the upper loop over PQntuples(results). Any cell really
> may need reallocation unpredictably until that loop is done, whereas
> psprintf starts by allocating a new buffer unconditionally, so it doesn't
> look
> to me like it could help to simplify that block.

I messed with that code some more, as it looked unnecessarily
complicated; please see attached and verify that it still behaves
sanely.  This needs those regression tests you promised.  I tested a few
cases and it seems good to me.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index d8b9a03..9c5a915 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,113 @@ testdb=
   
 
   
+\crosstabview [
+colV
+colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header
+and the output column colH
+becomes a horizontal header, optionally sorted by ranking data obtained
+from scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the deduplicated values found in
+column colV, in the same
+order as in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the deduplicated values found in
+column colH, in
+the order of appearance in the query results.
+If specified, the optional scolH
+argument refers to a column whose values should be integer numbers
+by which colH will be sorted
+to be positioned in the horizontal header.
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns or the columns listed in
+ colG1[,colG2...]
+ are displayed in the cell, separated between each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed
+ in the cell as if a virtual third column contained that character.
+
+
+
+
+
+ if there are several corresponding rows, the behavior is identical to
+ the case of one row except that the values coming from different rows
+ are stacked vertically, the different source rows being separated by
+ newline characters inside the cell.
+
+
+
+
+
+
+
+  
+
+  
 \d[S+] [ pattern ]
 
 
@@ -4066,6 +4173,47 @@ first  | 4
 second | four
 
 
+
+  When suitable, query results can be shown in a crosstab representation
+  with the \crosstabview command:
+
+testdb= SELECT first, second, first  2 AS gt2 FROM my_table;
+ first | second | ge2 
+---++-
+ 1 | one| f
+ 2 | two| f
+ 3 | three  | t
+ 4 | four   | t
+(4 rows)
+
+testdb= \crosstabview first second
+ first | one | two | three | four 
+---+-+-+---+--
+ 1 | f   | |   | 
+ 2 | | f   |   | 
+ 3 | |  

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread David G. Johnston
On Thu, Apr 7, 2016 at 1:26 PM, Alvaro Herrera 
wrote:

> I wonder if the business of appending values of multiple columns
> separated with spaces is doing us any good.  Why not require that
> there's a single column in the cell?  If the user wants to put things
> together, they can use format() or just || the fields together.  What
> benefit is there to the ' '?  When I ran my first test queries over
> pg_class I was surprised about this behavior:
>
> alvherre=# select * from pg_class
> alvherre=# \crosstabview relnatts relkind
>
>  relnatts |
>r
> |   t
>   |
> i |
>v
>
> --+++--+---
>26 | pg_statistic 11 11397 0 10 0 2619 0 15 380 15 2840 t f p 0 f f
> f f f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null)
> |
>   |
>   |
>30 | pg_type 11 71 0 10 0 0 0 9 358 9 0 t f p 0 t f f f f f f t n
> 540 1 {=r/alvherre} (null)
>   |
> |
> |
> 3 | pg_user_mapping 11 11633 0 10 0 1418 0 0 0 0 0 t f p 0 t f f f
> f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null)
>+| pg_toast_2604 99 11642 0 10 0 2830 0 0 0 0 0 t f p 0 f f f f f f f t
> n 540 1 (null) (null)+| pg_amop_opr_fam_index 11 0 0 10 403 2654 0 5
> 688 0 0 f f p 0 f f f f f f f t n 0 0 (null) (null)+|
> pg_group 11 11661 0 10 0 11660 0 0 0 0 0 f f p 0 f f t f f f f t n 0 0
> {=r/alvherre} (null)
>   +
>
>
> I'm tempted to rip that out, unless you have a reason not to.
>
> In fact, I think even the grouping of values of multiple rows with \n is
> not terribly great either.  Why not just require people to group the
> values beforehand?  You can use "string_agg(column, E'\n')" to get the
> same behavior, plus you can do other things such as sum() etc.
>

​Went and looked at the examples page and at first blush it seems like this
module only understands text.  My specific concern here is dealing with
"numbers-as-text" sorting.​

​As to the question of behavior when multiple columns (and rows?) are
present: ​we need some sort of default do we not.  Nothing is precluding
the user from doing their own aggregates and limiting the select-list.
That said I'm more inclined to error if the input data in not unique on
(v,h).  I feel the possibility of a user query bug going unnoticed in that
scenario is reasonably large since its likely that only some combinations
of duplicates appear.  I'm a bit less tentative regarding column
concatenation since I would expect that nearly every cell involved in the
output would be noticeably affected.  Though, if we are going to protect
against extra rows extending that to protect against extra columns seems
fair.

Another option is, possibly conditioned on the first two columns being the
headers, to only take the column in the third position (or, the first
unassigned column).and display it.

Otherwise if multiple candidate columns are present and none are chosen for
the cell we could just error and force the user to explicitly choose.

The concatenation behavior seems like the least useful default.  I'm
inclined to favor the first unassigned input column.  And never allow (v,h)
is violate uniqueness.

David J.


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Daniel Verite wrote:

> > > * A few examples in docs.  The psql manpage should have at least two new
> > > examples showing the crosstab features, one with the simplest case you
> > > can think of, and another one showing all the features.
> > 
> > Added that in the EXAMPLES section at the very end of the manpage.
> 
> Ok.  Seems a bit too short to me, and I don't like the fact that you
> can't actually run it because you need to create the my_table
> beforehand.  I think it'd be better if you used a VALUES clause there,
> so that the reader can cut'n paste into psql to start to play with the
> feature.

Oh, I noticed now that my_table was created by previous examples.
Nevermind.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread Alvaro Herrera
I wonder if the business of appending values of multiple columns
separated with spaces is doing us any good.  Why not require that
there's a single column in the cell?  If the user wants to put things
together, they can use format() or just || the fields together.  What
benefit is there to the ' '?  When I ran my first test queries over
pg_class I was surprised about this behavior:

alvherre=# select * from pg_class
alvherre=# \crosstabview relnatts relkind

 relnatts | 
  r|
   t
|i  
   |
   v
   
--+++--+---
   26 | pg_statistic 11 11397 0 10 0 2619 0 15 380 15 2840 t f p 0 f f f f 
f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null)  |   

 |  
| 
   30 | pg_type 11 71 0 10 0 0 0 9 358 9 0 t f p 0 t f f f f f f t n 540 1 
{=r/alvherre} (null)|   

 |  
| 
3 | pg_user_mapping 11 11633 0 10 0 1418 0 0 0 0 0 t f p 0 t f f f f f 
f t n 540 1 {alvherre=arwdDxt/alvherre} (null) +| 
pg_toast_2604 99 11642 0 10 0 2830 0 0 0 0 0 t f p 0 f f f f f f f t n 540 1 
(null) (null)+| pg_amop_opr_fam_index 11 0 0 10 403 2654 0 5 688 0 0 f f p 
0 f f f f f f f t n 0 0 (null) (null)+| pg_group 11 11661 0 10 
0 11660 0 0 0 0 0 f f p 0 f f t f f f f t n 0 0 {=r/alvherre} (null)
  +


I'm tempted to rip that out, unless you have a reason not to.

In fact, I think even the grouping of values of multiple rows with \n is
not terribly great either.  Why not just require people to group the
values beforehand?  You can use "string_agg(column, E'\n')" to get the
same behavior, plus you can do other things such as sum() etc.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread Alvaro Herrera
Daniel Verite wrote:

> > regression=# select * from pg_class \crosstabview relnatts 
> > \crosstabview: missing second argument
> > regression-# 
> 
> Fixed. This was modelled after the behavior of:
>  select 1 \badcommand
> but I've changed to mimic what happens with:
>   select 1 \g /some/invalid/path
> the query buffer is not discarded by the error but the prompt
> is ready for a fresh new command.

Works for me.

> > * A few examples in docs.  The psql manpage should have at least two new
> > examples showing the crosstab features, one with the simplest case you
> > can think of, and another one showing all the features.
> 
> Added that in the EXAMPLES section at the very end of the manpage.

Ok.  Seems a bit too short to me, and I don't like the fact that you
can't actually run it because you need to create the my_table
beforehand.  I think it'd be better if you used a VALUES clause there,
so that the reader can cut'n paste into psql to start to play with the
feature.

> > * In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')"
> > block (line 497 in the attached), can't we do the same thing by using
> > psprintf?
> 
> In that block, we can't pass a cell contents as a valist and be done with
> that cell, because duplicates of (col value,row value) may happen
> at any iteration of the upper loop over PQntuples(results). Any cell really
> may need reallocation unpredictably until that loop is done, whereas
> psprintf starts by allocating a new buffer unconditionally, so it doesn't
> look to me like it could help to simplify that block.

I don't know what you mean, but here's what I meant.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/bin/psql/crosstabview.c b/src/bin/psql/crosstabview.c
index b3510b9..0216dae 100644
--- a/src/bin/psql/crosstabview.c
+++ b/src/bin/psql/crosstabview.c
@@ -496,12 +496,12 @@ printCrosstab(const PGresult *results, int num_columns,
 {
 	src_col = colsG[i];
 
-	content = (!PQgetisnull(results, rn, src_col)) ?
+	content = !PQgetisnull(results, rn, src_col) ?
 		PQgetvalue(results, rn, src_col) :
 		(popt.nullPrint ? popt.nullPrint : "");
 }
 
-if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')
+if (cont.cells[idx] != NULL)
 {
 	/*
 	 * Multiple values for the same (row,col) are projected
@@ -509,12 +509,9 @@ printCrosstab(const PGresult *results, int num_columns,
 	 * previous content of the cell from the new value by a
 	 * newline.
 	 */
-	int			content_size;
 	char	   *new_content;
 	int			idx2;
 
-	content_size = strlen(cont.cells[idx]) + 2 + strlen(content) + 1;
-
 	/*
 	 * idx2 is an index into allocated_cells. It differs from
 	 * idx (index into cont.cells), because vertical and
@@ -524,34 +521,17 @@ printCrosstab(const PGresult *results, int num_columns,
 	idx2 = (row_number * num_columns) + col_number;
 
 	if (allocated_cells[idx2] != NULL)
-	{
-		new_content = pg_realloc(allocated_cells[idx2], content_size);
-	}
+		new_content = psprintf("%s%s%s",
+			   allocated_cells[idx2],
+			   i == 0 ? "\n" : " ",
+			   content);
 	else
-	{
-		/*
-		 * At this point, cont.cells[idx] still contains a
-		 * PQgetvalue() pointer.  Just after, it will contain
-		 * a new pointer maintained in allocated_cells[], and
-		 * freed at the end of this function.
-		 */
-		new_content = pg_malloc(content_size);
-		strcpy(new_content, cont.cells[idx]);
-	}
-	cont.cells[idx] = new_content;
-	allocated_cells[idx2] = new_content;
+		new_content = psprintf("%s", content);
 
-	/*
-	 * Contents that are on adjacent columns in the source
-	 * results get separated by one space in the target.
-	 * Contents that are on different rows in the source get
-	 * separated by newlines in the target.
-	 */
-	if (i == 0)
-		strcat(new_content, "\n");
-	else
-		strcat(new_content, " ");
-	strcat(new_content, content);
+	cont.cells[idx] = new_content;
+	if (allocated_cells[idx2] != NULL)
+		pg_free(allocated_cells[idx2]);
+	allocated_cells[idx2] = new_content;
 }
 else
 {

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread Daniel Verite
Alvaro Herrera wrote:


Thanks for looking into that patch!

> regression=# select * from pg_class \crosstabview relnatts 
> \crosstabview: missing second argument
> regression-# 

Fixed. This was modelled after the behavior of:
 select 1 \badcommand
but I've changed to mimic what happens with:
  select 1 \g /some/invalid/path
the query buffer is not discarded by the error but the prompt
is ready for a fresh new command.

> alvherre=# select * from pg_class \crosstabview relnatts relkinda
> Invalid column name: relkinda
> alvherre=# select 1;
> The query must return at least two columns to be shown in crosstab

Definitely a bug. Fixed.

Also fixed a one-off bug with quoted columns: in parseColumnRefs(),
first call to PQmblen(), I wrongly assumed that
PQmblen("", ..) returns 0, whereas in fact it returns 1.

> * A few examples in docs.  The psql manpage should have at least two new
> examples showing the crosstab features, one with the simplest case you
> can think of, and another one showing all the features.

Added that in the EXAMPLES section at the very end of the manpage.

> * Add regression test cases somewhere for the regression database.
> Probably use "FROM tenk1 WHERE hundred < 5", which provides you with 500
> rows, enough for many interesting games.  Make sure to test all the
> provided features.  I would use a new psql.sql file for this.

Looking into regression tests, not yet done.

> * How did you come up with the 1600 value?  Whatever it is, please use a
> #define instead of hardcoding it.

Done with accompanying comment in crosstabview.h 

> * In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')"
> block (line 497 in the attached), can't we do the same thing by using
> psprintf?

In that block, we can't pass a cell contents as a valist and be done with
that cell, because duplicates of (col value,row value) may happen
at any iteration of the upper loop over PQntuples(results). Any cell really
may need reallocation unpredictably until that loop is done, whereas
psprintf starts by allocating a new buffer unconditionally, so it doesn't
look
to me like it could help to simplify that block.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index d8b9a03..1072733 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,113 @@ testdb=
   
 
   
+\crosstabview [
+colV
+colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header
+and the output column colH
+becomes a horizontal header, optionally sorted by ranking data obtained
+from scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the deduplicated values found in
+column colV, in the same
+order as in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the deduplicated values found in
+column colH, in
+the order of appearance in the query results.
+If specified, the optional scolH
+argument refers to a column whose values should be integer numbers
+by which colH will be sorted
+to be positioned in the horizontal header.
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+  

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-06 Thread Alvaro Herrera
I've been looking at this patch.  First thing was to rebase on top of
recent psql code restructuring; second, pgindent; third, reordered the
code in crosstabview.c more sensibly (had to add prototypes).  New
version attached.

Then I looked at the docs to try to figure out exactly how it works.
I'm surprised that there's not a single example added to the psql
manpage.  Please add one.

I then tested it a bit, "kick the tires" so to speak.  I noticed that
error handling is broken.  For instance, observe the query prompt after
the error:

regression=# select * from pg_class \crosstabview relnatts 
\crosstabview: missing second argument
regression-# 

At this point the query buffer contains the query (you can see it with
\e), which seems bogus to me.  The query buffer needs to be reset.
Compare \gexec:
alvherre=# select 1 \gexec
ERROR:  error de sintaxis en o cerca de «1»
LÍNEA 1: 1
 ^
alvherre=# 


Also, using bogus column names as arguments cause state to get all
bogus:

alvherre=# select * from pg_class \crosstabview relnatts relkinda
Invalid column name: relkinda
alvherre=# select 1;
The query must return at least two columns to be shown in crosstab

Note that the second query is not crosstab at all, yet the error message
is entirely bogus.  This one is probably the same bug:

alvherre=# select 'one', 'two';
Invalid column name: relnatts

Apparently, once in that state, not even a successful query crosstab
display resets the state correctly:

alvherre=# select * from pg_class \crosstabview relnatts relkinda
Invalid column name: relkinda
alvherre=# select 'one' as relnatts, 'two' as relkinda \crosstabview
 relnatts | two 
--+-
 one  | X
(1 fila)

alvherre=# select 1;
The query must return at least two columns to be shown in crosstab

Please fix this.


Some additional items:

* A few examples in docs.  The psql manpage should have at least two new
examples showing the crosstab features, one with the simplest case you
can think of, and another one showing all the features.

* Add regression test cases somewhere for the regression database.
Probably use "FROM tenk1 WHERE hundred < 5", which provides you with 500
rows, enough for many interesting games.  Make sure to test all the
provided features.  I would use a new psql.sql file for this.

* How did you come up with the 1600 value?  Whatever it is, please use a
#define instead of hardcoding it.

* In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')"
block (line 497 in the attached), can't we do the same thing by using
psprintf?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index d8b9a03..536141c 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,113 @@ testdb=
   
 
   
+\crosstabview [
+colV
+colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header
+and the output column colH
+becomes a horizontal header, optionally sorted by ranking data obtained
+from scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the deduplicated values found in
+column colV, in the same
+order as in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the deduplicated values found in
+column colH, in
+the order of appearance in the query results.
+If specified, the optional scolH
+argument refers to a column whose values should be integer numbers
+by which colH will be sorted
+to be positioned in the horizontal header.
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+  

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-21 Thread Alvaro Herrera
Robert Haas wrote:
> On Sun, Mar 20, 2016 at 5:27 PM, Pavel Stehule  
> wrote:
> > From my perspective, it is ready for commiter. Daniel solved the most big
> > issues.
> 
> OK, so that brings us back to: is there any committer who likes this
> enough to want to look at committing it?  My view hasn't changed much
> since 
> http://www.postgresql.org/message-id/ca+tgmoz4yaduq9j8xtgrbh868jh2nj_nw_qgkxb32cedsvt...@mail.gmail.com

I volunteer for that, but it'll be a few days so if anyone else is
interested, feel free.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-21 Thread Robert Haas
On Sun, Mar 20, 2016 at 5:27 PM, Pavel Stehule  wrote:
> From my perspective, it is ready for commiter. Daniel solved the most big
> issues.

OK, so that brings us back to: is there any committer who likes this
enough to want to look at committing it?  My view hasn't changed much
since 
http://www.postgresql.org/message-id/ca+tgmoz4yaduq9j8xtgrbh868jh2nj_nw_qgkxb32cedsvt...@mail.gmail.com

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-20 Thread Pavel Stehule
Hi

2016-03-19 16:31 GMT+01:00 Alvaro Herrera :

> Pavel Stehule wrote:
>
> > Can I do review?
>
> Of course.
>

I did review of last patch. I had to do small changes to run the code due
last Tom's changes in psql. Updated patch is attached.

The last changes in this patch are two:

1. Remove strange server side sorting
2. Cleaning/reducing interface

Other code is +/- without changes. There was lot of discussion in this
thread, I would not to repeat it.

I'll comment the changes:

@1 using server side sorting was really generic, but strange. Now, the
crosstabview works without it without any significant functionality
degradation.

@2 interface is minimalist - but good enough - I am thinking so it is good
start point. I was able to run my examples without problems. The previous
API was more comfortable - "+","-" symbols allows to specify order quickly,
but without a agreement we can live without this feature. Now, a order of
data is controlled fully by SQL. crosstabview does data visualization only.
I have not any objection to this last design. It is reduced to minimum, but
still it works well.

* All regress tests passed
* A code is well and well commented
* No new warnings or compilation issues
* Documentation is clean

I have two minor notes, can be fixed simply, if we accept this last design:

1. can be nice if documentation will contains one example
2. some regress tests

>From my perspective, it is ready for commiter. Daniel solved the most big
issues.

Regards

Pavel


> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
new file mode 100644
index 8a85804..da0621b
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** testdb=
*** 990,995 
--- 990,1102 

  

+ \crosstabview [
+ colV
+ colH
+ [:scolH]
+ [colG1[,colG2...]]
+ ] 
+ 
+ 
+ Execute the current query buffer (like \g) and 
shows
+ the results inside a crosstab grid.
+ The output column colV
+ becomes a vertical header
+ and the output column colH
+ becomes a horizontal header, optionally sorted by ranking data 
obtained
+ from scolH.
+ 
+ colG1[,colG2...]
+ is the list of output columns to project into the grid.
+ By default, all output columns of the query except 
+ colV and
+ colH
+ are included in this list.
+ 
+ 
+ 
+ All columns can be refered to by their position (starting at 1), or by
+ their name. Normal case folding and quoting rules apply on column
+ names. By default,
+ colV corresponds to 
column 1
+ and colH to column 2.
+ A query having only one output column cannot be viewed in crosstab, 
and
+ colH must differ from
+ colV.
+ 
+ 
+ 
+ The vertical header, displayed as the leftmost column,
+ contains the deduplicated values found in
+ column colV, in the same
+ order as in the query results.
+ 
+ 
+ The horizontal header, displayed as the first row,
+ contains the deduplicated values found in
+ column colH, in
+ the order of appearance in the query results.
+ If specified, the optional scolH
+ argument refers to a column whose values should be integer numbers
+ by which colH will be 
sorted
+ to be positioned in the horizontal header.
+ 
+ 
+ 
+ Inside the crosstab grid,
+ given a query output with N columns
+ (including colV and
+ colH),
+ for each distinct value x of
+ colH
+ and each distinct value y of
+ colV,
+ the contents of a cell located at the intersection
+ (x,y) is determined by these rules:
+ 
+ 
+ 
+  if there is no corresponding row in the query results such that the
+  value for colH
+  is x and the value
+  for colV
+  is y, the cell is empty.
+ 
+ 
+ 
+ 
+ 
+  if there is exactly one row such that the value
+  for colH
+  is x and the value
+  for colV
+  is y, then the N-2 other
+  columns or the columns listed in
+  colG1[,colG2...]
+  are displayed in the cell, separated between each other by
+  a space character if needed.
+ 
+  If N=2, the letter X is 
displayed
+  in the cell as if a virtual third column contained that character.
+ 
+ 
+ 
+ 
+ 
+  if there are several corresponding rows, the behavior is identical to
+  the case of one row except that 

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-19 Thread Alvaro Herrera
Pavel Stehule wrote:
 
> Can I do review?

Of course.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-19 Thread Pavel Stehule
2016-03-19 15:45 GMT+01:00 Robert Haas :

> On Mon, Mar 14, 2016 at 2:55 PM, Robert Haas 
> wrote:
> > On Sat, Mar 12, 2016 at 10:34 AM, Daniel Verite 
> wrote:
> >>> But worse than either of  those things, there is no real
> >>> agreement on what the overall design of this feature
> >>> should be.
> >>
> >> The part in the design that raised concerns upthread is
> >> essentially how headers sorting is exposed to the user and
> >> implemented.
> >>
> >> As suggested in [1], I've made some drastic changes in the
> >> attached patch to take the comments (from Dean R., Tom L.)
> >> into account.
> >> [ ... lengthy explanation ... ]
> >> - also NULLs are no longer excluded from headers, per Peter E.
> >>   comment in [2].
> >
> > Dean, Tom, Peter, what do you think of the new version?
>
> Is anyone up for re-reviewing this?  If not, I think we're going to
> have to reject this for lack of interest.
>

Can I do review?

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-19 Thread Robert Haas
On Mon, Mar 14, 2016 at 2:55 PM, Robert Haas  wrote:
> On Sat, Mar 12, 2016 at 10:34 AM, Daniel Verite  
> wrote:
>>> But worse than either of  those things, there is no real
>>> agreement on what the overall design of this feature
>>> should be.
>>
>> The part in the design that raised concerns upthread is
>> essentially how headers sorting is exposed to the user and
>> implemented.
>>
>> As suggested in [1], I've made some drastic changes in the
>> attached patch to take the comments (from Dean R., Tom L.)
>> into account.
>> [ ... lengthy explanation ... ]
>> - also NULLs are no longer excluded from headers, per Peter E.
>>   comment in [2].
>
> Dean, Tom, Peter, what do you think of the new version?

Is anyone up for re-reviewing this?  If not, I think we're going to
have to reject this for lack of interest.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-14 Thread Robert Haas
On Sat, Mar 12, 2016 at 10:34 AM, Daniel Verite  wrote:
>> But worse than either of  those things, there is no real
>> agreement on what the overall design of this feature
>> should be.
>
> The part in the design that raised concerns upthread is
> essentially how headers sorting is exposed to the user and
> implemented.
>
> As suggested in [1], I've made some drastic changes in the
> attached patch to take the comments (from Dean R., Tom L.)
> into account.
> [ ... lengthy explanation ... ]
> - also NULLs are no longer excluded from headers, per Peter E.
>   comment in [2].

Dean, Tom, Peter, what do you think of the new version?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-14 Thread Daniel Verite
Jim Nasby wrote:

> Ultimately I'd really like some way to remove/reduce the restriction of 
> result set definitions needing to be determined at plan time. That would 
> open the door for server-side crosstab/pivot as well a a host of other 
> things (such as dynamically turning a hstore/json/xml field into a 
> recordset).

> Ultimately I'd really like some way to remove/reduce the restriction of 
> result set definitions needing to be determined at plan time. That would 
> open the door for server-side crosstab/pivot as well a a host of other 
> things (such as dynamically turning a hstore/json/xml field into a 
> recordset).

That would go against a basic expectation of prepared statements, the
fact that queries can be parsed/prepared without any part of them
being executed.

For a dynamic pivot, but probably also for the other examples you
have in mind, the SQL engine wouldn't be able to determine the output
columns without executing a least a subselect to look inside some
table(s).

I suspect that the implications of this would be so far reaching and
problematic that it will just not happen.

It seems to me that a dynamic pivot will always consist of
two SQL queries that can never be combined into one,
unless using a workaround à la Oracle, which encapsulates the
entire dynamic resultset into an XML blob as output.
The problem here being that the client-side tools
that people routinely use are not equipped to process it anyway;
at least that's what  I find by anecdotal evidence for instance in:
https://community.oracle.com/thread/2133154?tstart=0
 or
http://stackoverflow.com/questions/19298424
 or
https://community.oracle.com/thread/2388982?tstart=0


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-14 Thread Alvaro Herrera
Jim Nasby wrote:
> On 3/13/16 12:48 AM, Pavel Stehule wrote:
> >crosstabview is really visualization tool. **But now, there are not any
> >other tool available from terminal.** So this can be significant help to
> >all people who would to use this functionality.
> 
> Not just the terminal either. Offhand I'm not aware of *any* fairly simple
> tool that provides crosstab. There's a bunch of complicated/expensive BI
> tools that do, but unless you've gone through the trouble of getting one of
> those setup you're currently pretty stuck.

I'm definitely +1 for this feature in psql also.

Some years ago we had a discussion about splitting psql in two parts, a
bare-bones one which would help script-writing and another one with
fancy features; we decided to keep one tool to rule them all and made
the implicit decision that we would grow exotic, sophisticated features
into psql.  ISTM that this patch is going in that direction.

> Ultimately I'd really like some way to remove/reduce the restriction of
> result set definitions needing to be determined at plan time. That would
> open the door for server-side crosstab/pivot as well a a host of other
> things (such as dynamically turning a hstore/json/xml field into a
> recordset).

That seems so far down the road that I don't think it should block the
psql feature being proposed in this thread, but yes I would like that
one too.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-13 Thread Jim Nasby

On 3/13/16 12:48 AM, Pavel Stehule wrote:

crosstabview is really visualization tool. **But now, there are not any
other tool available from terminal.** So this can be significant help to
all people who would to use this functionality.


Not just the terminal either. Offhand I'm not aware of *any* fairly 
simple tool that provides crosstab. There's a bunch of 
complicated/expensive BI tools that do, but unless you've gone through 
the trouble of getting one of those setup you're currently pretty stuck.


Ultimately I'd really like some way to remove/reduce the restriction of 
result set definitions needing to be determined at plan time. That would 
open the door for server-side crosstab/pivot as well a a host of other 
things (such as dynamically turning a hstore/json/xml field into a 
recordset).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-12 Thread Pavel Stehule
2016-03-11 14:49 GMT+01:00 Robert Haas :

> On Thu, Feb 18, 2016 at 9:23 AM, Daniel Verite 
> wrote:
> > Dean Rasheed wrote:
> >
> >> If I want to sort the rows coming out of a query, my first thought
> >> is always going to be to add/adjust the query's ORDER BY clause, not
> >> use some weird +/- psql syntax.
> >
> > About the vertical sort, I agree on all your points.
> > It's best to rely on ORDER BY for all the reasons mentioned,
> > as opposed to a separate sort in a second step.
> >
> > But you're considering the case when a user is designing
> > or adapting a query for the purpose of crosstab
> > viewing. As mentioned in my previous reply (about the
> > methods to achieve horizontal sort), that scenario is not really
> > what motivates the feature in the first place.
> >
> > If removing that sort option is required to move forward
> > with the patch because it's controversial, so be it,
> > but overall I don't see this as a benefit for the end user,
> > it's just an option.
>
> Discussion on this patch seems to have died off.  I'm probably not
> going to win any popularity contests for saying this, but I think we
> should reject this patch.  I don't feel like this is really a psql
> feature: it's a powerful data visualization tool which we're proposing
> to jam into psql.  I don't think that's psql's purpose.  I also think
> it's quite possible that there could be an unbounded number of
> slightly different things that people want here, and if we take this
> one and a couple more, the code for these individual features could
> come to be larger than all of psql, even though probably 95% of psql
> users would never use any of those.
>

crosstabview is really visualization tool. **But now, there are not any
other tool available from terminal.** So this can be significant help to
all people who would to use this functionality.

The psql has lot of features for 5% users. Currently it is famous not as
"bloated software" but like most comfortable sql console on the world. The
implementation of crosstabview is not complex and with last Daniel's
modification the complexity is less.

The crosstabview is not 100% equal to ANSI SQL PIVOT clause. The ANSI SQL
command is much more rigid (it is one stage statement with predefined
columns), so argument of duplicate implementation one things is not valid.
Probably we would not implement non ANSI SQL feature on server.

Regards

Pavel


>
> Now, that having been said, if other people want this feature to go in
> and are willing to do the work to get it in, I've said my piece and
> won't complain further.  There are a couple of committers who have
> taken positive interest in this thread, so that's good.  However,
> there are also a couple of committers who have expressed doubts
> similar to mine, so that's not so good.  But worse than either of
> those things, there is no real agreement on what the overall design of
> this feature should be.  Everybody wants something a little different,
> for different reasons.  If we can't come to an agreement, more or less
> immediately, on what to try to get into 9.6, then this can't go into
> this release.  Whether it should go into a future release is a
> question we can leave for another time.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-12 Thread Daniel Verite
Robert Haas wrote:

> But worse than either of  those things, there is no real
> agreement on what the overall design of this feature
> should be.

The part in the design that raised concerns upthread is
essentially how headers sorting is exposed to the user and
implemented.

As suggested in [1], I've made some drastic changes in the
attached patch to take the comments (from Dean R., Tom L.)
into account. The idea is to limit to the bare minimum
the involvement of psql in sorting:

- the +/- syntax goes away

- the possibility of post-sorting the values through a backdoor
  query goes away too, for both headers.

- the vertical order of the crosstab view is now driven solely by the
  order  in the query

- the order of the horizontal header can be optionally specified
  by a column expected to contain an integer, with the syntax
  \crosstabview colv colh:scolh [other cols]
  which means "colh" will be sorted by "scolh".
  It still defaults to whatever order "colh" comes in from the results

  Concerning the optional "scolh", there are cases where it might pre-exist
  naturally, such as a month number going in pair with a month name.
  In other cases,  a user may add it as a kind of "synthetic column"
  by way of a window function, for example:
SELECT ...other columns...,
   (row_number() over(order by something [order options]) as scolh
   FROM...
   Only the relative order of scolh values is taken into account, the value
itself
   has no meaning for crosstabview.

- also NULLs are no longer excluded from headers, per Peter E.
  comment in [2].


[1]
http://www.postgresql.org/message-id/3d513263-104b-41e3-b1c7-4ad4bd99c491@mm

[2] http://www.postgresql.org/message-id/56c4e344.6070...@gmx.net


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 8a85804..da0621b 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,113 @@ testdb=
   
 
   
+\crosstabview [
+colV
+colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header
+and the output column colH
+becomes a horizontal header, optionally sorted by ranking data obtained
+from scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the deduplicated values found in
+column colV, in the same
+order as in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the deduplicated values found in
+column colH, in
+the order of appearance in the query results.
+If specified, the optional scolH
+argument refers to a column whose values should be integer numbers
+by which colH will be sorted
+to be positioned in the horizontal header.
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns or the columns listed in
+ colG1[,colG2...]
+ are displayed in the cell, separated between each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed
+ in the cell as if a virtual third column contained that character.
+
+
+
+
+
+ if there are several corresponding rows, the behavior is identical to
+  

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-11 Thread Robert Haas
On Thu, Feb 18, 2016 at 9:23 AM, Daniel Verite  wrote:
> Dean Rasheed wrote:
>
>> If I want to sort the rows coming out of a query, my first thought
>> is always going to be to add/adjust the query's ORDER BY clause, not
>> use some weird +/- psql syntax.
>
> About the vertical sort, I agree on all your points.
> It's best to rely on ORDER BY for all the reasons mentioned,
> as opposed to a separate sort in a second step.
>
> But you're considering the case when a user is designing
> or adapting a query for the purpose of crosstab
> viewing. As mentioned in my previous reply (about the
> methods to achieve horizontal sort), that scenario is not really
> what motivates the feature in the first place.
>
> If removing that sort option is required to move forward
> with the patch because it's controversial, so be it,
> but overall I don't see this as a benefit for the end user,
> it's just an option.

Discussion on this patch seems to have died off.  I'm probably not
going to win any popularity contests for saying this, but I think we
should reject this patch.  I don't feel like this is really a psql
feature: it's a powerful data visualization tool which we're proposing
to jam into psql.  I don't think that's psql's purpose.  I also think
it's quite possible that there could be an unbounded number of
slightly different things that people want here, and if we take this
one and a couple more, the code for these individual features could
come to be larger than all of psql, even though probably 95% of psql
users would never use any of those.

Now, that having been said, if other people want this feature to go in
and are willing to do the work to get it in, I've said my piece and
won't complain further.  There are a couple of committers who have
taken positive interest in this thread, so that's good.  However,
there are also a couple of committers who have expressed doubts
similar to mine, so that's not so good.  But worse than either of
those things, there is no real agreement on what the overall design of
this feature should be.  Everybody wants something a little different,
for different reasons.  If we can't come to an agreement, more or less
immediately, on what to try to get into 9.6, then this can't go into
this release.  Whether it should go into a future release is a
question we can leave for another time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-18 Thread Daniel Verite
Dean Rasheed wrote:

> If I want to sort the rows coming out of a query, my first thought
> is always going to be to add/adjust the query's ORDER BY clause, not
> use some weird +/- psql syntax.

About the vertical sort, I agree on all your points.
It's best to rely on ORDER BY for all the reasons mentioned,
as opposed to a separate sort in a second step.

But you're considering the case when a user is designing
or adapting a query for the purpose of crosstab
viewing. As mentioned in my previous reply (about the
methods to achieve horizontal sort), that scenario is not really
what motivates the feature in the first place.

If removing that sort option is required to move forward
with the patch because it's controversial, so be it,
but overall I don't see this as a benefit for the end user,
it's just an option.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-18 Thread Daniel Verite
Peter Eisentraut wrote:

> On 2/9/16 11:21 AM, Daniel Verite wrote:
> > Note that NULL values in the column that pivots are discarded
> > by \crosstabview, because NULL as the name of a column does not
> > make sense.
> 
> Why not?
> 
> All you're doing is printing it out, and psql is quite capable of
> printing a null value.

Initially it's by analogy with the crosstab SRF, but it's true
that the same principle does not have to apply to crosstabview.

The code could set in the header whatever text "pset null" is set to,
at the place where a pivoted NULL would be supposed to go
if it was not filtered out in the first place.

I'll consider implementing that change if there's no objection.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-18 Thread Daniel Verite
Daniel Verite wrote:

> > >   ORDER BY name
> > > \crosstabview cols = (select to_char(d, 'Mon') from
> > > generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
> > 
> > My concern with that is that often you don't know what the columns will 
> > be, because you don't know what exact data the query will produce. So to 
> > use this syntax you'd have to re-create a huge chunk of the original 
> > query. :(
> 
> Also, if that additional query refers to tables, it should be executed
> with the same data visibility as the main query. Doesn't that mean
> that both queries should happen within the same repeatable
> read transaction?
> 
> Another  impractical aspect of this approach is that a
> meta-command invocation in psql must fit on a single line, so
> queries containing newlines are not acceptable as argument.
> This problem exists with "\copy (select...) to ..."  already.

Thinking more about that, it occurs to me that if the sort must come
from a user-supplied bit of SQL, it would be simpler to just direct the
user to submit it in the main query, in an additional dedicated column.

For instance, to get a specific, separate order on "h",
let the user change this:

  SELECT v, h, c FROM v_data ORDER BY v;

into that:

  SELECT v, h, row_number() over(order by h) as hn, c
   FROM v_data ORDER BY v;

then with a relatively simple modification to the patch,
this invocation:

 \crosstabview v h:hn c

would display "h" in the horizontal header ordered by "hn".

ISTM this handles two objections raised upthread:

1. The ORDER BY inside OVER() can be augmented with additional
clauses such as lc_collate, desc, nulls last, etc... contrary to
the controversed "+/-" syntax.

2. a post-sort "backdoor" query is no longer necessary.

The drawback for me is that this change doesn't play out with
my original scenario for the command, which is to give the ability to
scrutinize query results in crosstab mode, playing with variations on
what column is pivoted and how headers for both directions get sorted,
while ideally not changing _at all_ the original query in the query
buffer, but just invoking  successive \crosstabview [args] commands
with varying arguments.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Jim Nasby

On 2/17/16 9:03 AM, Dean Rasheed wrote:

I'm not totally opposed to specifying a column sort order in psql, and
perhaps there's a way to support both 'cols' and 'col_order' options
in psql, since there are different situations where one or the other
might be more useful.


Yeah. If there was some magic way to reference the underlying data with 
your syntax it probably wouldn't be that bad. AIUI normally we're just 
dumping data into a Portal and there's no option to read back from it, 
but if the query results were first put in a tuplestore then I suspect 
it wouldn't be that hard to query against it and produce another result set.



What I am opposed to is specifying the row order in psql, because IMO
that's something that should be done entirely in the SQL query.


+1
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Peter Eisentraut
On 2/9/16 11:21 AM, Daniel Verite wrote:
> Note that NULL values in the column that pivots are discarded
> by \crosstabview, because NULL as the name of a column does not
> make sense.

Why not?

All you're doing is printing it out, and psql is quite capable of
printing a null value.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Daniel Verite
Jim Nasby wrote:

> >   ORDER BY name
> > \crosstabview cols = (select to_char(d, 'Mon') from
> > generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
> 
> My concern with that is that often you don't know what the columns will 
> be, because you don't know what exact data the query will produce. So to 
> use this syntax you'd have to re-create a huge chunk of the original 
> query. :(

Also, if that additional query refers to tables, it should be executed
with the same data visibility as the main query. Doesn't that mean
that both queries should happen within the same repeatable
read transaction?

Another  impractical aspect of this approach is that a
meta-command invocation in psql must fit on a single line, so
queries containing newlines are not acceptable as argument.
This problem exists with "\copy (select...) to ..."  already.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Dean Rasheed
On 17 February 2016 at 02:32, Jim Nasby  wrote:
> On 2/11/16 4:21 AM, Dean Rasheed wrote:
>>
>> Thinking about this some more though, perhaps*sorting*  the columns is
>> the wrong way to be thinking about it. Perhaps a better approach would
>> be to allow the columns to be*listed*  (optionally, using a separate
>> query). Something like the following (don't get too hung up on the
>> syntax):
>>
>> SELECT name,
>> to_char(date, 'Mon') AS month,
>> sum(amount) AS amount
>>   FROM invoices
>>   GROUP BY 1,2
>>   ORDER BY name
>> \crosstabview cols = (select to_char(d, 'Mon') from
>> generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
>
>
> My concern with that is that often you don't know what the columns will be,
> because you don't know what exact data the query will produce. So to use
> this syntax you'd have to re-create a huge chunk of the original query. :(
>

Yeah, that's a reasonable concern.

On the flip side, one of the advantages of the above syntax is that
you have absolute control over the columns, whereas with the
sort-based syntax you might find some columns missing (e.g., if there
were no invoices in August) and that could lead to confusion parsing
the results.

I'm not totally opposed to specifying a column sort order in psql, and
perhaps there's a way to support both 'cols' and 'col_order' options
in psql, since there are different situations where one or the other
might be more useful.

What I am opposed to is specifying the row order in psql, because IMO
that's something that should be done entirely in the SQL query.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Dean Rasheed
On 15 February 2016 at 14:08, Daniel Verite  wrote:
> Dean Rasheed wrote:
>
>> My biggest problem is with the sorting, for all the reasons discussed
>> above. There is absolutely no reason for \crosstabview to be
>> re-sorting rows -- they should just be left in the original query
>> result order
>
> Normal top-down display:
>
> select v,to_char(d,'Mon') as m, c  from v_data order by d;
>
>  v  |  m  |  c
> +-+-
>  v2 | Jan | bar
>  v1 | Apr | foo
>  v1 | Jul | baz
>  v0 | Jul | qux
>
> At this point, it seems to me that it's perfectly reasonable for our user
> to expect the possibility of sorting additionally by "v" , without
> changing the query and without changing the order of the horizontal
> header:
>
>  \crosstabview +v m c
>
>  v  | Jan | Apr | Jul
> +-+-+-
>  v0 | | | qux
>  v1 | | foo | baz
>  v2 | bar | |
>

I don't find that example particularly compelling. If I want to sort
the rows coming out of a query, my first thought is always going to be
to add/adjust the query's ORDER BY clause, not use some weird +/- psql
syntax.

The crux of the problem here is that in a pivoted query resultset SQL
can be used to control the order of the rows or the columns, but not
both at the same time. IMO it is more natural to use SQL to control
the order of the rows. The columns are the result of the psql
pivoting, so it's reasonable to control them via psql options.

A couple of other points to bear in mind:

The number of columns is always going to be quite limited (at most
1600, and usually far less than that), whereas the number of rows
could be arbitrarily large. So sorting the rows client-side in the way
that you are could get very inefficient, whereas that's not such a
problem for the columns.

The column values are non-NULL, so they require a more limited set of
sort options, whereas the rows could be anything, and people will want
all the sort options to be available.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-16 Thread Jim Nasby

On 2/11/16 4:21 AM, Dean Rasheed wrote:

Thinking about this some more though, perhaps*sorting*  the columns is
the wrong way to be thinking about it. Perhaps a better approach would
be to allow the columns to be*listed*  (optionally, using a separate
query). Something like the following (don't get too hung up on the
syntax):

SELECT name,
to_char(date, 'Mon') AS month,
sum(amount) AS amount
  FROM invoices
  GROUP BY 1,2
  ORDER BY name
\crosstabview cols = (select to_char(d, 'Mon') from
generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)


My concern with that is that often you don't know what the columns will 
be, because you don't know what exact data the query will produce. So to 
use this syntax you'd have to re-create a huge chunk of the original 
query. :(

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-15 Thread Daniel Verite
Dean Rasheed wrote:

> My biggest problem is with the sorting, for all the reasons discussed
> above. There is absolutely no reason for \crosstabview to be
> re-sorting rows -- they should just be left in the original query
> result order

We want the option to sort the vertical the header in a late additional
step when the ORDER BY of the query is already assigned to another
purpose. 

I've submitted this example on the wiki:
https://wiki.postgresql.org/wiki/Crosstabview

create view v_data as 
select * from ( values
   ('v1','h2','foo', '2015-04-01'::date),
   ('v2','h1','bar', '2015-01-02'),
   ('v1','h0','baz', '2015-07-12'),
   ('v0','h4','qux', '2015-07-15')
 ) as l(v,h,c,d);


Normal top-down display:

select v,to_char(d,'Mon') as m, c  from v_data order by d;

 v  |  m  |  c  
+-+-
 v2 | Jan | bar
 v1 | Apr | foo
 v1 | Jul | baz
 v0 | Jul | qux

Crosstabview display without any additional sort:

 \crosstabview v m c

 v  | Jan | Apr | Jul 
+-+-+-
 v2 | bar | | 
 v1 | | foo | baz
 v0 | | | qux

"d" is not present the resultset but it drives the sort
so that month names come out in the natural order.

\crosstabview does not discard the order of colH nor the order of colV,
it follows both, so that we get v2,v1,v0 in this order in the leftmost
column (vertical header) just like in the resultset.

At this point, it seems to me that it's perfectly reasonable for our user
to expect the possibility of sorting additionally by "v" , without
changing the query and without changing the order of the horizontal
header:

 \crosstabview +v m c

 v  | Jan | Apr | Jul 
+-+-+-
 v0 | | | qux
 v1 | | foo | baz
 v2 | bar | | 



Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-15 Thread Daniel Verite
Alvaro Herrera wrote:

> So please can we have that wiki page so that the syntax can be hammered
> out a bit more.

I've added a wiki page with explanation and examples here:

https://wiki.postgresql.org/wiki/Crosstabview

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-11 Thread Andres Freund
On 2016-02-09 09:27:04 +, Dean Rasheed wrote:
> Looking at this patch, I have mixed feelings about it. On the one hand
> I really like the look of the output, and I can see that the non-fixed
> nature of the output columns makes this hard to achieve server-side.

> But on the other hand, this seems to be going way beyond the normal
> level of result formatting that something like \x does, and I find the
> syntax for sorting particularly ugly.

I've pretty similar doubts. Addinging features to psql which are complex
enough that it's likely that people will be forced to parse psql
output...  On the other hand, a proper server side solution won't be
easy; so maybe this is a okay enough stopgap.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-11 Thread Dean Rasheed
On 11 February 2016 at 08:43, Andres Freund  wrote:
> On 2016-02-09 09:27:04 +, Dean Rasheed wrote:
>> Looking at this patch, I have mixed feelings about it. On the one hand
>> I really like the look of the output, and I can see that the non-fixed
>> nature of the output columns makes this hard to achieve server-side.
>
>> But on the other hand, this seems to be going way beyond the normal
>> level of result formatting that something like \x does, and I find the
>> syntax for sorting particularly ugly.
>
> I've pretty similar doubts. Addinging features to psql which are complex
> enough that it's likely that people will be forced to parse psql
> output...  On the other hand, a proper server side solution won't be
> easy; so maybe this is a okay enough stopgap.
>

Well to be clear, I like the idea of this feature, and I'm not trying
to stand in the way of progressing it. However, I can't see myself
committing it in its current form.

My biggest problem is with the sorting, for all the reasons discussed
above. There is absolutely no reason for \crosstabview to be
re-sorting rows -- they should just be left in the original query
result order. Sorting columns is a little more understandable, since
there is no way for the original query to control the order in which
the colV values come out, but Tom raises a good point -- there are far
too many bells and whistles when it comes to sorting, and we don't
want to be adding all of them to the psql syntax.

Thinking about this some more though, perhaps *sorting* the columns is
the wrong way to be thinking about it. Perhaps a better approach would
be to allow the columns to be *listed* (optionally, using a separate
query). Something like the following (don't get too hung up on the
syntax):

SELECT name,
   to_char(date, 'Mon') AS month,
   sum(amount) AS amount
 FROM invoices
 GROUP BY 1,2
 ORDER BY name
\crosstabview cols = (select to_char(d, 'Mon') from
generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-11 Thread Pavel Stehule
Thinking about this some more though, perhaps *sorting* the columns is
> the wrong way to be thinking about it. Perhaps a better approach would
> be to allow the columns to be *listed* (optionally, using a separate
> query). Something like the following (don't get too hung up on the
> syntax):
>
> SELECT name,
>to_char(date, 'Mon') AS month,
>sum(amount) AS amount
>  FROM invoices
>  GROUP BY 1,2
>  ORDER BY name
> \crosstabview cols = (select to_char(d, 'Mon') from
> generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
>

The idea is ok, but this design cannot be described as user friendly. The
work with time dimension is pretty common, and should be supported by some
short user friendly syntax.

Regards

Pavel


>
> Regards,
> Dean
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Pavel Stehule
Hi

Looking at this patch, I have mixed feelings about it. On the one hand
> I really like the look of the output, and I can see that the non-fixed
> nature of the output columns makes this hard to achieve server-side.
>
> But on the other hand, this seems to be going way beyond the normal
> level of result formatting that something like \x does, and I find the
> syntax for sorting particularly ugly. I can understand the need to
> sort the colH values, but it seems to me that the result rows should
> just be returned in the order the server returns them -- i.e., I don't
> think we should allow sorting colV values client-side, overriding a
> server-side ORDER BY clause in the query.
>

This feature has zero relation with \x option, and any link to this option
is confusing. This is important, elsewhere we are on start again, where I
did long discuss with Daniel about the name, when I blocked the name
"rotate".


> Client-side sorting makes me uneasy in general, and I think it should
> be restricted to just sorting the columns that appear in the output
> (the colH values). This would also allow the syntax to be simplified:
>
> \crosstabview [colV] [colH] [colG1[,colG2...]] [sortCol [asc|desc]]
>

The sorting on client side is necessary - minimally in one direction,
because you cannot to create perfect sorting for both dimensions.
Possibility to order in second dimension is just pretty comfortable -
because you don't need to think two steps forward - when you create SQL
query.

I have a basic use case that should be supported well, and it is supported
well by last version of this patch. The evaluation of syntax is subjective.
We can compare Daniel's syntax and your proposal.

The use case: I have a table with the invoices with attributes (date, name
and amount). I would to take a report of amounts across months and
customers. Horizontal dimension is month (name), vertical dimension is name
of customers. I need sorting of months in semantic order and customers in
alphabet order.

So my query is:

SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS
month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3;

and crosstabview command (per Daniel proposal)

\crosstabview +name  +month:month_order amount

But if I don't need column header in human readable form, I can do

\crosstabview +name +month_order amount

What is solution of this use case with your proposal??

I agree so this syntax is pretty raw. But it is consistent with other psql
statements and there are not possible conflicts.

What I mean? Your syntax is not unambiguous: \crosstabview [colV] [colH]
[colG1[,colG2...]] [sortCol [asc|desc]] - when I would to enter sort order
column, I have to enter one or more colG1,... or I have to enter explicitly
asc, desc keyword.

Regards

Pavel








>
> Overall, I like the feature, but I'm not convinced that it's ready in
> its current form.
>
> For the future (not in this first version of the patch), since the
> transformation is more than just a \x-type formatting of the query
> results, a nice-to-have feature would be a way to save the results
> somewhere -- say by making it play nicely with \g or \copy somehow,
> but I admit that I don't know exactly how that would work.
>
> Regards,
> Dean
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Dean Rasheed
On 9 February 2016 at 05:24, Pavel Stehule  wrote:
> I have not a feeling so we did some with Daniel privately. All work was
> public (I checked my mailbox) - but what is unhappy - in more mailing list
> threads (not sure how it is possible, because subjects looks same). The
> discus about the design was public, I am sure. It was relative longer
> process, with good progress (from my perspective), because Daniel accepts
> and fixed all my objection. The proposed syntax is fully consistent with
> other psql commands - hard to create something new there, because psql
> parser is pretty limited. Although I am thinking so syntax is good, clean
> and useful I am open to discuss about it. Please, try the last design, last
> patch - I spent lot of hours (and I am sure so Daniel much more) in thinking
> how this can be designed better.
>

Looking at this patch, I have mixed feelings about it. On the one hand
I really like the look of the output, and I can see that the non-fixed
nature of the output columns makes this hard to achieve server-side.

But on the other hand, this seems to be going way beyond the normal
level of result formatting that something like \x does, and I find the
syntax for sorting particularly ugly. I can understand the need to
sort the colH values, but it seems to me that the result rows should
just be returned in the order the server returns them -- i.e., I don't
think we should allow sorting colV values client-side, overriding a
server-side ORDER BY clause in the query.

Client-side sorting makes me uneasy in general, and I think it should
be restricted to just sorting the columns that appear in the output
(the colH values). This would also allow the syntax to be simplified:

\crosstabview [colV] [colH] [colG1[,colG2...]] [sortCol [asc|desc]]

Overall, I like the feature, but I'm not convinced that it's ready in
its current form.

For the future (not in this first version of the patch), since the
transformation is more than just a \x-type formatting of the query
results, a nice-to-have feature would be a way to save the results
somewhere -- say by making it play nicely with \g or \copy somehow,
but I admit that I don't know exactly how that would work.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Dean Rasheed
On 9 February 2016 at 10:09, Pavel Stehule  wrote:
> The sorting on client side is necessary - minimally in one direction,
> because you cannot to create perfect sorting for both dimensions.
> Possibility to order in second dimension is just pretty comfortable -
> because you don't need to think two steps forward - when you create SQL
> query.
>
> I have a basic use case that should be supported well, and it is supported
> well by last version of this patch. The evaluation of syntax is subjective.
> We can compare Daniel's syntax and your proposal.
>
> The use case: I have a table with the invoices with attributes (date, name
> and amount). I would to take a report of amounts across months and
> customers. Horizontal dimension is month (name), vertical dimension is name
> of customers. I need sorting of months in semantic order and customers in
> alphabet order.
>
> So my query is:
>
> SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS
> month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3;
>
> and crosstabview command (per Daniel proposal)
>
> \crosstabview +name  +month:month_order amount
>
> But if I don't need column header in human readable form, I can do
>
> \crosstabview +name +month_order amount
>
> What is solution of this use case with your proposal??
>

So it would just be

SELECT name,
   to_char(date, 'mon') AS month,
   sum(amount) AS amount,
   extract(month from date) AS month_order
 FROM invoices
 GROUP BY 1,2,3
 ORDER BY name
\crosstabview name month amount month_order

Note that I might also want to pass additional sort options, such as
"ORDER BY name NULLS LAST", which the existing syntax doesn't allow.
In the new syntax, such sort options could be trivially supported in
both the server- and client-side sorts:

SELECT name, to_char(date, 'mon') AS month,
   extract(month from date) AS month_order, sum(amount) AS amount
  FROM invoices
 GROUP BY 1,2,3
 ORDER BY name NULLS LAST
\crosstabview name month amount month_order asc nulls last

This is probably not an issue in this example, but it might well be in
other cases. The +/-scol syntax is always going to be limited in what
it can support.


> I agree so this syntax is pretty raw. But it is consistent with other psql
> statements and there are not possible conflicts.
>
> What I mean? Your syntax is not unambiguous: \crosstabview [colV] [colH]
> [colG1[,colG2...]] [sortCol [asc|desc]] - when I would to enter sort order
> column, I have to enter one or more colG1,... or I have to enter explicitly
> asc, desc keyword.
>

That is resolved by the comma that precedes colG2, etc. isn't it?

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Pavel Stehule
> >
> > SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS
> > month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3;
> >
> > and crosstabview command (per Daniel proposal)
> >
> > \crosstabview +name  +month:month_order amount
> >
> > But if I don't need column header in human readable form, I can do
> >
> > \crosstabview +name +month_order amount
> >
> > What is solution of this use case with your proposal??
> >
>
> So it would just be
>
> SELECT name,
>to_char(date, 'mon') AS month,
>sum(amount) AS amount,
>extract(month from date) AS month_order
>  FROM invoices
>  GROUP BY 1,2,3
>  ORDER BY name
> \crosstabview name month amount month_order
>

Warning: :) Now I am subjective. The Daniel syntax "\crosstabview +name
+month:month_order amount" looks more readable for me, because related
things are near to self.


>
> Note that I might also want to pass additional sort options, such as
> "ORDER BY name NULLS LAST", which the existing syntax doesn't allow.
> In the new syntax, such sort options could be trivially supported in
> both the server- and client-side sorts:


> SELECT name, to_char(date, 'mon') AS month,
>extract(month from date) AS month_order, sum(amount) AS amount
>   FROM invoices
>  GROUP BY 1,2,3
>  ORDER BY name NULLS LAST
> \crosstabview name month amount month_order asc nulls last
>

I understand - if I compare these two syntaxes I and I am trying be
objective, then I see

your:
  + respect SQL clauses ordering, allows pretty complex ORDER BY clause
  - possible to fail on unexpected syntax errors
  +/- more verbose
  - allow only one client side sort
  - less expressive

Daniel:
  + cannot to fail on syntax error
  + more compacts (not necessary to specify ORDER BY clauses)
  + allow to specify sort in both dimensions
  + more expressive (+colH is more expressive than colV colH col colH
  - doesn't allow to complex order clauses in both dimensions


>
> This is probably not an issue in this example, but it might well be in
> other cases. The +/-scol syntax is always going to be limited in what
> it can support.
>

the +/- syntax can be enhanced by additional attributes - this is only
syntax (but then there is a risk of possible syntax errors)


>
>
> > I agree so this syntax is pretty raw. But it is consistent with other
> psql
> > statements and there are not possible conflicts.
> >
> > What I mean? Your syntax is not unambiguous: \crosstabview [colV] [colH]
> > [colG1[,colG2...]] [sortCol [asc|desc]] - when I would to enter sort
> order
> > column, I have to enter one or more colG1,... or I have to enter
> explicitly
> > asc, desc keyword.
> >
>
> That is resolved by the comma that precedes colG2, etc. isn't it?
>

but colG1 is optional. What if you miss any colGx ?

Regards

Pavel


>
> Regards,
> Dean
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> So please can we have that wiki page so that the syntax can be hammered
> out a bit more.

Sure, I'm on it.

> I'm closing this as returned-with-feedback for now.

Well,  the feedback it got during months was incorporated into
the patch in the form of significant improvements, and
at the end of this CF it was at the point that it has really been
polished, and no other feedback was coming.

I'll resubmit.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> Also, what about the business of putting "x" if there's no third column?
> Three months from now some Czech psql hacker will say "we should use
> Unicode chars for this" and we will be forever stuck with \pset
> unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH
> CZECH.  Maybe we should think that a bit harder -- for example, what
> about just rejecting the case with no third column and forcing the user
> to add a third column with the character they choose?  That way you
> avoid that mess.

Yes,  that implicit "X" with 2 column resultsets is not essential, it may
be removed without real damage.

About the possible suggestion to have a \pset unicode_crosstab_marker,
my opinion would be that it's not important enough to justify a new
\pset setting.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> While I understand that you may think that "silence is consent",
> what I am afraid of is that some committer will look at this two
> months from now and say "I hate this Hcol+ stuff, -1 from me" and
> send the patch back for syntax rework.  IMO it's better to have more
> people chime in here so that the patch that we discuss during the
> next commitfest is really the best one we can think of.

Yes, but on the other hand we can't force people to participate.
If a patch is moving forward and being discussed here between
one author and one reviewer, and nothing particularly wrong
pops out in what is discussed, the reality if that other people will
not intervene.

Besides, as it being mentioned here frequently, all patches, even
much more important ones, are short on reviews and reviewers
and testing, still new stuff must keep getting in the source tree
to progress.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Dean Rasheed
On 9 February 2016 at 11:06, Pavel Stehule  wrote:
>   + respect SQL clauses ordering, allows pretty complex ORDER BY clause

That, to me is the key point. SQL already allows very powerful
sorting, so psql should not just throw away the query's sort order and
replace it with something much more basic and limited. The exact
syntax can be debated, but I don't think psql should be doing row
sorting.

I also don't believe that extending the +/- sort syntax to support
more advanced options will be particularly easy, and the result is
likely to be even less readable. It also requires the user to learn
another syntax, when they will already be familiar with SQL's sort
syntax.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Dean Rasheed wrote:

> I don't think we should allow sorting colV values client-side,
> overriding a server-side ORDER BY clause in the query.

I shared that opinion until (IIRC) the v8 or v9 of the patch.
Most of the evolution of this patch has been to go
from no client-side sorting option at all, to the full range
of possibilities, ascending or descending, and in both
vertical and horizontal directions.

I agree that colV sorting can be achieved through the
query's ORDER BY, which additionally is more efficient
so it should be the primary choice.

The reason to allow [+/-]colV in \crosstabview is because
I think the average user will expect it, by symmetry with colH.
As the display is reorganized to be like a "grid" instead of a "list
with several columns", we shift the focus to the symmetry
between horizontal and vertical headers, rather than on
the pre-crosstab form of the resultset, even if it's the
same data.
It's easier for the user to just stick a + in front of a column
reference than to figure out that the same result could
be achieved by editing the query and changing/adding
an ORDER BY.

Or said otherwise, having the [+/-] colV sorting is a way to
avoid the question:
"we can sort the horizontal header, so why can't we sort the
vertical header just the same?"


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Dean Rasheed wrote:

> Note that I might also want to pass additional sort options, such as
> "ORDER BY name NULLS LAST", which the existing syntax doesn't allow.
> In the new syntax, such sort options could be trivially supported in
> both the server- and client-side sorts:

Note that NULL values in the column that pivots are discarded
by \crosstabview, because NULL as the name of a column does not
make sense.

The doc (in the patch) says:

"The horizontal header, displayed as the first row, contains the set of
all distinct non-null values found in column colH"


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Tom Lane
"Daniel Verite"  writes:
>   Dean Rasheed wrote:
>> I don't think we should allow sorting colV values client-side,
>> overriding a server-side ORDER BY clause in the query.

> I shared that opinion until (IIRC) the v8 or v9 of the patch.
> Most of the evolution of this patch has been to go
> from no client-side sorting option at all, to the full range
> of possibilities, ascending or descending, and in both
> vertical and horizontal directions.

I haven't been paying attention to this thread ... but it is sure
sounding like this feature has gotten totally out of hand.  Suggest
reconsidering your design goals.

> Or said otherwise, having the [+/-] colV sorting is a way to
> avoid the question:
> "we can sort the horizontal header, so why can't we sort the
> vertical header just the same?"

I would turn that around, and ask why not remove *both* those things.

I do not think we want any client-side sorting in this feature at all,
because the minute you have any such thing, you are going to have an
absolutely never-ending stream of demands for more sorting features:
multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc.
I'd rather reject the feature altogether than expect that psql is going
to have to grow all of that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Pavel Stehule
I haven't been paying attention to this thread ... but it is sure
> sounding like this feature has gotten totally out of hand.  Suggest
> reconsidering your design goals.
>
> > Or said otherwise, having the [+/-] colV sorting is a way to
> > avoid the question:
> > "we can sort the horizontal header, so why can't we sort the
> > vertical header just the same?"
>
> I would turn that around, and ask why not remove *both* those things.
>
> I do not think we want any client-side sorting in this feature at all,
> because the minute you have any such thing, you are going to have an
> absolutely never-ending stream of demands for more sorting features:
> multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc.
> I'd rather reject the feature altogether than expect that psql is going
> to have to grow all of that.
>

I am thinking so without possibility to sort data on client side, this
feature will be significantly limited. You cannot do server side sort for
both dimensions. Working with 2d report when one dimension is unsorted is
not friendly.

But the client side sorting can be limited to number's or C locale sorting.
I don't think so full sort possibilities are necessary.

Regards

Pavel


> regards, tom lane
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Jim Nasby

On 2/9/16 8:40 AM, Daniel Verite wrote:

Alvaro Herrera wrote:


While I understand that you may think that "silence is consent",
what I am afraid of is that some committer will look at this two
months from now and say "I hate this Hcol+ stuff, -1 from me" and
send the patch back for syntax rework.  IMO it's better to have more
people chime in here so that the patch that we discuss during the
next commitfest is really the best one we can think of.


Yes, but on the other hand we can't force people to participate.
If a patch is moving forward and being discussed here between
one author and one reviewer, and nothing particularly wrong
pops out in what is discussed, the reality if that other people will
not intervene.


The problem is that assumes people are still reading the thread. This is 
a feature I'm very interested, but at some point I just gave up on 
trying to follow it because of the volume of messages. I bet a lot of 
others did the same.


I think in this case, what should have happened is that once an issue 
with the design of the feature itself was identified, a new thread 
should have been started to discuss that part in particular. That would 
have re-raised attention and made it easier for people to follow that 
specific part of the discussion, even if they don't care about some if 
the code intricacies.



Besides, as it being mentioned here frequently, all patches, even
much more important ones, are short on reviews and reviewers
and testing, still new stuff must keep getting in the source tree
to progress.


Sure, and new stuff will be making it in. The question is: will *your* 
new stuff be making it in?


Believe me, I know how burdensome getting new features pushed is. 
Frankly it shouldn't be this hard, and I certainly don't blame you for 
being frustrated. But none of that changes the fact that the bar for 
including code is very high and if you don't meet it then your stuff 
won't make it in.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Tom Lane wrote:

> I do not think we want any client-side sorting in this feature at all,
> because the minute you have any such thing, you are going to have an
> absolutely never-ending stream of demands for more sorting features:
> multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc.

It doesn't really do any client-side sorting, the rest of the thread might
refer to it like that by oversimplification, but if the command requests
a header to be sorted,  a "backdoor-style" query of this form
is sent to the server, with PQexecParams():

SELECT n FROM (VALUES ($1,1),($2,2),($3,3)...) ) AS l(x,n) 
  ORDER BY x [DESC]
where the values to display in the header are bound to 
$1,$2,.. and the type associated with these parameters is
the PQftype() of the field from which these values come.
Then  the  values coming back ordered by  tell us
where to position the values corresponding to $1,$2... in the
sorted header.

There are some cases when this sort cannot work.
For example if the field is an anonymous type or a ROW().
Or if the field is POINT(x,y), because our "point" type
does not support order by.
I believe these are corner cases for this feature. In these
cases, psql just displays the error message that PQexecParams()
emits.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Pavel Stehule
Hi


> I just rechecked the thread.  In my reading, lots of people argued
> whether it should be called \rotate or \pivot or \crosstab; it seems the
> \crosstabview proposal was determined to be best.  I can support that
> decision.  But once the details were discussed, it was only you and
> Daniel left in the thread; nobody else participated.  While I understand
> that you may think that "silence is consent", what I am afraid of is
> that some committer will look at this two months from now and say "I
> hate this Hcol+ stuff, -1 from me" and send the patch back for syntax
> rework.  IMO it's better to have more people chime in here so that the
> patch that we discuss during the next commitfest is really the best one
> we can think of.
>

I have not a feeling so we did some with Daniel privately. All work was
public (I checked my mailbox) - but what is unhappy - in more mailing list
threads (not sure how it is possible, because subjects looks same). The
discus about the design was public, I am sure. It was relative longer
process, with good progress (from my perspective), because Daniel accepts
and fixed all my objection. The proposed syntax is fully consistent with
other psql commands - hard to create something new there, because psql
parser is pretty limited. Although I am thinking so syntax is good, clean
and useful I am open to discuss about it. Please, try the last design, last
patch - I spent lot of hours (and I am sure so Daniel much more) in
thinking how this can be designed better.


> Also, what about the business of putting "x" if there's no third column?
> Three months from now some Czech psql hacker will say "we should use
> Unicode chars for this" and we will be forever stuck with \pset
> unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH
> CZECH.  Maybe we should think that a bit harder -- for example, what
> about just rejecting the case with no third column and forcing the user
> to add a third column with the character they choose?  That way you
> avoid that mess.
>

These features are in category "nice to have". There are no problem to do
in last commitfest or in next release cycle. It is not reason why to block
commit of this feature, and I am sure so lot of users can be pretty happy
with "basic" version of this patch. The all necessary functionality is
there and working. We can continue on development in other cycles, but for
this cycle, I am sure, so all necessary work is done.


>
> > This feature has only small relation to SQL PIVOTING feature - it is just
> > form of view and I am agree with Daniel about sense of this feature.
>
> Yes, I don't disagree there.  Robert Haas and David Fetter also
> expressed their support for psql-side processing, so I think we're good
> there.
>
>
great. Personally, I have not any objection against current state. If
anybody has, please do it early. We can move to forward. This is nice
feature, good patch, and there is not reason why stop here.

Regards

Pavel


> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Teodor Sigaev

Hi!

Interesting feature, but it's not very obvious how to use it. I'd like to see 
some example(s) in documentation.


And I see an implementation of AVL tree in psql source code 
(src/bin/psql/crosstabview.c). Postgres already has a Red-Black tree 
implementation in
src/include/lib/rbtree.h and src/backend/lib/rbtree.c. Is any advantage of using 
AVL tree here? I have some doubt about that and this implementation, obviously, 
will not be well tested. But I see in comments that implementation is reduced to 
insert only and it doesn't use the fact of ordering tree, so, even hash could be 
used.


Daniel Verite wrote:

Pavel Stehule wrote:


1. maybe we can decrease name to shorter "crossview" ?? I am happy with
crosstabview too, just crossview is correct too, and shorter


I'm in favor of keeping crosstabview. It's more explicit, only
3 characters longer and we have tab completion anyway.


2. Columns used for ordering should not be displayed by default. I can live
with current behave, but hiding ordering columns is much more practical for
me


I can see why, but I'm concerned about a consequence:
say we have 4 columns A,B,C,D and user does \crosstabview +A:B +C:D
If B and D are excluded by default, then there's nothing
left to display inside the grid.
It doesn't feel quite right. There's something counter-intuitive
in the fact that values in the grid would disappear depending on
whether and how headers are sorted.
With the 3rd argument, we let the user decide what they want
to see.


3. This code is longer, so some regress tests are recommended - attached
simple test case


I've added a few regression tests to the psql testsuite
based on your sample data. New patch with these tests
included is attached, make check passes.

Best regards,






--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Daniel Verite
Teodor Sigaev wrote:

> Interesting feature, but it's not very obvious how to use it. I'd like to
> see  some example(s) in documentation.

I'm thinking of making a wiki page, because examples pretty much
require showing resultsets, and I'm not sure this would fly
with our current psql documentation, which is quite compact.

The current bit of doc I've produced is 53 lines long in manpage
format already. The text has not been proofread by a native
English speaker yet, so part of the problem might be that
it's just me struggling with english :)

> And I see an implementation of AVL tree in psql source code
> (src/bin/psql/crosstabview.c). Postgres already has a Red-Black tree
> implementation in src/include/lib/rbtree.h and
> src/backend/lib/rbtree.c. Is any advantage of using AVL tree here? I
> have some doubt about that and this implementation, obviously, will
> not be well tested. But I see in comments that implementation is
> reduced to insert only and it doesn't use the fact of ordering tree,
> so, even hash could be used.

Yes. I expect too that a RB tree or a hash-based algorithm would do
the job and perform well.

The AVL implementation in crosstabview is purposely simplified
and specialized for this job, resulting in ~185 lines of code
versus ~850 lines for rb-tree.c
But I understand the argument that the existing rb-tree has been
battle-tested, whereas this code hasn't.

I'm looking at rb-tree.c and thinking what it would take to
incorporate it:
1. duplicating or linking from backend/lib/rb-tree.c into psql/
2. replacing the elog() calls with something else in the case of psql
3. updating the crosstabview data structures and call sites.

While I'm OK with #3, #1 and #2 seem wrong.
I could adapt rb-tree.c so that the same code can be used
both client-side and server-side, but touching server-side
code for this feature and creating links in the source tree
feels invasive and overkill.

Another approach is to replace AVL with an hash-based algorithm,
but that raises the same sort of question. If crosstabview comes
with its specific implementation, why use that rather than existing
server-side code? But at a glance, utils/hash/dynahash.c seems quite
hard to convert for client-side use.

I'm open to ideas on this. In particular, if we have a hash table
implementation that is already blessed by the project and small enough
to make sense in psql, I'd be happy to consider it.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Pavel Stehule
Hi



> FWIW I think the general idea of this feature (client-side resultset
> "pivoting") is a good one, but I don't really have an opinion regarding
> your specific proposal.  I think you should first seek some more
> consensus about the proposed design; in your original thread [1] several
> guys defended the idea of having this be a psql feature, and the idea of
> this being a parallel to \x seems a very sensible one, but there's
> really been no discussion on whether your proposed "+/-" syntax to
> change sort order makes sense, for one thing.
>

I am sorry, but I disagree - the discussion about implementation was more
than two months, and I believe so anybody who would to discuss had enough
time to discuss. This feature and design was changed significantly and
there was not anybody who sent feature design objection.

This feature has only small relation to SQL PIVOTING feature - it is just
form of view and I am agree with Daniel about sense of this feature.

Regards

Pavel


>
> So please can we have that wiki page so that the syntax can be hammered
> out a bit more.
>
> I'm closing this as returned-with-feedback for now.
>
> [1] It's a good idea to add links to previous threads where things were
> discussed.  I had to search for
> www.postgresql.org/message-id/flat/78543039-c708-4f5d-a66f-0c0fbcda1f76@mm
> because you didn't provide a link to it when you started the second
> thread.
>
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Alvaro Herrera
Pavel Stehule wrote:

> > FWIW I think the general idea of this feature (client-side resultset
> > "pivoting") is a good one, but I don't really have an opinion regarding
> > your specific proposal.  I think you should first seek some more
> > consensus about the proposed design; in your original thread [1] several
> > guys defended the idea of having this be a psql feature, and the idea of
> > this being a parallel to \x seems a very sensible one,

Sorry, I meant \q here, not \x.

> > but there's really been no discussion on whether your proposed "+/-"
> > syntax to change sort order makes sense, for one thing.
> 
> I am sorry, but I disagree - the discussion about implementation was more
> than two months, and I believe so anybody who would to discuss had enough
> time to discuss. This feature and design was changed significantly and
> there was not anybody who sent feature design objection.

I just rechecked the thread.  In my reading, lots of people argued
whether it should be called \rotate or \pivot or \crosstab; it seems the
\crosstabview proposal was determined to be best.  I can support that
decision.  But once the details were discussed, it was only you and
Daniel left in the thread; nobody else participated.  While I understand
that you may think that "silence is consent", what I am afraid of is
that some committer will look at this two months from now and say "I
hate this Hcol+ stuff, -1 from me" and send the patch back for syntax
rework.  IMO it's better to have more people chime in here so that the
patch that we discuss during the next commitfest is really the best one
we can think of.

Also, what about the business of putting "x" if there's no third column?
Three months from now some Czech psql hacker will say "we should use
Unicode chars for this" and we will be forever stuck with \pset
unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH
CZECH.  Maybe we should think that a bit harder -- for example, what
about just rejecting the case with no third column and forcing the user
to add a third column with the character they choose?  That way you
avoid that mess.

> This feature has only small relation to SQL PIVOTING feature - it is just
> form of view and I am agree with Daniel about sense of this feature.

Yes, I don't disagree there.  Robert Haas and David Fetter also
expressed their support for psql-side processing, so I think we're good
there.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Alvaro Herrera
Daniel Verite wrote:
>   Teodor Sigaev wrote:
> 
> > Interesting feature, but it's not very obvious how to use it. I'd like to
> > see  some example(s) in documentation.
> 
> I'm thinking of making a wiki page, because examples pretty much
> require showing resultsets, and I'm not sure this would fly
> with our current psql documentation, which is quite compact.

Yeah, we need to keep in mind that the psql doc is processed as a
manpage also, so it may not be a great idea to add too many things
there.  But I also agree that some good examples would be useful.

FWIW I think the general idea of this feature (client-side resultset
"pivoting") is a good one, but I don't really have an opinion regarding
your specific proposal.  I think you should first seek some more
consensus about the proposed design; in your original thread [1] several
guys defended the idea of having this be a psql feature, and the idea of
this being a parallel to \x seems a very sensible one, but there's
really been no discussion on whether your proposed "+/-" syntax to
change sort order makes sense, for one thing.

So please can we have that wiki page so that the syntax can be hammered
out a bit more.

I'm closing this as returned-with-feedback for now.

[1] It's a good idea to add links to previous threads where things were
discussed.  I had to search for
www.postgresql.org/message-id/flat/78543039-c708-4f5d-a66f-0c0fbcda1f76@mm
because you didn't provide a link to it when you started the second
thread.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-04 Thread Pavel Stehule
Hi

I tested last version, v11 and I have not any objection

It is working as expected

all regress tests passed, there is related documentation and new test is
attached.

This patch is ready form commiter.

Daniel, thank you very much, it is interesting feature.

Regards

Pavel


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-01 Thread Daniel Verite
Pavel Stehule wrote:

> 1. maybe we can decrease name to shorter "crossview" ?? I am happy with
> crosstabview too, just crossview is correct too, and shorter

I'm in favor of keeping crosstabview. It's more explicit, only 
3 characters longer and we have tab completion anyway.

> 2. Columns used for ordering should not be displayed by default. I can live
> with current behave, but hiding ordering columns is much more practical for
> me

I can see why, but I'm concerned about a consequence:
say we have 4 columns A,B,C,D and user does \crosstabview +A:B +C:D
If B and D are excluded by default, then there's nothing
left to display inside the grid.
It doesn't feel quite right. There's something counter-intuitive
in the fact that values in the grid would disappear depending on
whether and how headers are sorted.
With the 3rd argument, we let the user decide what they want
to see.

> 3. This code is longer, so some regress tests are recommended - attached
> simple test case

I've added a few regression tests to the psql testsuite
based on your sample data. New patch with these tests
included is attached, make check passes.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6d0cb3d..a242ec4 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,123 @@ testdb=
   
 
   
+\crosstabview [
+[-|+]colV
+[:scolV]
+[-|+]colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header, optionally sorted by scolV,
+and the output column colH
+becomes a horizontal header, optionally sorted by
+scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the set of all distinct values found in
+column colV, in the order
+of their first appearance in the query results,
+or in ascending order if a plus (+) sign precedes
+colV,
+or in descending order if it's a minus (-) sign.
+
+
+The horizontal header, displayed as the first row,
+contains the set of all distinct non-null values found in
+column colH.  They come
+by default in their order of appearance in the query results,
+or in ascending order if a plus (+) sign precedes
+colH,
+or in descending order if it's a minus (-) sign.
+
+Also, they can optionally be sorted by another column, if
+colV
+(respectively colH) is
+immediately followed by a colon and a reference to another column
+scolV
+(respectively scolH).
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns or the columns listed in
+ colG1[,colG2...]
+ are displayed in the cell, separated between each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed
+ in the cell as if a virtual third column contained that character.
+
+
+
+
+
+ if there are several corresponding rows, the behavior is identical to
+ the case of one row except that the values coming from different rows
+ are stacked vertically, the different source rows being separated by
+ newline characters inside the cell.
+

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-01-25 Thread Pavel Stehule
Hi

2016-01-22 19:53 GMT+01:00 Daniel Verite :

>   Hi,
>
> Here's an updated patch improving on how the horizontal and vertical
> headers can be sorted.
>
> The discussion upthread went into how it was desirable
> to have independant sorts for these headers, possibly driven
> by another column, in addition to the query's ORDER BY.
>
> Thus the options now accepted are:
>
> \crosstabview [ [-|+]colV[:scolV] [-|+]colH[:scolH]  [colG1[,colG2...]] ]
>
> The optional scolV/scolH columns drive sorts for respectively
> colV/colH (colV:scolV somehow means SELECT colV from... order by scolV)
>
> colG1,... in 3rd arg indicate the columns whose contents form the grid
> cells, the typical use case being that there's only one such column.
> By default it's all columns minus colV and colH.
>
> For example,
>
> SELECT
>   cust_id,
>   cust_name,
>   cust_date,
>   date_part('month, sales_date),
>   to_char(sales_date, 'Mon') as month,
>   amount
> FROM sales_view
> WHERE [predicates]
> [ORDER BY ...]
>
> If we want to look at  in a grid with months names across, sorted
> by month number, and customer name in the vertical header, sorted by date
> of
> acquisition, we could do this:
>
> \crosstabview +cust_name:cust_date +5:4 amount
>
> or letting the vertical header being sorted by the query's ORDER BY,
> and the horizontal header same as above:
>
> \crosstabview cust_name +5:4 amount
>
> or sorting vertically by name, if it happens that the ORDER BY is missing
> or
> is on something else:
>
> \crosstabview +cust_name +5:4 amount
>

I am playing with this patch, and I have following comments:

1. maybe we can decrease name to shorter "crossview" ?? I am happy with
crosstabview too, just crossview is correct too, and shorter

2. Columns used for ordering should not be displayed by default. I can live
with current behave, but hiding ordering columns is much more practical for
me

3. This code is longer, so some regress tests are recommended - attached
simple test case

Regards

Pavel





>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


regresstest.sql
Description: application/sql

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-01-22 Thread Daniel Verite
  Hi,

Here's an updated patch improving on how the horizontal and vertical
headers can be sorted.

The discussion upthread went into how it was desirable
to have independant sorts for these headers, possibly driven
by another column, in addition to the query's ORDER BY.

Thus the options now accepted are:

\crosstabview [ [-|+]colV[:scolV] [-|+]colH[:scolH]  [colG1[,colG2...]] ]

The optional scolV/scolH columns drive sorts for respectively
colV/colH (colV:scolV somehow means SELECT colV from... order by scolV)

colG1,... in 3rd arg indicate the columns whose contents form the grid
cells, the typical use case being that there's only one such column.
By default it's all columns minus colV and colH.

For example,

SELECT
  cust_id,
  cust_name,
  cust_date,
  date_part('month, sales_date),
  to_char(sales_date, 'Mon') as month,
  amount
FROM sales_view
WHERE [predicates]
[ORDER BY ...]

If we want to look at  in a grid with months names across, sorted
by month number, and customer name in the vertical header, sorted by date of 
acquisition, we could do this:

\crosstabview +cust_name:cust_date +5:4 amount

or letting the vertical header being sorted by the query's ORDER BY,
and the horizontal header same as above:

\crosstabview cust_name +5:4 amount

or sorting vertically by name, if it happens that the ORDER BY is missing or
is on something else:

\crosstabview +cust_name +5:4 amount


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6d0cb3d..a242ec4 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,123 @@ testdb=
   
 
   
+\crosstabview [
+[-|+]colV
+[:scolV]
+[-|+]colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header, optionally sorted by scolV,
+and the output column colH
+becomes a horizontal header, optionally sorted by
+scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the set of all distinct values found in
+column colV, in the order
+of their first appearance in the query results,
+or in ascending order if a plus (+) sign precedes
+colV,
+or in descending order if it's a minus (-) sign.
+
+
+The horizontal header, displayed as the first row,
+contains the set of all distinct non-null values found in
+column colH.  They come
+by default in their order of appearance in the query results,
+or in ascending order if a plus (+) sign precedes
+colH,
+or in descending order if it's a minus (-) sign.
+
+Also, they can optionally be sorted by another column, if
+colV
+(respectively colH) is
+immediately followed by a colon and a reference to another column
+scolV
+(respectively scolH).
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns or the columns listed in
+ colG1[,colG2...]
+ are displayed in the cell, separated between each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed
+ in the cell as if a virtual third column contained that character.
+
+
+
+
+
+ if there are 

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2015-12-26 Thread Pavel Stehule
Hi

2015-12-23 21:36 GMT+01:00 Daniel Verite :

>Hi,
>
> Here's an updated patch that replaces sorted arrays by AVL binary trees
> when gathering distinct values for the columns involved in the pivot.
> The change is essential for large resultsets. For instance,
> it allows to process a query like this (10 million rows x 10 columns):
>
> select x,(random()*10)::int, (random()*1000)::int from
>   generate_series(1,1000) as x
>  \crosstabview
>
> which takes about 30 seconds to run and display on my machine with the
> attached patch. That puts it seemingly in the same ballpark than
> the equivalent test with the server-side crosstab().
>
> With the previous iterations of the patch, this test would never end,
> even with much smaller sets, as the execution time of the 1st step
> grew exponentially with the number of distinct keys.
> The exponential effect starts to be felt at about 10k values on my low-end
> CPU,
> and from there quickly becomes problematic.
>
> As a client-side display feature, processing millions of rows like in
> the query above does not necessarily make sense, it's pushing the
> envelope, but stalling way below 100k rows felt lame, so I'm happy to get
> rid of that limitation.
>
> However, there is another one. The above example does not need or request
> an additional sort step, but if it did, sorting more than 65535 entries in
> the vertical header would error out, because values are shipped as
> parameters to PQexecParams(), which only accepts that much.
> To avoid the problem, when the rows in the output "grid" exceed 2^16 and
> they need to be sorted, the user must  let the sort being driven by ORDER
> BY
> beforehand in the query, knowing that the pivot will keep the original
> ordering intact in the vertical header.
>
> I'm still thinking about extending this based on Pavel's diff for the
> "label" column, so that
>  \crosstabview [+|-]colV[:colSortH] [+|-]colH[:colSortH]
> would mean to use colV/H as grid headers but sort them according
> to colSortV/H.
> I prefer that syntax over adding more parameters, and also I'd like
> to have it work in both V and H directions.
>
> Aside from the AVL trees, there are a few other minor changes in that
> patch:
> - move non-exportable structs from the .h to the .c
> - move code in common.c to respect alphabetical ordering
> - if vertical sort is requested, add explicit check against more than 65535
>   params instead of letting the sort query fail
> - report all failure cases of the sort query
> - rename sortColumns to serverSort and use less the term "columns" in
>   comments and variables.
>
>

I checked this version and it is looking well.

* all regress tests passed
* patch is clean, well documented, well formatted
* no objection related to code
* current limits 65K * 1600 is good enough, I don't see it as limiting

I am looking for next version

Regards

Pavel




>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2015-12-25 Thread Pavel Stehule
2015-12-23 21:36 GMT+01:00 Daniel Verite :

>Hi,
>
> Here's an updated patch that replaces sorted arrays by AVL binary trees
> when gathering distinct values for the columns involved in the pivot.
> The change is essential for large resultsets. For instance,
> it allows to process a query like this (10 million rows x 10 columns):
>
> select x,(random()*10)::int, (random()*1000)::int from
>   generate_series(1,1000) as x
>  \crosstabview
>
> which takes about 30 seconds to run and display on my machine with the
> attached patch. That puts it seemingly in the same ballpark than
> the equivalent test with the server-side crosstab().
>
> With the previous iterations of the patch, this test would never end,
> even with much smaller sets, as the execution time of the 1st step
> grew exponentially with the number of distinct keys.
> The exponential effect starts to be felt at about 10k values on my low-end
> CPU,
> and from there quickly becomes problematic.
>
> As a client-side display feature, processing millions of rows like in
> the query above does not necessarily make sense, it's pushing the
> envelope, but stalling way below 100k rows felt lame, so I'm happy to get
> rid of that limitation.
>
> However, there is another one. The above example does not need or request
> an additional sort step, but if it did, sorting more than 65535 entries in
> the vertical header would error out, because values are shipped as
> parameters to PQexecParams(), which only accepts that much.
> To avoid the problem, when the rows in the output "grid" exceed 2^16 and
> they need to be sorted, the user must  let the sort being driven by ORDER
> BY
> beforehand in the query, knowing that the pivot will keep the original
> ordering intact in the vertical header.
>
> I'm still thinking about extending this based on Pavel's diff for the
> "label" column, so that
>  \crosstabview [+|-]colV[:colSortH] [+|-]colH[:colSortH]
> would mean to use colV/H as grid headers but sort them according
> to colSortV/H.
> I prefer that syntax over adding more parameters, and also I'd like
> to have it work in both V and H directions.
>

This syntax is good - simple, readable

Pavel

>
> Aside from the AVL trees, there are a few other minor changes in that
> patch:
> - move non-exportable structs from the .h to the .c
> - move code in common.c to respect alphabetical ordering
> - if vertical sort is requested, add explicit check against more than 65535
>   params instead of letting the sort query fail
> - report all failure cases of the sort query
> - rename sortColumns to serverSort and use less the term "columns" in
>   comments and variables.
>
>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2015-12-23 Thread Daniel Verite
   Hi,

Here's an updated patch that replaces sorted arrays by AVL binary trees
when gathering distinct values for the columns involved in the pivot.
The change is essential for large resultsets. For instance,
it allows to process a query like this (10 million rows x 10 columns):

select x,(random()*10)::int, (random()*1000)::int from
  generate_series(1,1000) as x
 \crosstabview

which takes about 30 seconds to run and display on my machine with the
attached patch. That puts it seemingly in the same ballpark than 
the equivalent test with the server-side crosstab().

With the previous iterations of the patch, this test would never end,
even with much smaller sets, as the execution time of the 1st step
grew exponentially with the number of distinct keys.
The exponential effect starts to be felt at about 10k values on my low-end
CPU,
and from there quickly becomes problematic.

As a client-side display feature, processing millions of rows like in
the query above does not necessarily make sense, it's pushing the
envelope, but stalling way below 100k rows felt lame, so I'm happy to get
rid of that limitation.

However, there is another one. The above example does not need or request
an additional sort step, but if it did, sorting more than 65535 entries in
the vertical header would error out, because values are shipped as
parameters to PQexecParams(), which only accepts that much.
To avoid the problem, when the rows in the output "grid" exceed 2^16 and
they need to be sorted, the user must  let the sort being driven by ORDER BY
beforehand in the query, knowing that the pivot will keep the original
ordering intact in the vertical header.

I'm still thinking about extending this based on Pavel's diff for the
"label" column, so that
 \crosstabview [+|-]colV[:colSortH] [+|-]colH[:colSortH] 
would mean to use colV/H as grid headers but sort them according
to colSortV/H.
I prefer that syntax over adding more parameters, and also I'd like
to have it work in both V and H directions. 

Aside from the AVL trees, there are a few other minor changes in that
patch:
- move non-exportable structs from the .h to the .c
- move code in common.c to respect alphabetical ordering
- if vertical sort is requested, add explicit check against more than 65535
  params instead of letting the sort query fail
- report all failure cases of the sort query
- rename sortColumns to serverSort and use less the term "columns" in
  comments and variables.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6d0cb3d..563324d 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2485,6 +2485,95 @@ lo_import 152801
 
   
 
+  
+\crosstabview [ [-|+]colV  [-|+]colH ] 
+
+
+Execute the current query buffer (like \g) and shows the results
+inside a crosstab grid. The output column colV
+becomes a vertical header and the output column
+colH becomes a horizontal header.
+The results for the other output columns are projected inside the grid.
+
+
+
+colV
+and colH can indicate a
+column position (starting at 1), or a column name. Normal case folding
+and quoting rules apply on column names. By default,
+colV is column 1
+and colH is column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the set of all distinct values found in
+column colV, in the order
+of their first appearance in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the set of all distinct non-null values found in
+column colH.  They come
+by default in their order of appearance in the query results, or in ascending
+order if a plus (+) sign precedes colH,
+or in descending order if it's a minus (-) sign.
+
+
+
+The query results being tuples of N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+a cell located at the intersection (x,y) in the grid
+has contents determined by these rules:
+
+
+
+ if there is no corresponding row in the results such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns are displayed in the