Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Alvaro Herrera wrote: > I pushed it. That's awesome, thanks! Also thanks to Pavel who reviewed and helped continuously when iterating on this feature, and all others who participed in this thread. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Daniel Verite wrote: > ISTM that this could be avoided by erroring out for lack of an > explicit 3rd column as argument. IOW, we wouldn't assume > that "no column specified" means "show all columns". > > About simply ripping out the possibility of having multiple > columns into cells, it's more radical but if that part turns out to > be more confusing than useful, I don't have a problem > with removing it. Okay, I've ripped that out since I wasn't comfortable with the general idea. Once you have two data values for the same cell, the new code raises an error, indicating the corresponding vertical and horizontal header values; that way it's easy to spot where the problem is. I also removed the FETCH_COUNT bits; it didn't make a lot of sense to me. Like \gexec, the query is executed to completion when in \crosstabview regardless of FETCH_COUNT. > The other case of stringing multiple contents into the same cell > is when different tuples carry (row,column) duplicates. > I'm not inclined to disallow that case, I think it would go too far > in guessing what the user expects. > My expectation for a viewer is that it displays the results as far as > possible, whatever they are. The reason I made this case throw an error is that we can tweak the behavior later. I think separating them with newlines is too cute and will be unusable when you have values that have embedded newlines; you can imitate that behavior with string_agg(val, E'\n') as I've done in the regression tests. One option for improving it would be to have it add another record, but that requires shifting the values of all cells by the number of columns (you can see that if you change the border options, or in HTML output etc). We can do that later. > Also, showing such contents in vertically-growing cells as it > does now allows the user to spot these easily in the grid when > they happen to be outliers. I'm seeing it as useful in that case. It's useful, no doubt. I pushed it. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Robert Haas wrote: > This seems like it might be converging on some sort of consensus, but > I'm wondering if we shouldn't push it to 9.7, instead of rushing > decisions that we will later have trouble changing on > backward-compatibility grounds. My intention is to commit this afternoon in the next couple of hours, and only the most basic case is going to be supported, and the rest of the cases (concatenation of several fields and several rows, etc) are just going to throw errors; that way, it will be easy to add more features later as they are agreed upon. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On Fri, Apr 8, 2016 at 7:23 AM, Daniel Veritewrote: > Alvaro Herrera wrote: > >> I wonder if the business of appending values of multiple columns >> separated with spaces is doing us any good. Why not require that >> there's a single column in the cell? If the user wants to put things >> together, they can use format() or just || the fields together. What >> benefit is there to the ' '? When I ran my first test queries over >> pg_class I was surprised about this behavior: >> alvherre=# select * from pg_class >> alvherre=# \crosstabview relnatts relkind > > ISTM that this could be avoided by erroring out for lack of an > explicit 3rd column as argument. IOW, we wouldn't assume > that "no column specified" means "show all columns". > > About simply ripping out the possibility of having multiple > columns into cells, it's more radical but if that part turns out to > be more confusing than useful, I don't have a problem > with removing it. > > The other case of stringing multiple contents into the same cell > is when different tuples carry (row,column) duplicates. > I'm not inclined to disallow that case, I think it would go too far > in guessing what the user expects. > My expectation for a viewer is that it displays the results as far as > possible, whatever they are. > Also, showing such contents in vertically-growing cells as it > does now allows the user to spot these easily in the grid when > they happen to be outliers. I'm seeing it as useful in that case. This seems like it might be converging on some sort of consensus, but I'm wondering if we shouldn't push it to 9.7, instead of rushing decisions that we will later have trouble changing on backward-compatibility grounds. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Alvaro Herrera wrote: > I wonder if the business of appending values of multiple columns > separated with spaces is doing us any good. Why not require that > there's a single column in the cell? If the user wants to put things > together, they can use format() or just || the fields together. What > benefit is there to the ' '? When I ran my first test queries over > pg_class I was surprised about this behavior: > alvherre=# select * from pg_class > alvherre=# \crosstabview relnatts relkind ISTM that this could be avoided by erroring out for lack of an explicit 3rd column as argument. IOW, we wouldn't assume that "no column specified" means "show all columns". About simply ripping out the possibility of having multiple columns into cells, it's more radical but if that part turns out to be more confusing than useful, I don't have a problem with removing it. The other case of stringing multiple contents into the same cell is when different tuples carry (row,column) duplicates. I'm not inclined to disallow that case, I think it would go too far in guessing what the user expects. My expectation for a viewer is that it displays the results as far as possible, whatever they are. Also, showing such contents in vertically-growing cells as it does now allows the user to spot these easily in the grid when they happen to be outliers. I'm seeing it as useful in that case. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Alvaro Herrera wrote: > I messed with that code some more, as it looked unnecessarily > complicated; please see attached and verify that it still behaves > sanely. This needs those regression tests you promised. I tested a few > cases and it seems good to me. I've fixed a couple things over v16: - avoid passing every cell through psprintf, which happened due to cont.cells being pre-initialized to empty strings. - adjusted the loop freeing allocated_cells and added the regression tests. Attached is the diff over v16, tested with make check and valgrind. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite diff --git a/src/bin/psql/crosstabview.c b/src/bin/psql/crosstabview.c index 0d70e47..a20296e 100644 --- a/src/bin/psql/crosstabview.c +++ b/src/bin/psql/crosstabview.c @@ -360,7 +360,6 @@ printCrosstab(const PGresult *results, printQueryOpt popt = pset.popt; printTableContent cont; int i, -j, rn; char col_align; int *horiz_map; @@ -414,9 +413,6 @@ printCrosstab(const PGresult *results, cont.cells[k * (num_columns + 1)] = piv_rows[i].name ? piv_rows[i].name : (popt.nullPrint ? popt.nullPrint : ""); - /* Initialize all cells inside the grid to an empty value */ - for (j = 0; j < num_columns; j++) - cont.cells[k * (num_columns + 1) + j + 1] = ""; } cont.cellsadded = num_rows * (num_columns + 1); @@ -506,14 +502,10 @@ printCrosstab(const PGresult *results, * first column of each row, separate with a newline * instead. */ - if (allocated_cells[idx] != NULL) - new_content = psprintf("%s%s%s", - allocated_cells[idx], - i == 0 ? "\n" : " ", - content); - else - new_content = psprintf("%s", content); - + new_content = psprintf("%s%s%s", + cont.cells[idx], + i == 0 ? "\n" : " ", + content); cont.cells[idx] = new_content; if (allocated_cells[idx] != NULL) pg_free(allocated_cells[idx]); @@ -528,10 +520,20 @@ printCrosstab(const PGresult *results, } } + /* + * The non-initialized cells must be set to an empty string for the print + * functions + */ + for (i = 0; i < cont.cellsadded; i++) + { + if (cont.cells[i] == NULL) + cont.cells[i] = ""; + } + printTable(, pset.queryFout, false, pset.logfile); printTableCleanup(); - for (i = 0; i < num_rows * num_columns; i++) + for (i = 0; i < (num_rows + 1) * (num_columns + 1); i++) { if (allocated_cells[i] != NULL) pg_free(allocated_cells[i]); diff --git a/src/test/regress/expected/psql_crosstabview.out b/src/test/regress/expected/psql_crosstabview.out new file mode 100644 index 000..df3824a --- /dev/null +++ b/src/test/regress/expected/psql_crosstabview.out @@ -0,0 +1,158 @@ +-- +-- tests for \crosstabview +-- +CREATE VIEW vct_data as +select * from ( values + ('v1','h2','foo', 3, '2015-04-01'::date), + ('v2','h1','bar', 3, '2015-01-02'), + ('v1','h0','baz', NULL, '2015-07-12'), + ('v0','h4','qux', 4, '2015-07-15'), + ('v0','h4','dbl', -3, '2014-12-15'), + ('v0',NULL,'qux', 5, '2014-03-15') + ) as l(v,h,c,i,d); +-- 2 columns with implicit 'X' as 3rd column +select v,i from vct_data order by 1,2 \crosstabview v i + v | -3 | 4 | 5 | 3 | +++---+---+---+--- + v0 | X | X | X | | + v1 || | | X | X + v2 || | | X | +(3 rows) + +-- basic usage with 3 columns +select v, extract(year from d),count(*) from vct_data + group by 1, 2 order by 1,2 + \crosstabview + v | 2014 | 2015 ++--+-- + v0 |2 |1 + v1 | |2 + v2 | |1 +(3 rows) + +-- ordered months in horizontal header, enclosed column name +select v, to_char(d,'Mon') as "month name", extract(month from d) as num, + count(*) from vct_data group by 1,2,3 order by 1 + \crosstabview v "month name":num 4 + v | Jan | Mar | Apr | Jul | Dec ++-+-+-+-+- + v0 | | 1 | | 1 | 1 + v1 | | | 1 | 1 | + v2 | 1 | | | | +(3 rows) + +-- combine contents vertically into the same cell (V/H duplicates) +select v,h,c from vct_data order by 1,2,3 + \crosstabview 1 2 3 + v | h4 | | h0 | h2 | h1 ++-+-+-+-+- + v0 | dbl+| qux | | | +| qux | | | | + v1 | | | baz | foo | + v2 | | | | | bar +(3 rows) + +-- horizontal ASC order from window function +select v,h,c, row_number() over(order by h) as r from vct_data order by 1,3,2 + \crosstabview v h:r c + v | h0 | h1 | h2 | h4 | ++-+-+-+-+- + v0 | | | | dbl+| qux +| | | | qux | + v1 | baz | | foo | | + v2 | | bar | | | +(3 rows) + +-- horizontal DESC order from window function +select v,h,c, row_number() over(order by h DESC) as r from vct_data order by 1,3,2 + \crosstabview v h:r c + v | | h4 |
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Daniel Verite wrote: > > * In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')" > > block (line 497 in the attached), can't we do the same thing by using > > psprintf? > > In that block, we can't pass a cell contents as a valist and be done with > that cell, because duplicates of (col value,row value) may happen > at any iteration of the upper loop over PQntuples(results). Any cell really > may need reallocation unpredictably until that loop is done, whereas > psprintf starts by allocating a new buffer unconditionally, so it doesn't > look > to me like it could help to simplify that block. I messed with that code some more, as it looked unnecessarily complicated; please see attached and verify that it still behaves sanely. This needs those regression tests you promised. I tested a few cases and it seems good to me. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index d8b9a03..9c5a915 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -990,6 +990,113 @@ testdb= +\crosstabview [ +colV +colH +[:scolH] +[colG1[,colG2...]] +] + + +Execute the current query buffer (like \g) and shows +the results inside a crosstab grid. +The output column colV +becomes a vertical header +and the output column colH +becomes a horizontal header, optionally sorted by ranking data obtained +from scolH. + +colG1[,colG2...] +is the list of output columns to project into the grid. +By default, all output columns of the query except +colV and +colH +are included in this list. + + + +All columns can be refered to by their position (starting at 1), or by +their name. Normal case folding and quoting rules apply on column +names. By default, +colV corresponds to column 1 +and colH to column 2. +A query having only one output column cannot be viewed in crosstab, and +colH must differ from +colV. + + + +The vertical header, displayed as the leftmost column, +contains the deduplicated values found in +column colV, in the same +order as in the query results. + + +The horizontal header, displayed as the first row, +contains the deduplicated values found in +column colH, in +the order of appearance in the query results. +If specified, the optional scolH +argument refers to a column whose values should be integer numbers +by which colH will be sorted +to be positioned in the horizontal header. + + + +Inside the crosstab grid, +given a query output with N columns +(including colV and +colH), +for each distinct value x of +colH +and each distinct value y of +colV, +the contents of a cell located at the intersection +(x,y) is determined by these rules: + + + + if there is no corresponding row in the query results such that the + value for colH + is x and the value + for colV + is y, the cell is empty. + + + + + + if there is exactly one row such that the value + for colH + is x and the value + for colV + is y, then the N-2 other + columns or the columns listed in + colG1[,colG2...] + are displayed in the cell, separated between each other by + a space character if needed. + + If N=2, the letter X is displayed + in the cell as if a virtual third column contained that character. + + + + + + if there are several corresponding rows, the behavior is identical to + the case of one row except that the values coming from different rows + are stacked vertically, the different source rows being separated by + newline characters inside the cell. + + + + + + + + + + \d[S+] [ pattern ] @@ -4066,6 +4173,47 @@ first | 4 second | four + + When suitable, query results can be shown in a crosstab representation + with the \crosstabview command: + +testdb= SELECT first, second, first 2 AS gt2 FROM my_table; + first | second | ge2 +---++- + 1 | one| f + 2 | two| f + 3 | three | t + 4 | four | t +(4 rows) + +testdb= \crosstabview first second + first | one | two | three | four +---+-+-+---+-- + 1 | f | | | + 2 | | f | | + 3 | |
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On Thu, Apr 7, 2016 at 1:26 PM, Alvaro Herrerawrote: > I wonder if the business of appending values of multiple columns > separated with spaces is doing us any good. Why not require that > there's a single column in the cell? If the user wants to put things > together, they can use format() or just || the fields together. What > benefit is there to the ' '? When I ran my first test queries over > pg_class I was surprised about this behavior: > > alvherre=# select * from pg_class > alvherre=# \crosstabview relnatts relkind > > relnatts | >r > | t > | > i | >v > > --+++--+--- >26 | pg_statistic 11 11397 0 10 0 2619 0 15 380 15 2840 t f p 0 f f > f f f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null) > | > | > | >30 | pg_type 11 71 0 10 0 0 0 9 358 9 0 t f p 0 t f f f f f f t n > 540 1 {=r/alvherre} (null) > | > | > | > 3 | pg_user_mapping 11 11633 0 10 0 1418 0 0 0 0 0 t f p 0 t f f f > f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null) >+| pg_toast_2604 99 11642 0 10 0 2830 0 0 0 0 0 t f p 0 f f f f f f f t > n 540 1 (null) (null)+| pg_amop_opr_fam_index 11 0 0 10 403 2654 0 5 > 688 0 0 f f p 0 f f f f f f f t n 0 0 (null) (null)+| > pg_group 11 11661 0 10 0 11660 0 0 0 0 0 f f p 0 f f t f f f f t n 0 0 > {=r/alvherre} (null) > + > > > I'm tempted to rip that out, unless you have a reason not to. > > In fact, I think even the grouping of values of multiple rows with \n is > not terribly great either. Why not just require people to group the > values beforehand? You can use "string_agg(column, E'\n')" to get the > same behavior, plus you can do other things such as sum() etc. > Went and looked at the examples page and at first blush it seems like this module only understands text. My specific concern here is dealing with "numbers-as-text" sorting. As to the question of behavior when multiple columns (and rows?) are present: we need some sort of default do we not. Nothing is precluding the user from doing their own aggregates and limiting the select-list. That said I'm more inclined to error if the input data in not unique on (v,h). I feel the possibility of a user query bug going unnoticed in that scenario is reasonably large since its likely that only some combinations of duplicates appear. I'm a bit less tentative regarding column concatenation since I would expect that nearly every cell involved in the output would be noticeably affected. Though, if we are going to protect against extra rows extending that to protect against extra columns seems fair. Another option is, possibly conditioned on the first two columns being the headers, to only take the column in the third position (or, the first unassigned column).and display it. Otherwise if multiple candidate columns are present and none are chosen for the cell we could just error and force the user to explicitly choose. The concatenation behavior seems like the least useful default. I'm inclined to favor the first unassigned input column. And never allow (v,h) is violate uniqueness. David J.
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Alvaro Herrera wrote: > Daniel Verite wrote: > > > * A few examples in docs. The psql manpage should have at least two new > > > examples showing the crosstab features, one with the simplest case you > > > can think of, and another one showing all the features. > > > > Added that in the EXAMPLES section at the very end of the manpage. > > Ok. Seems a bit too short to me, and I don't like the fact that you > can't actually run it because you need to create the my_table > beforehand. I think it'd be better if you used a VALUES clause there, > so that the reader can cut'n paste into psql to start to play with the > feature. Oh, I noticed now that my_table was created by previous examples. Nevermind. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
I wonder if the business of appending values of multiple columns separated with spaces is doing us any good. Why not require that there's a single column in the cell? If the user wants to put things together, they can use format() or just || the fields together. What benefit is there to the ' '? When I ran my first test queries over pg_class I was surprised about this behavior: alvherre=# select * from pg_class alvherre=# \crosstabview relnatts relkind relnatts | r| t |i | v --+++--+--- 26 | pg_statistic 11 11397 0 10 0 2619 0 15 380 15 2840 t f p 0 f f f f f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null) | | | 30 | pg_type 11 71 0 10 0 0 0 9 358 9 0 t f p 0 t f f f f f f t n 540 1 {=r/alvherre} (null)| | | 3 | pg_user_mapping 11 11633 0 10 0 1418 0 0 0 0 0 t f p 0 t f f f f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null) +| pg_toast_2604 99 11642 0 10 0 2830 0 0 0 0 0 t f p 0 f f f f f f f t n 540 1 (null) (null)+| pg_amop_opr_fam_index 11 0 0 10 403 2654 0 5 688 0 0 f f p 0 f f f f f f f t n 0 0 (null) (null)+| pg_group 11 11661 0 10 0 11660 0 0 0 0 0 f f p 0 f f t f f f f t n 0 0 {=r/alvherre} (null) + I'm tempted to rip that out, unless you have a reason not to. In fact, I think even the grouping of values of multiple rows with \n is not terribly great either. Why not just require people to group the values beforehand? You can use "string_agg(column, E'\n')" to get the same behavior, plus you can do other things such as sum() etc. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Daniel Verite wrote: > > regression=# select * from pg_class \crosstabview relnatts > > \crosstabview: missing second argument > > regression-# > > Fixed. This was modelled after the behavior of: > select 1 \badcommand > but I've changed to mimic what happens with: > select 1 \g /some/invalid/path > the query buffer is not discarded by the error but the prompt > is ready for a fresh new command. Works for me. > > * A few examples in docs. The psql manpage should have at least two new > > examples showing the crosstab features, one with the simplest case you > > can think of, and another one showing all the features. > > Added that in the EXAMPLES section at the very end of the manpage. Ok. Seems a bit too short to me, and I don't like the fact that you can't actually run it because you need to create the my_table beforehand. I think it'd be better if you used a VALUES clause there, so that the reader can cut'n paste into psql to start to play with the feature. > > * In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')" > > block (line 497 in the attached), can't we do the same thing by using > > psprintf? > > In that block, we can't pass a cell contents as a valist and be done with > that cell, because duplicates of (col value,row value) may happen > at any iteration of the upper loop over PQntuples(results). Any cell really > may need reallocation unpredictably until that loop is done, whereas > psprintf starts by allocating a new buffer unconditionally, so it doesn't > look to me like it could help to simplify that block. I don't know what you mean, but here's what I meant. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/src/bin/psql/crosstabview.c b/src/bin/psql/crosstabview.c index b3510b9..0216dae 100644 --- a/src/bin/psql/crosstabview.c +++ b/src/bin/psql/crosstabview.c @@ -496,12 +496,12 @@ printCrosstab(const PGresult *results, int num_columns, { src_col = colsG[i]; - content = (!PQgetisnull(results, rn, src_col)) ? + content = !PQgetisnull(results, rn, src_col) ? PQgetvalue(results, rn, src_col) : (popt.nullPrint ? popt.nullPrint : ""); } -if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0') +if (cont.cells[idx] != NULL) { /* * Multiple values for the same (row,col) are projected @@ -509,12 +509,9 @@ printCrosstab(const PGresult *results, int num_columns, * previous content of the cell from the new value by a * newline. */ - int content_size; char *new_content; int idx2; - content_size = strlen(cont.cells[idx]) + 2 + strlen(content) + 1; - /* * idx2 is an index into allocated_cells. It differs from * idx (index into cont.cells), because vertical and @@ -524,34 +521,17 @@ printCrosstab(const PGresult *results, int num_columns, idx2 = (row_number * num_columns) + col_number; if (allocated_cells[idx2] != NULL) - { - new_content = pg_realloc(allocated_cells[idx2], content_size); - } + new_content = psprintf("%s%s%s", + allocated_cells[idx2], + i == 0 ? "\n" : " ", + content); else - { - /* - * At this point, cont.cells[idx] still contains a - * PQgetvalue() pointer. Just after, it will contain - * a new pointer maintained in allocated_cells[], and - * freed at the end of this function. - */ - new_content = pg_malloc(content_size); - strcpy(new_content, cont.cells[idx]); - } - cont.cells[idx] = new_content; - allocated_cells[idx2] = new_content; + new_content = psprintf("%s", content); - /* - * Contents that are on adjacent columns in the source - * results get separated by one space in the target. - * Contents that are on different rows in the source get - * separated by newlines in the target. - */ - if (i == 0) - strcat(new_content, "\n"); - else - strcat(new_content, " "); - strcat(new_content, content); + cont.cells[idx] = new_content; + if (allocated_cells[idx2] != NULL) + pg_free(allocated_cells[idx2]); + allocated_cells[idx2] = new_content; } else { -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Alvaro Herrera wrote: Thanks for looking into that patch! > regression=# select * from pg_class \crosstabview relnatts > \crosstabview: missing second argument > regression-# Fixed. This was modelled after the behavior of: select 1 \badcommand but I've changed to mimic what happens with: select 1 \g /some/invalid/path the query buffer is not discarded by the error but the prompt is ready for a fresh new command. > alvherre=# select * from pg_class \crosstabview relnatts relkinda > Invalid column name: relkinda > alvherre=# select 1; > The query must return at least two columns to be shown in crosstab Definitely a bug. Fixed. Also fixed a one-off bug with quoted columns: in parseColumnRefs(), first call to PQmblen(), I wrongly assumed that PQmblen("", ..) returns 0, whereas in fact it returns 1. > * A few examples in docs. The psql manpage should have at least two new > examples showing the crosstab features, one with the simplest case you > can think of, and another one showing all the features. Added that in the EXAMPLES section at the very end of the manpage. > * Add regression test cases somewhere for the regression database. > Probably use "FROM tenk1 WHERE hundred < 5", which provides you with 500 > rows, enough for many interesting games. Make sure to test all the > provided features. I would use a new psql.sql file for this. Looking into regression tests, not yet done. > * How did you come up with the 1600 value? Whatever it is, please use a > #define instead of hardcoding it. Done with accompanying comment in crosstabview.h > * In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')" > block (line 497 in the attached), can't we do the same thing by using > psprintf? In that block, we can't pass a cell contents as a valist and be done with that cell, because duplicates of (col value,row value) may happen at any iteration of the upper loop over PQntuples(results). Any cell really may need reallocation unpredictably until that loop is done, whereas psprintf starts by allocating a new buffer unconditionally, so it doesn't look to me like it could help to simplify that block. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index d8b9a03..1072733 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -990,6 +990,113 @@ testdb= +\crosstabview [ +colV +colH +[:scolH] +[colG1[,colG2...]] +] + + +Execute the current query buffer (like \g) and shows +the results inside a crosstab grid. +The output column colV +becomes a vertical header +and the output column colH +becomes a horizontal header, optionally sorted by ranking data obtained +from scolH. + +colG1[,colG2...] +is the list of output columns to project into the grid. +By default, all output columns of the query except +colV and +colH +are included in this list. + + + +All columns can be refered to by their position (starting at 1), or by +their name. Normal case folding and quoting rules apply on column +names. By default, +colV corresponds to column 1 +and colH to column 2. +A query having only one output column cannot be viewed in crosstab, and +colH must differ from +colV. + + + +The vertical header, displayed as the leftmost column, +contains the deduplicated values found in +column colV, in the same +order as in the query results. + + +The horizontal header, displayed as the first row, +contains the deduplicated values found in +column colH, in +the order of appearance in the query results. +If specified, the optional scolH +argument refers to a column whose values should be integer numbers +by which colH will be sorted +to be positioned in the horizontal header. + + + +Inside the crosstab grid, +given a query output with N columns +(including colV and +colH), +for each distinct value x of +colH +and each distinct value y of +colV, +the contents of a cell located at the intersection +(x,y) is determined by these rules: + + + + if there is no corresponding row in the query results such that the + value for colH + is x and the value + for colV + is y, the cell is empty. + + + + + + if there is exactly one row such that the value + for colH + is x and the value + for colV + is y, then the N-2 other +
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
I've been looking at this patch. First thing was to rebase on top of recent psql code restructuring; second, pgindent; third, reordered the code in crosstabview.c more sensibly (had to add prototypes). New version attached. Then I looked at the docs to try to figure out exactly how it works. I'm surprised that there's not a single example added to the psql manpage. Please add one. I then tested it a bit, "kick the tires" so to speak. I noticed that error handling is broken. For instance, observe the query prompt after the error: regression=# select * from pg_class \crosstabview relnatts \crosstabview: missing second argument regression-# At this point the query buffer contains the query (you can see it with \e), which seems bogus to me. The query buffer needs to be reset. Compare \gexec: alvherre=# select 1 \gexec ERROR: error de sintaxis en o cerca de «1» LÍNEA 1: 1 ^ alvherre=# Also, using bogus column names as arguments cause state to get all bogus: alvherre=# select * from pg_class \crosstabview relnatts relkinda Invalid column name: relkinda alvherre=# select 1; The query must return at least two columns to be shown in crosstab Note that the second query is not crosstab at all, yet the error message is entirely bogus. This one is probably the same bug: alvherre=# select 'one', 'two'; Invalid column name: relnatts Apparently, once in that state, not even a successful query crosstab display resets the state correctly: alvherre=# select * from pg_class \crosstabview relnatts relkinda Invalid column name: relkinda alvherre=# select 'one' as relnatts, 'two' as relkinda \crosstabview relnatts | two --+- one | X (1 fila) alvherre=# select 1; The query must return at least two columns to be shown in crosstab Please fix this. Some additional items: * A few examples in docs. The psql manpage should have at least two new examples showing the crosstab features, one with the simplest case you can think of, and another one showing all the features. * Add regression test cases somewhere for the regression database. Probably use "FROM tenk1 WHERE hundred < 5", which provides you with 500 rows, enough for many interesting games. Make sure to test all the provided features. I would use a new psql.sql file for this. * How did you come up with the 1600 value? Whatever it is, please use a #define instead of hardcoding it. * In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')" block (line 497 in the attached), can't we do the same thing by using psprintf? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index d8b9a03..536141c 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -990,6 +990,113 @@ testdb= +\crosstabview [ +colV +colH +[:scolH] +[colG1[,colG2...]] +] + + +Execute the current query buffer (like \g) and shows +the results inside a crosstab grid. +The output column colV +becomes a vertical header +and the output column colH +becomes a horizontal header, optionally sorted by ranking data obtained +from scolH. + +colG1[,colG2...] +is the list of output columns to project into the grid. +By default, all output columns of the query except +colV and +colH +are included in this list. + + + +All columns can be refered to by their position (starting at 1), or by +their name. Normal case folding and quoting rules apply on column +names. By default, +colV corresponds to column 1 +and colH to column 2. +A query having only one output column cannot be viewed in crosstab, and +colH must differ from +colV. + + + +The vertical header, displayed as the leftmost column, +contains the deduplicated values found in +column colV, in the same +order as in the query results. + + +The horizontal header, displayed as the first row, +contains the deduplicated values found in +column colH, in +the order of appearance in the query results. +If specified, the optional scolH +argument refers to a column whose values should be integer numbers +by which colH will be sorted +to be positioned in the horizontal header. + + + +Inside the crosstab grid, +given a query output with N columns +(including colV and +colH), +for each distinct value x of +colH +and each distinct value y of +colV, +the contents of a cell located at the intersection +(x,y) is determined by these rules: +
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Robert Haas wrote: > On Sun, Mar 20, 2016 at 5:27 PM, Pavel Stehule> wrote: > > From my perspective, it is ready for commiter. Daniel solved the most big > > issues. > > OK, so that brings us back to: is there any committer who likes this > enough to want to look at committing it? My view hasn't changed much > since > http://www.postgresql.org/message-id/ca+tgmoz4yaduq9j8xtgrbh868jh2nj_nw_qgkxb32cedsvt...@mail.gmail.com I volunteer for that, but it'll be a few days so if anyone else is interested, feel free. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On Sun, Mar 20, 2016 at 5:27 PM, Pavel Stehulewrote: > From my perspective, it is ready for commiter. Daniel solved the most big > issues. OK, so that brings us back to: is there any committer who likes this enough to want to look at committing it? My view hasn't changed much since http://www.postgresql.org/message-id/ca+tgmoz4yaduq9j8xtgrbh868jh2nj_nw_qgkxb32cedsvt...@mail.gmail.com -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Hi 2016-03-19 16:31 GMT+01:00 Alvaro Herrera: > Pavel Stehule wrote: > > > Can I do review? > > Of course. > I did review of last patch. I had to do small changes to run the code due last Tom's changes in psql. Updated patch is attached. The last changes in this patch are two: 1. Remove strange server side sorting 2. Cleaning/reducing interface Other code is +/- without changes. There was lot of discussion in this thread, I would not to repeat it. I'll comment the changes: @1 using server side sorting was really generic, but strange. Now, the crosstabview works without it without any significant functionality degradation. @2 interface is minimalist - but good enough - I am thinking so it is good start point. I was able to run my examples without problems. The previous API was more comfortable - "+","-" symbols allows to specify order quickly, but without a agreement we can live without this feature. Now, a order of data is controlled fully by SQL. crosstabview does data visualization only. I have not any objection to this last design. It is reduced to minimum, but still it works well. * All regress tests passed * A code is well and well commented * No new warnings or compilation issues * Documentation is clean I have two minor notes, can be fixed simply, if we accept this last design: 1. can be nice if documentation will contains one example 2. some regress tests >From my perspective, it is ready for commiter. Daniel solved the most big issues. Regards Pavel > -- > Álvaro Herrerahttp://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml new file mode 100644 index 8a85804..da0621b *** a/doc/src/sgml/ref/psql-ref.sgml --- b/doc/src/sgml/ref/psql-ref.sgml *** testdb= *** 990,995 --- 990,1102 + \crosstabview [ + colV + colH + [:scolH] + [colG1[,colG2...]] + ] + + + Execute the current query buffer (like \g) and shows + the results inside a crosstab grid. + The output column colV + becomes a vertical header + and the output column colH + becomes a horizontal header, optionally sorted by ranking data obtained + from scolH. + + colG1[,colG2...] + is the list of output columns to project into the grid. + By default, all output columns of the query except + colV and + colH + are included in this list. + + + + All columns can be refered to by their position (starting at 1), or by + their name. Normal case folding and quoting rules apply on column + names. By default, + colV corresponds to column 1 + and colH to column 2. + A query having only one output column cannot be viewed in crosstab, and + colH must differ from + colV. + + + + The vertical header, displayed as the leftmost column, + contains the deduplicated values found in + column colV, in the same + order as in the query results. + + + The horizontal header, displayed as the first row, + contains the deduplicated values found in + column colH, in + the order of appearance in the query results. + If specified, the optional scolH + argument refers to a column whose values should be integer numbers + by which colH will be sorted + to be positioned in the horizontal header. + + + + Inside the crosstab grid, + given a query output with N columns + (including colV and + colH), + for each distinct value x of + colH + and each distinct value y of + colV, + the contents of a cell located at the intersection + (x,y) is determined by these rules: + + + + if there is no corresponding row in the query results such that the + value for colH + is x and the value + for colV + is y, the cell is empty. + + + + + + if there is exactly one row such that the value + for colH + is x and the value + for colV + is y, then the N-2 other + columns or the columns listed in + colG1[,colG2...] + are displayed in the cell, separated between each other by + a space character if needed. + + If N=2, the letter X is displayed + in the cell as if a virtual third column contained that character. + + + + + + if there are several corresponding rows, the behavior is identical to + the case of one row except that
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Pavel Stehule wrote: > Can I do review? Of course. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
2016-03-19 15:45 GMT+01:00 Robert Haas: > On Mon, Mar 14, 2016 at 2:55 PM, Robert Haas > wrote: > > On Sat, Mar 12, 2016 at 10:34 AM, Daniel Verite > wrote: > >>> But worse than either of those things, there is no real > >>> agreement on what the overall design of this feature > >>> should be. > >> > >> The part in the design that raised concerns upthread is > >> essentially how headers sorting is exposed to the user and > >> implemented. > >> > >> As suggested in [1], I've made some drastic changes in the > >> attached patch to take the comments (from Dean R., Tom L.) > >> into account. > >> [ ... lengthy explanation ... ] > >> - also NULLs are no longer excluded from headers, per Peter E. > >> comment in [2]. > > > > Dean, Tom, Peter, what do you think of the new version? > > Is anyone up for re-reviewing this? If not, I think we're going to > have to reject this for lack of interest. > Can I do review? Pavel > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On Mon, Mar 14, 2016 at 2:55 PM, Robert Haaswrote: > On Sat, Mar 12, 2016 at 10:34 AM, Daniel Verite > wrote: >>> But worse than either of those things, there is no real >>> agreement on what the overall design of this feature >>> should be. >> >> The part in the design that raised concerns upthread is >> essentially how headers sorting is exposed to the user and >> implemented. >> >> As suggested in [1], I've made some drastic changes in the >> attached patch to take the comments (from Dean R., Tom L.) >> into account. >> [ ... lengthy explanation ... ] >> - also NULLs are no longer excluded from headers, per Peter E. >> comment in [2]. > > Dean, Tom, Peter, what do you think of the new version? Is anyone up for re-reviewing this? If not, I think we're going to have to reject this for lack of interest. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On Sat, Mar 12, 2016 at 10:34 AM, Daniel Veritewrote: >> But worse than either of those things, there is no real >> agreement on what the overall design of this feature >> should be. > > The part in the design that raised concerns upthread is > essentially how headers sorting is exposed to the user and > implemented. > > As suggested in [1], I've made some drastic changes in the > attached patch to take the comments (from Dean R., Tom L.) > into account. > [ ... lengthy explanation ... ] > - also NULLs are no longer excluded from headers, per Peter E. > comment in [2]. Dean, Tom, Peter, what do you think of the new version? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Jim Nasby wrote: > Ultimately I'd really like some way to remove/reduce the restriction of > result set definitions needing to be determined at plan time. That would > open the door for server-side crosstab/pivot as well a a host of other > things (such as dynamically turning a hstore/json/xml field into a > recordset). > Ultimately I'd really like some way to remove/reduce the restriction of > result set definitions needing to be determined at plan time. That would > open the door for server-side crosstab/pivot as well a a host of other > things (such as dynamically turning a hstore/json/xml field into a > recordset). That would go against a basic expectation of prepared statements, the fact that queries can be parsed/prepared without any part of them being executed. For a dynamic pivot, but probably also for the other examples you have in mind, the SQL engine wouldn't be able to determine the output columns without executing a least a subselect to look inside some table(s). I suspect that the implications of this would be so far reaching and problematic that it will just not happen. It seems to me that a dynamic pivot will always consist of two SQL queries that can never be combined into one, unless using a workaround à la Oracle, which encapsulates the entire dynamic resultset into an XML blob as output. The problem here being that the client-side tools that people routinely use are not equipped to process it anyway; at least that's what I find by anecdotal evidence for instance in: https://community.oracle.com/thread/2133154?tstart=0 or http://stackoverflow.com/questions/19298424 or https://community.oracle.com/thread/2388982?tstart=0 Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Jim Nasby wrote: > On 3/13/16 12:48 AM, Pavel Stehule wrote: > >crosstabview is really visualization tool. **But now, there are not any > >other tool available from terminal.** So this can be significant help to > >all people who would to use this functionality. > > Not just the terminal either. Offhand I'm not aware of *any* fairly simple > tool that provides crosstab. There's a bunch of complicated/expensive BI > tools that do, but unless you've gone through the trouble of getting one of > those setup you're currently pretty stuck. I'm definitely +1 for this feature in psql also. Some years ago we had a discussion about splitting psql in two parts, a bare-bones one which would help script-writing and another one with fancy features; we decided to keep one tool to rule them all and made the implicit decision that we would grow exotic, sophisticated features into psql. ISTM that this patch is going in that direction. > Ultimately I'd really like some way to remove/reduce the restriction of > result set definitions needing to be determined at plan time. That would > open the door for server-side crosstab/pivot as well a a host of other > things (such as dynamically turning a hstore/json/xml field into a > recordset). That seems so far down the road that I don't think it should block the psql feature being proposed in this thread, but yes I would like that one too. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 3/13/16 12:48 AM, Pavel Stehule wrote: crosstabview is really visualization tool. **But now, there are not any other tool available from terminal.** So this can be significant help to all people who would to use this functionality. Not just the terminal either. Offhand I'm not aware of *any* fairly simple tool that provides crosstab. There's a bunch of complicated/expensive BI tools that do, but unless you've gone through the trouble of getting one of those setup you're currently pretty stuck. Ultimately I'd really like some way to remove/reduce the restriction of result set definitions needing to be determined at plan time. That would open the door for server-side crosstab/pivot as well a a host of other things (such as dynamically turning a hstore/json/xml field into a recordset). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
2016-03-11 14:49 GMT+01:00 Robert Haas: > On Thu, Feb 18, 2016 at 9:23 AM, Daniel Verite > wrote: > > Dean Rasheed wrote: > > > >> If I want to sort the rows coming out of a query, my first thought > >> is always going to be to add/adjust the query's ORDER BY clause, not > >> use some weird +/- psql syntax. > > > > About the vertical sort, I agree on all your points. > > It's best to rely on ORDER BY for all the reasons mentioned, > > as opposed to a separate sort in a second step. > > > > But you're considering the case when a user is designing > > or adapting a query for the purpose of crosstab > > viewing. As mentioned in my previous reply (about the > > methods to achieve horizontal sort), that scenario is not really > > what motivates the feature in the first place. > > > > If removing that sort option is required to move forward > > with the patch because it's controversial, so be it, > > but overall I don't see this as a benefit for the end user, > > it's just an option. > > Discussion on this patch seems to have died off. I'm probably not > going to win any popularity contests for saying this, but I think we > should reject this patch. I don't feel like this is really a psql > feature: it's a powerful data visualization tool which we're proposing > to jam into psql. I don't think that's psql's purpose. I also think > it's quite possible that there could be an unbounded number of > slightly different things that people want here, and if we take this > one and a couple more, the code for these individual features could > come to be larger than all of psql, even though probably 95% of psql > users would never use any of those. > crosstabview is really visualization tool. **But now, there are not any other tool available from terminal.** So this can be significant help to all people who would to use this functionality. The psql has lot of features for 5% users. Currently it is famous not as "bloated software" but like most comfortable sql console on the world. The implementation of crosstabview is not complex and with last Daniel's modification the complexity is less. The crosstabview is not 100% equal to ANSI SQL PIVOT clause. The ANSI SQL command is much more rigid (it is one stage statement with predefined columns), so argument of duplicate implementation one things is not valid. Probably we would not implement non ANSI SQL feature on server. Regards Pavel > > Now, that having been said, if other people want this feature to go in > and are willing to do the work to get it in, I've said my piece and > won't complain further. There are a couple of committers who have > taken positive interest in this thread, so that's good. However, > there are also a couple of committers who have expressed doubts > similar to mine, so that's not so good. But worse than either of > those things, there is no real agreement on what the overall design of > this feature should be. Everybody wants something a little different, > for different reasons. If we can't come to an agreement, more or less > immediately, on what to try to get into 9.6, then this can't go into > this release. Whether it should go into a future release is a > question we can leave for another time. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Robert Haas wrote: > But worse than either of those things, there is no real > agreement on what the overall design of this feature > should be. The part in the design that raised concerns upthread is essentially how headers sorting is exposed to the user and implemented. As suggested in [1], I've made some drastic changes in the attached patch to take the comments (from Dean R., Tom L.) into account. The idea is to limit to the bare minimum the involvement of psql in sorting: - the +/- syntax goes away - the possibility of post-sorting the values through a backdoor query goes away too, for both headers. - the vertical order of the crosstab view is now driven solely by the order in the query - the order of the horizontal header can be optionally specified by a column expected to contain an integer, with the syntax \crosstabview colv colh:scolh [other cols] which means "colh" will be sorted by "scolh". It still defaults to whatever order "colh" comes in from the results Concerning the optional "scolh", there are cases where it might pre-exist naturally, such as a month number going in pair with a month name. In other cases, a user may add it as a kind of "synthetic column" by way of a window function, for example: SELECT ...other columns..., (row_number() over(order by something [order options]) as scolh FROM... Only the relative order of scolh values is taken into account, the value itself has no meaning for crosstabview. - also NULLs are no longer excluded from headers, per Peter E. comment in [2]. [1] http://www.postgresql.org/message-id/3d513263-104b-41e3-b1c7-4ad4bd99c491@mm [2] http://www.postgresql.org/message-id/56c4e344.6070...@gmx.net Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 8a85804..da0621b 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -990,6 +990,113 @@ testdb= +\crosstabview [ +colV +colH +[:scolH] +[colG1[,colG2...]] +] + + +Execute the current query buffer (like \g) and shows +the results inside a crosstab grid. +The output column colV +becomes a vertical header +and the output column colH +becomes a horizontal header, optionally sorted by ranking data obtained +from scolH. + +colG1[,colG2...] +is the list of output columns to project into the grid. +By default, all output columns of the query except +colV and +colH +are included in this list. + + + +All columns can be refered to by their position (starting at 1), or by +their name. Normal case folding and quoting rules apply on column +names. By default, +colV corresponds to column 1 +and colH to column 2. +A query having only one output column cannot be viewed in crosstab, and +colH must differ from +colV. + + + +The vertical header, displayed as the leftmost column, +contains the deduplicated values found in +column colV, in the same +order as in the query results. + + +The horizontal header, displayed as the first row, +contains the deduplicated values found in +column colH, in +the order of appearance in the query results. +If specified, the optional scolH +argument refers to a column whose values should be integer numbers +by which colH will be sorted +to be positioned in the horizontal header. + + + +Inside the crosstab grid, +given a query output with N columns +(including colV and +colH), +for each distinct value x of +colH +and each distinct value y of +colV, +the contents of a cell located at the intersection +(x,y) is determined by these rules: + + + + if there is no corresponding row in the query results such that the + value for colH + is x and the value + for colV + is y, the cell is empty. + + + + + + if there is exactly one row such that the value + for colH + is x and the value + for colV + is y, then the N-2 other + columns or the columns listed in + colG1[,colG2...] + are displayed in the cell, separated between each other by + a space character if needed. + + If N=2, the letter X is displayed + in the cell as if a virtual third column contained that character. + + + + + + if there are several corresponding rows, the behavior is identical to +
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On Thu, Feb 18, 2016 at 9:23 AM, Daniel Veritewrote: > Dean Rasheed wrote: > >> If I want to sort the rows coming out of a query, my first thought >> is always going to be to add/adjust the query's ORDER BY clause, not >> use some weird +/- psql syntax. > > About the vertical sort, I agree on all your points. > It's best to rely on ORDER BY for all the reasons mentioned, > as opposed to a separate sort in a second step. > > But you're considering the case when a user is designing > or adapting a query for the purpose of crosstab > viewing. As mentioned in my previous reply (about the > methods to achieve horizontal sort), that scenario is not really > what motivates the feature in the first place. > > If removing that sort option is required to move forward > with the patch because it's controversial, so be it, > but overall I don't see this as a benefit for the end user, > it's just an option. Discussion on this patch seems to have died off. I'm probably not going to win any popularity contests for saying this, but I think we should reject this patch. I don't feel like this is really a psql feature: it's a powerful data visualization tool which we're proposing to jam into psql. I don't think that's psql's purpose. I also think it's quite possible that there could be an unbounded number of slightly different things that people want here, and if we take this one and a couple more, the code for these individual features could come to be larger than all of psql, even though probably 95% of psql users would never use any of those. Now, that having been said, if other people want this feature to go in and are willing to do the work to get it in, I've said my piece and won't complain further. There are a couple of committers who have taken positive interest in this thread, so that's good. However, there are also a couple of committers who have expressed doubts similar to mine, so that's not so good. But worse than either of those things, there is no real agreement on what the overall design of this feature should be. Everybody wants something a little different, for different reasons. If we can't come to an agreement, more or less immediately, on what to try to get into 9.6, then this can't go into this release. Whether it should go into a future release is a question we can leave for another time. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Dean Rasheed wrote: > If I want to sort the rows coming out of a query, my first thought > is always going to be to add/adjust the query's ORDER BY clause, not > use some weird +/- psql syntax. About the vertical sort, I agree on all your points. It's best to rely on ORDER BY for all the reasons mentioned, as opposed to a separate sort in a second step. But you're considering the case when a user is designing or adapting a query for the purpose of crosstab viewing. As mentioned in my previous reply (about the methods to achieve horizontal sort), that scenario is not really what motivates the feature in the first place. If removing that sort option is required to move forward with the patch because it's controversial, so be it, but overall I don't see this as a benefit for the end user, it's just an option. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Peter Eisentraut wrote: > On 2/9/16 11:21 AM, Daniel Verite wrote: > > Note that NULL values in the column that pivots are discarded > > by \crosstabview, because NULL as the name of a column does not > > make sense. > > Why not? > > All you're doing is printing it out, and psql is quite capable of > printing a null value. Initially it's by analogy with the crosstab SRF, but it's true that the same principle does not have to apply to crosstabview. The code could set in the header whatever text "pset null" is set to, at the place where a pivoted NULL would be supposed to go if it was not filtered out in the first place. I'll consider implementing that change if there's no objection. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Daniel Verite wrote: > > > ORDER BY name > > > \crosstabview cols = (select to_char(d, 'Mon') from > > > generate_series('2000-01-01'::date, '2000-12-01', '1 month') d) > > > > My concern with that is that often you don't know what the columns will > > be, because you don't know what exact data the query will produce. So to > > use this syntax you'd have to re-create a huge chunk of the original > > query. :( > > Also, if that additional query refers to tables, it should be executed > with the same data visibility as the main query. Doesn't that mean > that both queries should happen within the same repeatable > read transaction? > > Another impractical aspect of this approach is that a > meta-command invocation in psql must fit on a single line, so > queries containing newlines are not acceptable as argument. > This problem exists with "\copy (select...) to ..." already. Thinking more about that, it occurs to me that if the sort must come from a user-supplied bit of SQL, it would be simpler to just direct the user to submit it in the main query, in an additional dedicated column. For instance, to get a specific, separate order on "h", let the user change this: SELECT v, h, c FROM v_data ORDER BY v; into that: SELECT v, h, row_number() over(order by h) as hn, c FROM v_data ORDER BY v; then with a relatively simple modification to the patch, this invocation: \crosstabview v h:hn c would display "h" in the horizontal header ordered by "hn". ISTM this handles two objections raised upthread: 1. The ORDER BY inside OVER() can be augmented with additional clauses such as lc_collate, desc, nulls last, etc... contrary to the controversed "+/-" syntax. 2. a post-sort "backdoor" query is no longer necessary. The drawback for me is that this change doesn't play out with my original scenario for the command, which is to give the ability to scrutinize query results in crosstab mode, playing with variations on what column is pivoted and how headers for both directions get sorted, while ideally not changing _at all_ the original query in the query buffer, but just invoking successive \crosstabview [args] commands with varying arguments. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 2/17/16 9:03 AM, Dean Rasheed wrote: I'm not totally opposed to specifying a column sort order in psql, and perhaps there's a way to support both 'cols' and 'col_order' options in psql, since there are different situations where one or the other might be more useful. Yeah. If there was some magic way to reference the underlying data with your syntax it probably wouldn't be that bad. AIUI normally we're just dumping data into a Portal and there's no option to read back from it, but if the query results were first put in a tuplestore then I suspect it wouldn't be that hard to query against it and produce another result set. What I am opposed to is specifying the row order in psql, because IMO that's something that should be done entirely in the SQL query. +1 -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 2/9/16 11:21 AM, Daniel Verite wrote: > Note that NULL values in the column that pivots are discarded > by \crosstabview, because NULL as the name of a column does not > make sense. Why not? All you're doing is printing it out, and psql is quite capable of printing a null value. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Jim Nasby wrote: > > ORDER BY name > > \crosstabview cols = (select to_char(d, 'Mon') from > > generate_series('2000-01-01'::date, '2000-12-01', '1 month') d) > > My concern with that is that often you don't know what the columns will > be, because you don't know what exact data the query will produce. So to > use this syntax you'd have to re-create a huge chunk of the original > query. :( Also, if that additional query refers to tables, it should be executed with the same data visibility as the main query. Doesn't that mean that both queries should happen within the same repeatable read transaction? Another impractical aspect of this approach is that a meta-command invocation in psql must fit on a single line, so queries containing newlines are not acceptable as argument. This problem exists with "\copy (select...) to ..." already. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 17 February 2016 at 02:32, Jim Nasbywrote: > On 2/11/16 4:21 AM, Dean Rasheed wrote: >> >> Thinking about this some more though, perhaps*sorting* the columns is >> the wrong way to be thinking about it. Perhaps a better approach would >> be to allow the columns to be*listed* (optionally, using a separate >> query). Something like the following (don't get too hung up on the >> syntax): >> >> SELECT name, >> to_char(date, 'Mon') AS month, >> sum(amount) AS amount >> FROM invoices >> GROUP BY 1,2 >> ORDER BY name >> \crosstabview cols = (select to_char(d, 'Mon') from >> generate_series('2000-01-01'::date, '2000-12-01', '1 month') d) > > > My concern with that is that often you don't know what the columns will be, > because you don't know what exact data the query will produce. So to use > this syntax you'd have to re-create a huge chunk of the original query. :( > Yeah, that's a reasonable concern. On the flip side, one of the advantages of the above syntax is that you have absolute control over the columns, whereas with the sort-based syntax you might find some columns missing (e.g., if there were no invoices in August) and that could lead to confusion parsing the results. I'm not totally opposed to specifying a column sort order in psql, and perhaps there's a way to support both 'cols' and 'col_order' options in psql, since there are different situations where one or the other might be more useful. What I am opposed to is specifying the row order in psql, because IMO that's something that should be done entirely in the SQL query. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 15 February 2016 at 14:08, Daniel Veritewrote: > Dean Rasheed wrote: > >> My biggest problem is with the sorting, for all the reasons discussed >> above. There is absolutely no reason for \crosstabview to be >> re-sorting rows -- they should just be left in the original query >> result order > > Normal top-down display: > > select v,to_char(d,'Mon') as m, c from v_data order by d; > > v | m | c > +-+- > v2 | Jan | bar > v1 | Apr | foo > v1 | Jul | baz > v0 | Jul | qux > > At this point, it seems to me that it's perfectly reasonable for our user > to expect the possibility of sorting additionally by "v" , without > changing the query and without changing the order of the horizontal > header: > > \crosstabview +v m c > > v | Jan | Apr | Jul > +-+-+- > v0 | | | qux > v1 | | foo | baz > v2 | bar | | > I don't find that example particularly compelling. If I want to sort the rows coming out of a query, my first thought is always going to be to add/adjust the query's ORDER BY clause, not use some weird +/- psql syntax. The crux of the problem here is that in a pivoted query resultset SQL can be used to control the order of the rows or the columns, but not both at the same time. IMO it is more natural to use SQL to control the order of the rows. The columns are the result of the psql pivoting, so it's reasonable to control them via psql options. A couple of other points to bear in mind: The number of columns is always going to be quite limited (at most 1600, and usually far less than that), whereas the number of rows could be arbitrarily large. So sorting the rows client-side in the way that you are could get very inefficient, whereas that's not such a problem for the columns. The column values are non-NULL, so they require a more limited set of sort options, whereas the rows could be anything, and people will want all the sort options to be available. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 2/11/16 4:21 AM, Dean Rasheed wrote: Thinking about this some more though, perhaps*sorting* the columns is the wrong way to be thinking about it. Perhaps a better approach would be to allow the columns to be*listed* (optionally, using a separate query). Something like the following (don't get too hung up on the syntax): SELECT name, to_char(date, 'Mon') AS month, sum(amount) AS amount FROM invoices GROUP BY 1,2 ORDER BY name \crosstabview cols = (select to_char(d, 'Mon') from generate_series('2000-01-01'::date, '2000-12-01', '1 month') d) My concern with that is that often you don't know what the columns will be, because you don't know what exact data the query will produce. So to use this syntax you'd have to re-create a huge chunk of the original query. :( -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Dean Rasheed wrote: > My biggest problem is with the sorting, for all the reasons discussed > above. There is absolutely no reason for \crosstabview to be > re-sorting rows -- they should just be left in the original query > result order We want the option to sort the vertical the header in a late additional step when the ORDER BY of the query is already assigned to another purpose. I've submitted this example on the wiki: https://wiki.postgresql.org/wiki/Crosstabview create view v_data as select * from ( values ('v1','h2','foo', '2015-04-01'::date), ('v2','h1','bar', '2015-01-02'), ('v1','h0','baz', '2015-07-12'), ('v0','h4','qux', '2015-07-15') ) as l(v,h,c,d); Normal top-down display: select v,to_char(d,'Mon') as m, c from v_data order by d; v | m | c +-+- v2 | Jan | bar v1 | Apr | foo v1 | Jul | baz v0 | Jul | qux Crosstabview display without any additional sort: \crosstabview v m c v | Jan | Apr | Jul +-+-+- v2 | bar | | v1 | | foo | baz v0 | | | qux "d" is not present the resultset but it drives the sort so that month names come out in the natural order. \crosstabview does not discard the order of colH nor the order of colV, it follows both, so that we get v2,v1,v0 in this order in the leftmost column (vertical header) just like in the resultset. At this point, it seems to me that it's perfectly reasonable for our user to expect the possibility of sorting additionally by "v" , without changing the query and without changing the order of the horizontal header: \crosstabview +v m c v | Jan | Apr | Jul +-+-+- v0 | | | qux v1 | | foo | baz v2 | bar | | Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Alvaro Herrera wrote: > So please can we have that wiki page so that the syntax can be hammered > out a bit more. I've added a wiki page with explanation and examples here: https://wiki.postgresql.org/wiki/Crosstabview Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 2016-02-09 09:27:04 +, Dean Rasheed wrote: > Looking at this patch, I have mixed feelings about it. On the one hand > I really like the look of the output, and I can see that the non-fixed > nature of the output columns makes this hard to achieve server-side. > But on the other hand, this seems to be going way beyond the normal > level of result formatting that something like \x does, and I find the > syntax for sorting particularly ugly. I've pretty similar doubts. Addinging features to psql which are complex enough that it's likely that people will be forced to parse psql output... On the other hand, a proper server side solution won't be easy; so maybe this is a okay enough stopgap. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 11 February 2016 at 08:43, Andres Freundwrote: > On 2016-02-09 09:27:04 +, Dean Rasheed wrote: >> Looking at this patch, I have mixed feelings about it. On the one hand >> I really like the look of the output, and I can see that the non-fixed >> nature of the output columns makes this hard to achieve server-side. > >> But on the other hand, this seems to be going way beyond the normal >> level of result formatting that something like \x does, and I find the >> syntax for sorting particularly ugly. > > I've pretty similar doubts. Addinging features to psql which are complex > enough that it's likely that people will be forced to parse psql > output... On the other hand, a proper server side solution won't be > easy; so maybe this is a okay enough stopgap. > Well to be clear, I like the idea of this feature, and I'm not trying to stand in the way of progressing it. However, I can't see myself committing it in its current form. My biggest problem is with the sorting, for all the reasons discussed above. There is absolutely no reason for \crosstabview to be re-sorting rows -- they should just be left in the original query result order. Sorting columns is a little more understandable, since there is no way for the original query to control the order in which the colV values come out, but Tom raises a good point -- there are far too many bells and whistles when it comes to sorting, and we don't want to be adding all of them to the psql syntax. Thinking about this some more though, perhaps *sorting* the columns is the wrong way to be thinking about it. Perhaps a better approach would be to allow the columns to be *listed* (optionally, using a separate query). Something like the following (don't get too hung up on the syntax): SELECT name, to_char(date, 'Mon') AS month, sum(amount) AS amount FROM invoices GROUP BY 1,2 ORDER BY name \crosstabview cols = (select to_char(d, 'Mon') from generate_series('2000-01-01'::date, '2000-12-01', '1 month') d) Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Thinking about this some more though, perhaps *sorting* the columns is > the wrong way to be thinking about it. Perhaps a better approach would > be to allow the columns to be *listed* (optionally, using a separate > query). Something like the following (don't get too hung up on the > syntax): > > SELECT name, >to_char(date, 'Mon') AS month, >sum(amount) AS amount > FROM invoices > GROUP BY 1,2 > ORDER BY name > \crosstabview cols = (select to_char(d, 'Mon') from > generate_series('2000-01-01'::date, '2000-12-01', '1 month') d) > The idea is ok, but this design cannot be described as user friendly. The work with time dimension is pretty common, and should be supported by some short user friendly syntax. Regards Pavel > > Regards, > Dean >
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Hi Looking at this patch, I have mixed feelings about it. On the one hand > I really like the look of the output, and I can see that the non-fixed > nature of the output columns makes this hard to achieve server-side. > > But on the other hand, this seems to be going way beyond the normal > level of result formatting that something like \x does, and I find the > syntax for sorting particularly ugly. I can understand the need to > sort the colH values, but it seems to me that the result rows should > just be returned in the order the server returns them -- i.e., I don't > think we should allow sorting colV values client-side, overriding a > server-side ORDER BY clause in the query. > This feature has zero relation with \x option, and any link to this option is confusing. This is important, elsewhere we are on start again, where I did long discuss with Daniel about the name, when I blocked the name "rotate". > Client-side sorting makes me uneasy in general, and I think it should > be restricted to just sorting the columns that appear in the output > (the colH values). This would also allow the syntax to be simplified: > > \crosstabview [colV] [colH] [colG1[,colG2...]] [sortCol [asc|desc]] > The sorting on client side is necessary - minimally in one direction, because you cannot to create perfect sorting for both dimensions. Possibility to order in second dimension is just pretty comfortable - because you don't need to think two steps forward - when you create SQL query. I have a basic use case that should be supported well, and it is supported well by last version of this patch. The evaluation of syntax is subjective. We can compare Daniel's syntax and your proposal. The use case: I have a table with the invoices with attributes (date, name and amount). I would to take a report of amounts across months and customers. Horizontal dimension is month (name), vertical dimension is name of customers. I need sorting of months in semantic order and customers in alphabet order. So my query is: SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3; and crosstabview command (per Daniel proposal) \crosstabview +name +month:month_order amount But if I don't need column header in human readable form, I can do \crosstabview +name +month_order amount What is solution of this use case with your proposal?? I agree so this syntax is pretty raw. But it is consistent with other psql statements and there are not possible conflicts. What I mean? Your syntax is not unambiguous: \crosstabview [colV] [colH] [colG1[,colG2...]] [sortCol [asc|desc]] - when I would to enter sort order column, I have to enter one or more colG1,... or I have to enter explicitly asc, desc keyword. Regards Pavel > > Overall, I like the feature, but I'm not convinced that it's ready in > its current form. > > For the future (not in this first version of the patch), since the > transformation is more than just a \x-type formatting of the query > results, a nice-to-have feature would be a way to save the results > somewhere -- say by making it play nicely with \g or \copy somehow, > but I admit that I don't know exactly how that would work. > > Regards, > Dean >
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 9 February 2016 at 05:24, Pavel Stehulewrote: > I have not a feeling so we did some with Daniel privately. All work was > public (I checked my mailbox) - but what is unhappy - in more mailing list > threads (not sure how it is possible, because subjects looks same). The > discus about the design was public, I am sure. It was relative longer > process, with good progress (from my perspective), because Daniel accepts > and fixed all my objection. The proposed syntax is fully consistent with > other psql commands - hard to create something new there, because psql > parser is pretty limited. Although I am thinking so syntax is good, clean > and useful I am open to discuss about it. Please, try the last design, last > patch - I spent lot of hours (and I am sure so Daniel much more) in thinking > how this can be designed better. > Looking at this patch, I have mixed feelings about it. On the one hand I really like the look of the output, and I can see that the non-fixed nature of the output columns makes this hard to achieve server-side. But on the other hand, this seems to be going way beyond the normal level of result formatting that something like \x does, and I find the syntax for sorting particularly ugly. I can understand the need to sort the colH values, but it seems to me that the result rows should just be returned in the order the server returns them -- i.e., I don't think we should allow sorting colV values client-side, overriding a server-side ORDER BY clause in the query. Client-side sorting makes me uneasy in general, and I think it should be restricted to just sorting the columns that appear in the output (the colH values). This would also allow the syntax to be simplified: \crosstabview [colV] [colH] [colG1[,colG2...]] [sortCol [asc|desc]] Overall, I like the feature, but I'm not convinced that it's ready in its current form. For the future (not in this first version of the patch), since the transformation is more than just a \x-type formatting of the query results, a nice-to-have feature would be a way to save the results somewhere -- say by making it play nicely with \g or \copy somehow, but I admit that I don't know exactly how that would work. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 9 February 2016 at 10:09, Pavel Stehulewrote: > The sorting on client side is necessary - minimally in one direction, > because you cannot to create perfect sorting for both dimensions. > Possibility to order in second dimension is just pretty comfortable - > because you don't need to think two steps forward - when you create SQL > query. > > I have a basic use case that should be supported well, and it is supported > well by last version of this patch. The evaluation of syntax is subjective. > We can compare Daniel's syntax and your proposal. > > The use case: I have a table with the invoices with attributes (date, name > and amount). I would to take a report of amounts across months and > customers. Horizontal dimension is month (name), vertical dimension is name > of customers. I need sorting of months in semantic order and customers in > alphabet order. > > So my query is: > > SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS > month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3; > > and crosstabview command (per Daniel proposal) > > \crosstabview +name +month:month_order amount > > But if I don't need column header in human readable form, I can do > > \crosstabview +name +month_order amount > > What is solution of this use case with your proposal?? > So it would just be SELECT name, to_char(date, 'mon') AS month, sum(amount) AS amount, extract(month from date) AS month_order FROM invoices GROUP BY 1,2,3 ORDER BY name \crosstabview name month amount month_order Note that I might also want to pass additional sort options, such as "ORDER BY name NULLS LAST", which the existing syntax doesn't allow. In the new syntax, such sort options could be trivially supported in both the server- and client-side sorts: SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3 ORDER BY name NULLS LAST \crosstabview name month amount month_order asc nulls last This is probably not an issue in this example, but it might well be in other cases. The +/-scol syntax is always going to be limited in what it can support. > I agree so this syntax is pretty raw. But it is consistent with other psql > statements and there are not possible conflicts. > > What I mean? Your syntax is not unambiguous: \crosstabview [colV] [colH] > [colG1[,colG2...]] [sortCol [asc|desc]] - when I would to enter sort order > column, I have to enter one or more colG1,... or I have to enter explicitly > asc, desc keyword. > That is resolved by the comma that precedes colG2, etc. isn't it? Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
> > > > SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS > > month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3; > > > > and crosstabview command (per Daniel proposal) > > > > \crosstabview +name +month:month_order amount > > > > But if I don't need column header in human readable form, I can do > > > > \crosstabview +name +month_order amount > > > > What is solution of this use case with your proposal?? > > > > So it would just be > > SELECT name, >to_char(date, 'mon') AS month, >sum(amount) AS amount, >extract(month from date) AS month_order > FROM invoices > GROUP BY 1,2,3 > ORDER BY name > \crosstabview name month amount month_order > Warning: :) Now I am subjective. The Daniel syntax "\crosstabview +name +month:month_order amount" looks more readable for me, because related things are near to self. > > Note that I might also want to pass additional sort options, such as > "ORDER BY name NULLS LAST", which the existing syntax doesn't allow. > In the new syntax, such sort options could be trivially supported in > both the server- and client-side sorts: > SELECT name, to_char(date, 'mon') AS month, >extract(month from date) AS month_order, sum(amount) AS amount > FROM invoices > GROUP BY 1,2,3 > ORDER BY name NULLS LAST > \crosstabview name month amount month_order asc nulls last > I understand - if I compare these two syntaxes I and I am trying be objective, then I see your: + respect SQL clauses ordering, allows pretty complex ORDER BY clause - possible to fail on unexpected syntax errors +/- more verbose - allow only one client side sort - less expressive Daniel: + cannot to fail on syntax error + more compacts (not necessary to specify ORDER BY clauses) + allow to specify sort in both dimensions + more expressive (+colH is more expressive than colV colH col colH - doesn't allow to complex order clauses in both dimensions > > This is probably not an issue in this example, but it might well be in > other cases. The +/-scol syntax is always going to be limited in what > it can support. > the +/- syntax can be enhanced by additional attributes - this is only syntax (but then there is a risk of possible syntax errors) > > > > I agree so this syntax is pretty raw. But it is consistent with other > psql > > statements and there are not possible conflicts. > > > > What I mean? Your syntax is not unambiguous: \crosstabview [colV] [colH] > > [colG1[,colG2...]] [sortCol [asc|desc]] - when I would to enter sort > order > > column, I have to enter one or more colG1,... or I have to enter > explicitly > > asc, desc keyword. > > > > That is resolved by the comma that precedes colG2, etc. isn't it? > but colG1 is optional. What if you miss any colGx ? Regards Pavel > > Regards, > Dean >
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Alvaro Herrera wrote: > So please can we have that wiki page so that the syntax can be hammered > out a bit more. Sure, I'm on it. > I'm closing this as returned-with-feedback for now. Well, the feedback it got during months was incorporated into the patch in the form of significant improvements, and at the end of this CF it was at the point that it has really been polished, and no other feedback was coming. I'll resubmit. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Alvaro Herrera wrote: > Also, what about the business of putting "x" if there's no third column? > Three months from now some Czech psql hacker will say "we should use > Unicode chars for this" and we will be forever stuck with \pset > unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH > CZECH. Maybe we should think that a bit harder -- for example, what > about just rejecting the case with no third column and forcing the user > to add a third column with the character they choose? That way you > avoid that mess. Yes, that implicit "X" with 2 column resultsets is not essential, it may be removed without real damage. About the possible suggestion to have a \pset unicode_crosstab_marker, my opinion would be that it's not important enough to justify a new \pset setting. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Alvaro Herrera wrote: > While I understand that you may think that "silence is consent", > what I am afraid of is that some committer will look at this two > months from now and say "I hate this Hcol+ stuff, -1 from me" and > send the patch back for syntax rework. IMO it's better to have more > people chime in here so that the patch that we discuss during the > next commitfest is really the best one we can think of. Yes, but on the other hand we can't force people to participate. If a patch is moving forward and being discussed here between one author and one reviewer, and nothing particularly wrong pops out in what is discussed, the reality if that other people will not intervene. Besides, as it being mentioned here frequently, all patches, even much more important ones, are short on reviews and reviewers and testing, still new stuff must keep getting in the source tree to progress. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 9 February 2016 at 11:06, Pavel Stehulewrote: > + respect SQL clauses ordering, allows pretty complex ORDER BY clause That, to me is the key point. SQL already allows very powerful sorting, so psql should not just throw away the query's sort order and replace it with something much more basic and limited. The exact syntax can be debated, but I don't think psql should be doing row sorting. I also don't believe that extending the +/- sort syntax to support more advanced options will be particularly easy, and the result is likely to be even less readable. It also requires the user to learn another syntax, when they will already be familiar with SQL's sort syntax. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Dean Rasheed wrote: > I don't think we should allow sorting colV values client-side, > overriding a server-side ORDER BY clause in the query. I shared that opinion until (IIRC) the v8 or v9 of the patch. Most of the evolution of this patch has been to go from no client-side sorting option at all, to the full range of possibilities, ascending or descending, and in both vertical and horizontal directions. I agree that colV sorting can be achieved through the query's ORDER BY, which additionally is more efficient so it should be the primary choice. The reason to allow [+/-]colV in \crosstabview is because I think the average user will expect it, by symmetry with colH. As the display is reorganized to be like a "grid" instead of a "list with several columns", we shift the focus to the symmetry between horizontal and vertical headers, rather than on the pre-crosstab form of the resultset, even if it's the same data. It's easier for the user to just stick a + in front of a column reference than to figure out that the same result could be achieved by editing the query and changing/adding an ORDER BY. Or said otherwise, having the [+/-] colV sorting is a way to avoid the question: "we can sort the horizontal header, so why can't we sort the vertical header just the same?" Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Dean Rasheed wrote: > Note that I might also want to pass additional sort options, such as > "ORDER BY name NULLS LAST", which the existing syntax doesn't allow. > In the new syntax, such sort options could be trivially supported in > both the server- and client-side sorts: Note that NULL values in the column that pivots are discarded by \crosstabview, because NULL as the name of a column does not make sense. The doc (in the patch) says: "The horizontal header, displayed as the first row, contains the set of all distinct non-null values found in column colH" Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
"Daniel Verite"writes: > Dean Rasheed wrote: >> I don't think we should allow sorting colV values client-side, >> overriding a server-side ORDER BY clause in the query. > I shared that opinion until (IIRC) the v8 or v9 of the patch. > Most of the evolution of this patch has been to go > from no client-side sorting option at all, to the full range > of possibilities, ascending or descending, and in both > vertical and horizontal directions. I haven't been paying attention to this thread ... but it is sure sounding like this feature has gotten totally out of hand. Suggest reconsidering your design goals. > Or said otherwise, having the [+/-] colV sorting is a way to > avoid the question: > "we can sort the horizontal header, so why can't we sort the > vertical header just the same?" I would turn that around, and ask why not remove *both* those things. I do not think we want any client-side sorting in this feature at all, because the minute you have any such thing, you are going to have an absolutely never-ending stream of demands for more sorting features: multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc. I'd rather reject the feature altogether than expect that psql is going to have to grow all of that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
I haven't been paying attention to this thread ... but it is sure > sounding like this feature has gotten totally out of hand. Suggest > reconsidering your design goals. > > > Or said otherwise, having the [+/-] colV sorting is a way to > > avoid the question: > > "we can sort the horizontal header, so why can't we sort the > > vertical header just the same?" > > I would turn that around, and ask why not remove *both* those things. > > I do not think we want any client-side sorting in this feature at all, > because the minute you have any such thing, you are going to have an > absolutely never-ending stream of demands for more sorting features: > multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc. > I'd rather reject the feature altogether than expect that psql is going > to have to grow all of that. > I am thinking so without possibility to sort data on client side, this feature will be significantly limited. You cannot do server side sort for both dimensions. Working with 2d report when one dimension is unsorted is not friendly. But the client side sorting can be limited to number's or C locale sorting. I don't think so full sort possibilities are necessary. Regards Pavel > regards, tom lane >
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 2/9/16 8:40 AM, Daniel Verite wrote: Alvaro Herrera wrote: While I understand that you may think that "silence is consent", what I am afraid of is that some committer will look at this two months from now and say "I hate this Hcol+ stuff, -1 from me" and send the patch back for syntax rework. IMO it's better to have more people chime in here so that the patch that we discuss during the next commitfest is really the best one we can think of. Yes, but on the other hand we can't force people to participate. If a patch is moving forward and being discussed here between one author and one reviewer, and nothing particularly wrong pops out in what is discussed, the reality if that other people will not intervene. The problem is that assumes people are still reading the thread. This is a feature I'm very interested, but at some point I just gave up on trying to follow it because of the volume of messages. I bet a lot of others did the same. I think in this case, what should have happened is that once an issue with the design of the feature itself was identified, a new thread should have been started to discuss that part in particular. That would have re-raised attention and made it easier for people to follow that specific part of the discussion, even if they don't care about some if the code intricacies. Besides, as it being mentioned here frequently, all patches, even much more important ones, are short on reviews and reviewers and testing, still new stuff must keep getting in the source tree to progress. Sure, and new stuff will be making it in. The question is: will *your* new stuff be making it in? Believe me, I know how burdensome getting new features pushed is. Frankly it shouldn't be this hard, and I certainly don't blame you for being frustrated. But none of that changes the fact that the bar for including code is very high and if you don't meet it then your stuff won't make it in. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Tom Lane wrote: > I do not think we want any client-side sorting in this feature at all, > because the minute you have any such thing, you are going to have an > absolutely never-ending stream of demands for more sorting features: > multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc. It doesn't really do any client-side sorting, the rest of the thread might refer to it like that by oversimplification, but if the command requests a header to be sorted, a "backdoor-style" query of this form is sent to the server, with PQexecParams(): SELECT n FROM (VALUES ($1,1),($2,2),($3,3)...) ) AS l(x,n) ORDER BY x [DESC] where the values to display in the header are bound to $1,$2,.. and the type associated with these parameters is the PQftype() of the field from which these values come. Then the values coming back ordered by tell us where to position the values corresponding to $1,$2... in the sorted header. There are some cases when this sort cannot work. For example if the field is an anonymous type or a ROW(). Or if the field is POINT(x,y), because our "point" type does not support order by. I believe these are corner cases for this feature. In these cases, psql just displays the error message that PQexecParams() emits. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Hi > I just rechecked the thread. In my reading, lots of people argued > whether it should be called \rotate or \pivot or \crosstab; it seems the > \crosstabview proposal was determined to be best. I can support that > decision. But once the details were discussed, it was only you and > Daniel left in the thread; nobody else participated. While I understand > that you may think that "silence is consent", what I am afraid of is > that some committer will look at this two months from now and say "I > hate this Hcol+ stuff, -1 from me" and send the patch back for syntax > rework. IMO it's better to have more people chime in here so that the > patch that we discuss during the next commitfest is really the best one > we can think of. > I have not a feeling so we did some with Daniel privately. All work was public (I checked my mailbox) - but what is unhappy - in more mailing list threads (not sure how it is possible, because subjects looks same). The discus about the design was public, I am sure. It was relative longer process, with good progress (from my perspective), because Daniel accepts and fixed all my objection. The proposed syntax is fully consistent with other psql commands - hard to create something new there, because psql parser is pretty limited. Although I am thinking so syntax is good, clean and useful I am open to discuss about it. Please, try the last design, last patch - I spent lot of hours (and I am sure so Daniel much more) in thinking how this can be designed better. > Also, what about the business of putting "x" if there's no third column? > Three months from now some Czech psql hacker will say "we should use > Unicode chars for this" and we will be forever stuck with \pset > unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH > CZECH. Maybe we should think that a bit harder -- for example, what > about just rejecting the case with no third column and forcing the user > to add a third column with the character they choose? That way you > avoid that mess. > These features are in category "nice to have". There are no problem to do in last commitfest or in next release cycle. It is not reason why to block commit of this feature, and I am sure so lot of users can be pretty happy with "basic" version of this patch. The all necessary functionality is there and working. We can continue on development in other cycles, but for this cycle, I am sure, so all necessary work is done. > > > This feature has only small relation to SQL PIVOTING feature - it is just > > form of view and I am agree with Daniel about sense of this feature. > > Yes, I don't disagree there. Robert Haas and David Fetter also > expressed their support for psql-side processing, so I think we're good > there. > > great. Personally, I have not any objection against current state. If anybody has, please do it early. We can move to forward. This is nice feature, good patch, and there is not reason why stop here. Regards Pavel > -- > Álvaro Herrerahttp://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Hi! Interesting feature, but it's not very obvious how to use it. I'd like to see some example(s) in documentation. And I see an implementation of AVL tree in psql source code (src/bin/psql/crosstabview.c). Postgres already has a Red-Black tree implementation in src/include/lib/rbtree.h and src/backend/lib/rbtree.c. Is any advantage of using AVL tree here? I have some doubt about that and this implementation, obviously, will not be well tested. But I see in comments that implementation is reduced to insert only and it doesn't use the fact of ordering tree, so, even hash could be used. Daniel Verite wrote: Pavel Stehule wrote: 1. maybe we can decrease name to shorter "crossview" ?? I am happy with crosstabview too, just crossview is correct too, and shorter I'm in favor of keeping crosstabview. It's more explicit, only 3 characters longer and we have tab completion anyway. 2. Columns used for ordering should not be displayed by default. I can live with current behave, but hiding ordering columns is much more practical for me I can see why, but I'm concerned about a consequence: say we have 4 columns A,B,C,D and user does \crosstabview +A:B +C:D If B and D are excluded by default, then there's nothing left to display inside the grid. It doesn't feel quite right. There's something counter-intuitive in the fact that values in the grid would disappear depending on whether and how headers are sorted. With the 3rd argument, we let the user decide what they want to see. 3. This code is longer, so some regress tests are recommended - attached simple test case I've added a few regression tests to the psql testsuite based on your sample data. New patch with these tests included is attached, make check passes. Best regards, -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Teodor Sigaev wrote: > Interesting feature, but it's not very obvious how to use it. I'd like to > see some example(s) in documentation. I'm thinking of making a wiki page, because examples pretty much require showing resultsets, and I'm not sure this would fly with our current psql documentation, which is quite compact. The current bit of doc I've produced is 53 lines long in manpage format already. The text has not been proofread by a native English speaker yet, so part of the problem might be that it's just me struggling with english :) > And I see an implementation of AVL tree in psql source code > (src/bin/psql/crosstabview.c). Postgres already has a Red-Black tree > implementation in src/include/lib/rbtree.h and > src/backend/lib/rbtree.c. Is any advantage of using AVL tree here? I > have some doubt about that and this implementation, obviously, will > not be well tested. But I see in comments that implementation is > reduced to insert only and it doesn't use the fact of ordering tree, > so, even hash could be used. Yes. I expect too that a RB tree or a hash-based algorithm would do the job and perform well. The AVL implementation in crosstabview is purposely simplified and specialized for this job, resulting in ~185 lines of code versus ~850 lines for rb-tree.c But I understand the argument that the existing rb-tree has been battle-tested, whereas this code hasn't. I'm looking at rb-tree.c and thinking what it would take to incorporate it: 1. duplicating or linking from backend/lib/rb-tree.c into psql/ 2. replacing the elog() calls with something else in the case of psql 3. updating the crosstabview data structures and call sites. While I'm OK with #3, #1 and #2 seem wrong. I could adapt rb-tree.c so that the same code can be used both client-side and server-side, but touching server-side code for this feature and creating links in the source tree feels invasive and overkill. Another approach is to replace AVL with an hash-based algorithm, but that raises the same sort of question. If crosstabview comes with its specific implementation, why use that rather than existing server-side code? But at a glance, utils/hash/dynahash.c seems quite hard to convert for client-side use. I'm open to ideas on this. In particular, if we have a hash table implementation that is already blessed by the project and small enough to make sense in psql, I'd be happy to consider it. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Hi > FWIW I think the general idea of this feature (client-side resultset > "pivoting") is a good one, but I don't really have an opinion regarding > your specific proposal. I think you should first seek some more > consensus about the proposed design; in your original thread [1] several > guys defended the idea of having this be a psql feature, and the idea of > this being a parallel to \x seems a very sensible one, but there's > really been no discussion on whether your proposed "+/-" syntax to > change sort order makes sense, for one thing. > I am sorry, but I disagree - the discussion about implementation was more than two months, and I believe so anybody who would to discuss had enough time to discuss. This feature and design was changed significantly and there was not anybody who sent feature design objection. This feature has only small relation to SQL PIVOTING feature - it is just form of view and I am agree with Daniel about sense of this feature. Regards Pavel > > So please can we have that wiki page so that the syntax can be hammered > out a bit more. > > I'm closing this as returned-with-feedback for now. > > [1] It's a good idea to add links to previous threads where things were > discussed. I had to search for > www.postgresql.org/message-id/flat/78543039-c708-4f5d-a66f-0c0fbcda1f76@mm > because you didn't provide a link to it when you started the second > thread. > > -- > Álvaro Herrerahttp://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Pavel Stehule wrote: > > FWIW I think the general idea of this feature (client-side resultset > > "pivoting") is a good one, but I don't really have an opinion regarding > > your specific proposal. I think you should first seek some more > > consensus about the proposed design; in your original thread [1] several > > guys defended the idea of having this be a psql feature, and the idea of > > this being a parallel to \x seems a very sensible one, Sorry, I meant \q here, not \x. > > but there's really been no discussion on whether your proposed "+/-" > > syntax to change sort order makes sense, for one thing. > > I am sorry, but I disagree - the discussion about implementation was more > than two months, and I believe so anybody who would to discuss had enough > time to discuss. This feature and design was changed significantly and > there was not anybody who sent feature design objection. I just rechecked the thread. In my reading, lots of people argued whether it should be called \rotate or \pivot or \crosstab; it seems the \crosstabview proposal was determined to be best. I can support that decision. But once the details were discussed, it was only you and Daniel left in the thread; nobody else participated. While I understand that you may think that "silence is consent", what I am afraid of is that some committer will look at this two months from now and say "I hate this Hcol+ stuff, -1 from me" and send the patch back for syntax rework. IMO it's better to have more people chime in here so that the patch that we discuss during the next commitfest is really the best one we can think of. Also, what about the business of putting "x" if there's no third column? Three months from now some Czech psql hacker will say "we should use Unicode chars for this" and we will be forever stuck with \pset unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH CZECH. Maybe we should think that a bit harder -- for example, what about just rejecting the case with no third column and forcing the user to add a third column with the character they choose? That way you avoid that mess. > This feature has only small relation to SQL PIVOTING feature - it is just > form of view and I am agree with Daniel about sense of this feature. Yes, I don't disagree there. Robert Haas and David Fetter also expressed their support for psql-side processing, so I think we're good there. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Daniel Verite wrote: > Teodor Sigaev wrote: > > > Interesting feature, but it's not very obvious how to use it. I'd like to > > see some example(s) in documentation. > > I'm thinking of making a wiki page, because examples pretty much > require showing resultsets, and I'm not sure this would fly > with our current psql documentation, which is quite compact. Yeah, we need to keep in mind that the psql doc is processed as a manpage also, so it may not be a great idea to add too many things there. But I also agree that some good examples would be useful. FWIW I think the general idea of this feature (client-side resultset "pivoting") is a good one, but I don't really have an opinion regarding your specific proposal. I think you should first seek some more consensus about the proposed design; in your original thread [1] several guys defended the idea of having this be a psql feature, and the idea of this being a parallel to \x seems a very sensible one, but there's really been no discussion on whether your proposed "+/-" syntax to change sort order makes sense, for one thing. So please can we have that wiki page so that the syntax can be hammered out a bit more. I'm closing this as returned-with-feedback for now. [1] It's a good idea to add links to previous threads where things were discussed. I had to search for www.postgresql.org/message-id/flat/78543039-c708-4f5d-a66f-0c0fbcda1f76@mm because you didn't provide a link to it when you started the second thread. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Hi I tested last version, v11 and I have not any objection It is working as expected all regress tests passed, there is related documentation and new test is attached. This patch is ready form commiter. Daniel, thank you very much, it is interesting feature. Regards Pavel
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Pavel Stehule wrote: > 1. maybe we can decrease name to shorter "crossview" ?? I am happy with > crosstabview too, just crossview is correct too, and shorter I'm in favor of keeping crosstabview. It's more explicit, only 3 characters longer and we have tab completion anyway. > 2. Columns used for ordering should not be displayed by default. I can live > with current behave, but hiding ordering columns is much more practical for > me I can see why, but I'm concerned about a consequence: say we have 4 columns A,B,C,D and user does \crosstabview +A:B +C:D If B and D are excluded by default, then there's nothing left to display inside the grid. It doesn't feel quite right. There's something counter-intuitive in the fact that values in the grid would disappear depending on whether and how headers are sorted. With the 3rd argument, we let the user decide what they want to see. > 3. This code is longer, so some regress tests are recommended - attached > simple test case I've added a few regression tests to the psql testsuite based on your sample data. New patch with these tests included is attached, make check passes. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 6d0cb3d..a242ec4 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -990,6 +990,123 @@ testdb= +\crosstabview [ +[-|+]colV +[:scolV] +[-|+]colH +[:scolH] +[colG1[,colG2...]] +] + + +Execute the current query buffer (like \g) and shows +the results inside a crosstab grid. +The output column colV +becomes a vertical header, optionally sorted by scolV, +and the output column colH +becomes a horizontal header, optionally sorted by +scolH. + +colG1[,colG2...] +is the list of output columns to project into the grid. +By default, all output columns of the query except +colV and +colH +are included in this list. + + + +All columns can be refered to by their position (starting at 1), or by +their name. Normal case folding and quoting rules apply on column +names. By default, +colV corresponds to column 1 +and colH to column 2. +A query having only one output column cannot be viewed in crosstab, and +colH must differ from +colV. + + + +The vertical header, displayed as the leftmost column, +contains the set of all distinct values found in +column colV, in the order +of their first appearance in the query results, +or in ascending order if a plus (+) sign precedes +colV, +or in descending order if it's a minus (-) sign. + + +The horizontal header, displayed as the first row, +contains the set of all distinct non-null values found in +column colH. They come +by default in their order of appearance in the query results, +or in ascending order if a plus (+) sign precedes +colH, +or in descending order if it's a minus (-) sign. + +Also, they can optionally be sorted by another column, if +colV +(respectively colH) is +immediately followed by a colon and a reference to another column +scolV +(respectively scolH). + + + +Inside the crosstab grid, +given a query output with N columns +(including colV and +colH), +for each distinct value x of +colH +and each distinct value y of +colV, +the contents of a cell located at the intersection +(x,y) is determined by these rules: + + + + if there is no corresponding row in the query results such that the + value for colH + is x and the value + for colV + is y, the cell is empty. + + + + + + if there is exactly one row such that the value + for colH + is x and the value + for colV + is y, then the N-2 other + columns or the columns listed in + colG1[,colG2...] + are displayed in the cell, separated between each other by + a space character if needed. + + If N=2, the letter X is displayed + in the cell as if a virtual third column contained that character. + + + + + + if there are several corresponding rows, the behavior is identical to + the case of one row except that the values coming from different rows + are stacked vertically, the different source rows being separated by + newline characters inside the cell. +
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Hi 2016-01-22 19:53 GMT+01:00 Daniel Verite: > Hi, > > Here's an updated patch improving on how the horizontal and vertical > headers can be sorted. > > The discussion upthread went into how it was desirable > to have independant sorts for these headers, possibly driven > by another column, in addition to the query's ORDER BY. > > Thus the options now accepted are: > > \crosstabview [ [-|+]colV[:scolV] [-|+]colH[:scolH] [colG1[,colG2...]] ] > > The optional scolV/scolH columns drive sorts for respectively > colV/colH (colV:scolV somehow means SELECT colV from... order by scolV) > > colG1,... in 3rd arg indicate the columns whose contents form the grid > cells, the typical use case being that there's only one such column. > By default it's all columns minus colV and colH. > > For example, > > SELECT > cust_id, > cust_name, > cust_date, > date_part('month, sales_date), > to_char(sales_date, 'Mon') as month, > amount > FROM sales_view > WHERE [predicates] > [ORDER BY ...] > > If we want to look at in a grid with months names across, sorted > by month number, and customer name in the vertical header, sorted by date > of > acquisition, we could do this: > > \crosstabview +cust_name:cust_date +5:4 amount > > or letting the vertical header being sorted by the query's ORDER BY, > and the horizontal header same as above: > > \crosstabview cust_name +5:4 amount > > or sorting vertically by name, if it happens that the ORDER BY is missing > or > is on something else: > > \crosstabview +cust_name +5:4 amount > I am playing with this patch, and I have following comments: 1. maybe we can decrease name to shorter "crossview" ?? I am happy with crosstabview too, just crossview is correct too, and shorter 2. Columns used for ordering should not be displayed by default. I can live with current behave, but hiding ordering columns is much more practical for me 3. This code is longer, so some regress tests are recommended - attached simple test case Regards Pavel > > Best regards, > -- > Daniel Vérité > PostgreSQL-powered mailer: http://www.manitou-mail.org > Twitter: @DanielVerite > regresstest.sql Description: application/sql -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Hi, Here's an updated patch improving on how the horizontal and vertical headers can be sorted. The discussion upthread went into how it was desirable to have independant sorts for these headers, possibly driven by another column, in addition to the query's ORDER BY. Thus the options now accepted are: \crosstabview [ [-|+]colV[:scolV] [-|+]colH[:scolH] [colG1[,colG2...]] ] The optional scolV/scolH columns drive sorts for respectively colV/colH (colV:scolV somehow means SELECT colV from... order by scolV) colG1,... in 3rd arg indicate the columns whose contents form the grid cells, the typical use case being that there's only one such column. By default it's all columns minus colV and colH. For example, SELECT cust_id, cust_name, cust_date, date_part('month, sales_date), to_char(sales_date, 'Mon') as month, amount FROM sales_view WHERE [predicates] [ORDER BY ...] If we want to look at in a grid with months names across, sorted by month number, and customer name in the vertical header, sorted by date of acquisition, we could do this: \crosstabview +cust_name:cust_date +5:4 amount or letting the vertical header being sorted by the query's ORDER BY, and the horizontal header same as above: \crosstabview cust_name +5:4 amount or sorting vertically by name, if it happens that the ORDER BY is missing or is on something else: \crosstabview +cust_name +5:4 amount Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 6d0cb3d..a242ec4 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -990,6 +990,123 @@ testdb= +\crosstabview [ +[-|+]colV +[:scolV] +[-|+]colH +[:scolH] +[colG1[,colG2...]] +] + + +Execute the current query buffer (like \g) and shows +the results inside a crosstab grid. +The output column colV +becomes a vertical header, optionally sorted by scolV, +and the output column colH +becomes a horizontal header, optionally sorted by +scolH. + +colG1[,colG2...] +is the list of output columns to project into the grid. +By default, all output columns of the query except +colV and +colH +are included in this list. + + + +All columns can be refered to by their position (starting at 1), or by +their name. Normal case folding and quoting rules apply on column +names. By default, +colV corresponds to column 1 +and colH to column 2. +A query having only one output column cannot be viewed in crosstab, and +colH must differ from +colV. + + + +The vertical header, displayed as the leftmost column, +contains the set of all distinct values found in +column colV, in the order +of their first appearance in the query results, +or in ascending order if a plus (+) sign precedes +colV, +or in descending order if it's a minus (-) sign. + + +The horizontal header, displayed as the first row, +contains the set of all distinct non-null values found in +column colH. They come +by default in their order of appearance in the query results, +or in ascending order if a plus (+) sign precedes +colH, +or in descending order if it's a minus (-) sign. + +Also, they can optionally be sorted by another column, if +colV +(respectively colH) is +immediately followed by a colon and a reference to another column +scolV +(respectively scolH). + + + +Inside the crosstab grid, +given a query output with N columns +(including colV and +colH), +for each distinct value x of +colH +and each distinct value y of +colV, +the contents of a cell located at the intersection +(x,y) is determined by these rules: + + + + if there is no corresponding row in the query results such that the + value for colH + is x and the value + for colV + is y, the cell is empty. + + + + + + if there is exactly one row such that the value + for colH + is x and the value + for colV + is y, then the N-2 other + columns or the columns listed in + colG1[,colG2...] + are displayed in the cell, separated between each other by + a space character if needed. + + If N=2, the letter X is displayed + in the cell as if a virtual third column contained that character. + + + + + + if there are
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Hi 2015-12-23 21:36 GMT+01:00 Daniel Verite: >Hi, > > Here's an updated patch that replaces sorted arrays by AVL binary trees > when gathering distinct values for the columns involved in the pivot. > The change is essential for large resultsets. For instance, > it allows to process a query like this (10 million rows x 10 columns): > > select x,(random()*10)::int, (random()*1000)::int from > generate_series(1,1000) as x > \crosstabview > > which takes about 30 seconds to run and display on my machine with the > attached patch. That puts it seemingly in the same ballpark than > the equivalent test with the server-side crosstab(). > > With the previous iterations of the patch, this test would never end, > even with much smaller sets, as the execution time of the 1st step > grew exponentially with the number of distinct keys. > The exponential effect starts to be felt at about 10k values on my low-end > CPU, > and from there quickly becomes problematic. > > As a client-side display feature, processing millions of rows like in > the query above does not necessarily make sense, it's pushing the > envelope, but stalling way below 100k rows felt lame, so I'm happy to get > rid of that limitation. > > However, there is another one. The above example does not need or request > an additional sort step, but if it did, sorting more than 65535 entries in > the vertical header would error out, because values are shipped as > parameters to PQexecParams(), which only accepts that much. > To avoid the problem, when the rows in the output "grid" exceed 2^16 and > they need to be sorted, the user must let the sort being driven by ORDER > BY > beforehand in the query, knowing that the pivot will keep the original > ordering intact in the vertical header. > > I'm still thinking about extending this based on Pavel's diff for the > "label" column, so that > \crosstabview [+|-]colV[:colSortH] [+|-]colH[:colSortH] > would mean to use colV/H as grid headers but sort them according > to colSortV/H. > I prefer that syntax over adding more parameters, and also I'd like > to have it work in both V and H directions. > > Aside from the AVL trees, there are a few other minor changes in that > patch: > - move non-exportable structs from the .h to the .c > - move code in common.c to respect alphabetical ordering > - if vertical sort is requested, add explicit check against more than 65535 > params instead of letting the sort query fail > - report all failure cases of the sort query > - rename sortColumns to serverSort and use less the term "columns" in > comments and variables. > > I checked this version and it is looking well. * all regress tests passed * patch is clean, well documented, well formatted * no objection related to code * current limits 65K * 1600 is good enough, I don't see it as limiting I am looking for next version Regards Pavel > > Best regards, > -- > Daniel Vérité > PostgreSQL-powered mailer: http://www.manitou-mail.org > Twitter: @DanielVerite > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > >
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
2015-12-23 21:36 GMT+01:00 Daniel Verite: >Hi, > > Here's an updated patch that replaces sorted arrays by AVL binary trees > when gathering distinct values for the columns involved in the pivot. > The change is essential for large resultsets. For instance, > it allows to process a query like this (10 million rows x 10 columns): > > select x,(random()*10)::int, (random()*1000)::int from > generate_series(1,1000) as x > \crosstabview > > which takes about 30 seconds to run and display on my machine with the > attached patch. That puts it seemingly in the same ballpark than > the equivalent test with the server-side crosstab(). > > With the previous iterations of the patch, this test would never end, > even with much smaller sets, as the execution time of the 1st step > grew exponentially with the number of distinct keys. > The exponential effect starts to be felt at about 10k values on my low-end > CPU, > and from there quickly becomes problematic. > > As a client-side display feature, processing millions of rows like in > the query above does not necessarily make sense, it's pushing the > envelope, but stalling way below 100k rows felt lame, so I'm happy to get > rid of that limitation. > > However, there is another one. The above example does not need or request > an additional sort step, but if it did, sorting more than 65535 entries in > the vertical header would error out, because values are shipped as > parameters to PQexecParams(), which only accepts that much. > To avoid the problem, when the rows in the output "grid" exceed 2^16 and > they need to be sorted, the user must let the sort being driven by ORDER > BY > beforehand in the query, knowing that the pivot will keep the original > ordering intact in the vertical header. > > I'm still thinking about extending this based on Pavel's diff for the > "label" column, so that > \crosstabview [+|-]colV[:colSortH] [+|-]colH[:colSortH] > would mean to use colV/H as grid headers but sort them according > to colSortV/H. > I prefer that syntax over adding more parameters, and also I'd like > to have it work in both V and H directions. > This syntax is good - simple, readable Pavel > > Aside from the AVL trees, there are a few other minor changes in that > patch: > - move non-exportable structs from the .h to the .c > - move code in common.c to respect alphabetical ordering > - if vertical sort is requested, add explicit check against more than 65535 > params instead of letting the sort query fail > - report all failure cases of the sort query > - rename sortColumns to serverSort and use less the term "columns" in > comments and variables. > > > Best regards, > -- > Daniel Vérité > PostgreSQL-powered mailer: http://www.manitou-mail.org > Twitter: @DanielVerite > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > >
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
Hi, Here's an updated patch that replaces sorted arrays by AVL binary trees when gathering distinct values for the columns involved in the pivot. The change is essential for large resultsets. For instance, it allows to process a query like this (10 million rows x 10 columns): select x,(random()*10)::int, (random()*1000)::int from generate_series(1,1000) as x \crosstabview which takes about 30 seconds to run and display on my machine with the attached patch. That puts it seemingly in the same ballpark than the equivalent test with the server-side crosstab(). With the previous iterations of the patch, this test would never end, even with much smaller sets, as the execution time of the 1st step grew exponentially with the number of distinct keys. The exponential effect starts to be felt at about 10k values on my low-end CPU, and from there quickly becomes problematic. As a client-side display feature, processing millions of rows like in the query above does not necessarily make sense, it's pushing the envelope, but stalling way below 100k rows felt lame, so I'm happy to get rid of that limitation. However, there is another one. The above example does not need or request an additional sort step, but if it did, sorting more than 65535 entries in the vertical header would error out, because values are shipped as parameters to PQexecParams(), which only accepts that much. To avoid the problem, when the rows in the output "grid" exceed 2^16 and they need to be sorted, the user must let the sort being driven by ORDER BY beforehand in the query, knowing that the pivot will keep the original ordering intact in the vertical header. I'm still thinking about extending this based on Pavel's diff for the "label" column, so that \crosstabview [+|-]colV[:colSortH] [+|-]colH[:colSortH] would mean to use colV/H as grid headers but sort them according to colSortV/H. I prefer that syntax over adding more parameters, and also I'd like to have it work in both V and H directions. Aside from the AVL trees, there are a few other minor changes in that patch: - move non-exportable structs from the .h to the .c - move code in common.c to respect alphabetical ordering - if vertical sort is requested, add explicit check against more than 65535 params instead of letting the sort query fail - report all failure cases of the sort query - rename sortColumns to serverSort and use less the term "columns" in comments and variables. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 6d0cb3d..563324d 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -2485,6 +2485,95 @@ lo_import 152801 + +\crosstabview [ [-|+]colV [-|+]colH ] + + +Execute the current query buffer (like \g) and shows the results +inside a crosstab grid. The output column colV +becomes a vertical header and the output column +colH becomes a horizontal header. +The results for the other output columns are projected inside the grid. + + + +colV +and colH can indicate a +column position (starting at 1), or a column name. Normal case folding +and quoting rules apply on column names. By default, +colV is column 1 +and colH is column 2. +A query having only one output column cannot be viewed in crosstab, and +colH must differ from +colV. + + + +The vertical header, displayed as the leftmost column, +contains the set of all distinct values found in +column colV, in the order +of their first appearance in the query results. + + +The horizontal header, displayed as the first row, +contains the set of all distinct non-null values found in +column colH. They come +by default in their order of appearance in the query results, or in ascending +order if a plus (+) sign precedes colH, +or in descending order if it's a minus (-) sign. + + + +The query results being tuples of N columns +(including colV and +colH), +for each distinct value x of +colH +and each distinct value y of +colV, +a cell located at the intersection (x,y) in the grid +has contents determined by these rules: + + + + if there is no corresponding row in the results such that the value + for colH + is x and the value + for colV + is y, the cell is empty. + + + + + + if there is exactly one row such that the value + for colH + is x and the value + for colV + is y, then the N-2 other + columns are displayed in the