Re: [HACKERS] Modifying update_attstats of analyze.c for C Strings
As a follow-up question, I found some varchar columns whose histogram_bounds are not surrounded by double quotes (") even in the default implementation. Ex: the *c_name* column of the *Customer* table.

I also found histogram_bounds in which only some strings are surrounded by double quotes and others are not. Ex: the *c_address* column of the *Customer* table.

Why are there such inconsistencies? How is this determined?

Thank you.

On Tue, Jul 8, 2014 at 10:52 AM, Ashoke s.ash...@gmail.com wrote:
> My issue is: when I use my way for strings, the MCV/histogram_bounds in
> pg_stats don't have double quotes (") surrounding the strings.

-- 
Regards,
Ashoke
Re: [HACKERS] Modifying update_attstats of analyze.c for C Strings
OK, I was able to figure out that when a string contains spaces, PostgreSQL surrounds it with double quotes.

On Tue, Jul 8, 2014 at 12:04 PM, Ashoke s.ash...@gmail.com wrote:
> Why are there such inconsistencies? How is this determined?

-- 
Regards,
Ashoke
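The observation about spaces can be sketched as a small standalone C program fragment. This is only an illustration, assuming the rule given in the PostgreSQL documentation for array output: an element is double-quoted if it is an empty string, matches the word NULL, or contains curly braces, the delimiter, double quotes, backslashes, or white space. The helper names needs_quotes and format_element are hypothetical and are not PostgreSQL functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does this element need double quotes in array text output? */
bool
needs_quotes(const char *s, char delim)
{
    assert(s != NULL);

    /* An empty string is always quoted. */
    if (s[0] == '\0')
        return true;

    /* The bare word NULL (any case) is quoted so it is not read back as a null. */
    if (tolower((unsigned char) s[0]) == 'n' &&
        tolower((unsigned char) s[1]) == 'u' &&
        tolower((unsigned char) s[2]) == 'l' &&
        tolower((unsigned char) s[3]) == 'l' &&
        s[4] == '\0')
        return true;

    /* Quotes, backslashes, braces, the delimiter, or white space force quoting. */
    for (const char *p = s; *p != '\0'; p++)
    {
        if (*p == '"' || *p == '\\' || *p == '{' || *p == '}' ||
            *p == delim || isspace((unsigned char) *p))
            return true;
    }
    return false;
}

/*
 * Hypothetical helper: write one element into out, quoting it if needed and
 * backslash-escaping embedded double quotes and backslashes.
 * Returns the number of bytes written, or 0 if out is too small.
 */
size_t
format_element(const char *s, char delim, char *out, size_t outsz)
{
    size_t      n = 0;
    bool        quote = needs_quotes(s, delim);

    /* Worst case: every byte escaped, plus two quotes and the terminator. */
    if (outsz < 2 * strlen(s) + 3)
        return 0;

    if (quote)
        out[n++] = '"';
    for (const char *p = s; *p != '\0'; p++)
    {
        if (*p == '"' || *p == '\\')
            out[n++] = '\\';
        out[n++] = *p;
    }
    if (quote)
        out[n++] = '"';
    out[n] = '\0';
    return n;
}
```

Under this assumed rule, ALGERIA is written bare while a value with a trailing space or an embedded comma is written inside double quotes, which would match the mix of quoted and unquoted entries seen in *c_address*.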
[HACKERS] Modifying update_attstats of analyze.c for C Strings
Hi,

I am trying to implement functionality similar to ANALYZE, but which needs different values for the MCV/histogram bounds when the column under consideration is varchar (C strings). The values will be valid and are stored in inp->str[][].

I have written a function *dummy_update_attstats* with the following changes. Everything else remains the same as in *update_attstats* of *~/src/backend/commands/analyze.c*.

---
{
    ArrayType  *arry;

    if (strcmp(col_type, "varchar") == 0)
        arry = construct_array(stats->stavalues[k],
                               stats->numvalues[k],
                               CSTRINGOID,
                               -2,
                               false,
                               'c');
    else
        arry = construct_array(stats->stavalues[k],
                               stats->numvalues[k],
                               stats->statypid[k],
                               stats->statyplen[k],
                               stats->statypbyval[k],
                               stats->statypalign[k]);
    values[i++] = PointerGetDatum(arry);    /* stavaluesN */
}
---

and I update the hist_values in the appropriate function as:

---
if (strcmp(col_type, "varchar") == 0)
    hist_values[i] = datumCopy(CStringGetDatum(inp->str[i][j]),
                               false,
                               -2);
---

I tried this based on the following reference:
http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff

My issue is: when I use my way for strings, the MCV/histogram_bounds in pg_stats don't have double quotes (") surrounding the strings. That is:

If the normal *update_attstats* is used, histogram_bounds for *TPCH nation(n_name)* are: "ALGERIA ","ARGENTINA",...

If I use *dummy_update_attstats* as above, histogram_bounds for *TPCH nation(n_name)* are: ALGERIA,ARGENTINA,...

This becomes an issue if a string contains commas (','), as for example in the *n_comment* column of the *nation* table.

Could someone point out the problem and suggest a solution?

Thank you.

-- 
Regards,
Ashoke