Re: [SQL] the use of $$string$$
On 05/11/11 00:12, John Fabiani wrote: I'm using psycopg2. OK - bear in mind, I'm not a Python user. This is what I'm doing from python myvarString = "long string that contains single quotes" cusor.execute("insert into table (pkid, myfield) values (%s, $$%s$$)",(123, myvarString)) When I execute the above I'm seeing: E'long string that contains single quotes' in the field. When I do a "select * from table" I get E'long string that contains single quotes'. OK, so it seems psycopg is quoting your strings for you (as you'd expect). It's presumably turning your query into: ... values (E'123', $$E''$$) So - the $$ quoting is unnecessary here - just use the % placeholders. Incidentally, should it be %s for the numeric argument? myvarString = "long string that without single quotes" cusor.execute("insert into table (pkid, myfield) values (%s, %s)",(123, myvarString)) I get the following: "long string that without single quotes" That seems sensible to me (unless there's a typo in your example). You shouldn't need to quote any of your values in your Python code - it's doing it for you. I'm guessing there are other options beside %s for other data-types (integers,floats,boolean etc). -- Richard Huxton Archonet Ltd -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] How to implement Aggregate Awareness?
> Simply put, when Transparent (to the reporting tool) Aggregate Navigator > recognizes that a query would execute faster using aggregates, it > automatically rewrites the query so that the database hits the smaller > aggregates rather than larger detail tables upon which the small > aggregates are pre-computed. Assuming I understand what you want correctly: Before you could usefully do anything like that, I think you'd need to implement full support for materialized views. Once that's in place, the optimizer could potentially be enhanced to recognise queries against a base table and rewrite it to use a materialized view of the table when it sees that the query only touches data collected by the materialized view. Right now, there isn't really anything for query rewriting like this to /target/ . -- Craig Ringer -- For Aggregate Navigation, materialized views are not required. But, let me give you a short example of how it could work. Let us assume we have a fact table SALES with a single measure SALES.AMOUNT, and one dimension table DATE with attributes: DATE.YEAR, DATE.MONTH, DATE.DAY. Assume we also have an aggregate table AGG_SALES that stores aggregated sales amounts at the month level in the field AGG_SALES.AMOUNT. The corresponding shrunken dimension DATE2MM has a subset of DATE's attributes, that is: DATE2MM.YEAR and DATE2MM.MONTH. Now, assume a reporting tool (oblivious of the existence of both AGG_SALES and DATE2MM) sends this query to PostgreSQL: Query 1. select d.month, sum(s.amount) from sales s, date d where s.date_key = d.date_key group by d.month The Aggregate Navigator recognizes that such query will execute faster if aggregate table is used. So it substitutes the aggregate tables into the original query, so in effect the query is re-written as: Query 2. select d.month, sum(s.amount) from agg_sales s, date2mm d where s.date_key = d.date_key group by d.month On the other hand, if the following query arrives: Query 3. select d.day, sum(s.amount) from sales s, date d where s.date_key = d.date_key group by d.day The Aggregate Navigator will not be able to find DATE2MM.DAY, so no substitution will take place. Of course, there needs to be some kind of meta-data defining the relationship between AGG_SALES and SALES that should help the Aggregate Navigator quickly choose the fastest query. In this case, I imagine, the pseudo code could look like this. CREATE RULE "_RETURN" AS ON SELECT TO SALES DO INSTEAD { LANGUAGE PSEUDOCODE $$ IF all the query fields are found in aggregate tables THEN Execute the query substituting the aggregate table names into the query, i.e.: AGG_SALES for SALES ELSE Execute the query unchanged, i.e.: fall back on the detail tables where all the fields are available. $$ } Transparent Aggregate Navigation is simple in theory, but can it be implemented in PostgreSQL? Thank you all for all your comments and suggestions, --- Olgierd Michalak Soft Computer Consultants, Inc. -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] How to implement Aggregate Awareness?
Olgierd, This can be done, but it relies on a very strong metadata component that is not available in PG. In db2eee it's completed by having a construct known as automatic summary tables - similar to materialized views Where in ddl you tell the system how the aggregate and base table are related - I believe by specifying the group by clause that in effect defines the grain of the summary table. Given the metadata that these are the summary tables that exist for a base table and the grain of each, the optimizer rewrite rules can rewrite the query to the highest level aggregate that can satisfy the query. The details of how it does this are proprietary although I expect if you dug enough you might find white paper that details a bit. The rewrite rules of PG gives you a great start at building something, but you'll need more than what's available. Microstrategy, cognos, and Business objects have implemented aggregate awareness on generic rdbms but you'll need to create and manage the metadata thru the tool. Best wishes on solving this. Would be great if Ansi sql came up with a standard approach for doing this. Doug -Original Message- From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org] On Behalf Of Olgierd Michalak Sent: Monday, November 07, 2011 1:25 PM To: Craig Ringer Cc: pgsql-sql@postgresql.org Subject: Re: [SQL] How to implement Aggregate Awareness? > Simply put, when Transparent (to the reporting tool) Aggregate Navigator > recognizes that a query would execute faster using aggregates, it > automatically rewrites the query so that the database hits the smaller > aggregates rather than larger detail tables upon which the small > aggregates are pre-computed. Assuming I understand what you want correctly: Before you could usefully do anything like that, I think you'd need to implement full support for materialized views. Once that's in place, the optimizer could potentially be enhanced to recognise queries against a base table and rewrite it to use a materialized view of the table when it sees that the query only touches data collected by the materialized view. Right now, there isn't really anything for query rewriting like this to /target/ . -- Craig Ringer -- For Aggregate Navigation, materialized views are not required. But, let me give you a short example of how it could work. Let us assume we have a fact table SALES with a single measure SALES.AMOUNT, and one dimension table DATE with attributes: DATE.YEAR, DATE.MONTH, DATE.DAY. Assume we also have an aggregate table AGG_SALES that stores aggregated sales amounts at the month level in the field AGG_SALES.AMOUNT. The corresponding shrunken dimension DATE2MM has a subset of DATE's attributes, that is: DATE2MM.YEAR and DATE2MM.MONTH. Now, assume a reporting tool (oblivious of the existence of both AGG_SALES and DATE2MM) sends this query to PostgreSQL: Query 1. select d.month, sum(s.amount) from sales s, date d where s.date_key = d.date_key group by d.month The Aggregate Navigator recognizes that such query will execute faster if aggregate table is used. So it substitutes the aggregate tables into the original query, so in effect the query is re-written as: Query 2. select d.month, sum(s.amount) from agg_sales s, date2mm d where s.date_key = d.date_key group by d.month On the other hand, if the following query arrives: Query 3. select d.day, sum(s.amount) from sales s, date d where s.date_key = d.date_key group by d.day The Aggregate Navigator will not be able to find DATE2MM.DAY, so no substitution will take place. Of course, there needs to be some kind of meta-data defining the relationship between AGG_SALES and SALES that should help the Aggregate Navigator quickly choose the fastest query. In this case, I imagine, the pseudo code could look like this. CREATE RULE "_RETURN" AS ON SELECT TO SALES DO INSTEAD { LANGUAGE PSEUDOCODE $$ IF all the query fields are found in aggregate tables THEN Execute the query substituting the aggregate table names into the query, i.e.: AGG_SALES for SALES ELSE Execute the query unchanged, i.e.: fall back on the detail tables where all the fields are available. $$ } Transparent Aggregate Navigation is simple in theory, but can it be implemented in PostgreSQL? Thank you all for all your comments and suggestions, --- Olgierd Michalak Soft Computer Consultants, Inc. -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
[SQL] GROUP and ORDER BY
Hello, I would like to GROUP the result by one column and ORDER it by another: SELECT no, name, similarity(name, 'Tooneyvara') AS s FROM vtown WHEREsimilarity(name, 'Tooneyvara') > 0.4 ORDER BY s DESC Result: 1787"Toomyvara" 0.5 1787"Toomevara" 0.4 1188"Toonybara" 0.4 Desired result: 1787"Toomyvara" 0.5 1188"Toonybara" 0.4 Gets rid of the duplicate "no" keeping the spelling with the greater similarity and presents the remaining result ordered by similarity. My solution: SELECT * FROM ( SELECT DISTINCT ON (no) no, name, similarity(name, 'Tooneyvara') AS sim FROM vtown WHERE similarity(name, 'Tooneyvara') > 0.4 ORDER BY no, sim DESC ) AS x ORDER BY sim Is that the best way to achieve this result? -- Best Regards, Tarlika Elisabeth Schmitz -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] GROUP and ORDER BY
Unless I overlooked something here, does this work ? SELECT no, name, MAX(similarity(name, 'Tooneyvara')) AS sim FROM vtown WHERE similarity(name, 'Tooneyvara') > 0.4 GROUP BY no, name ORDER BY sim DESC -- Robins Tharakan On 11/08/2011 02:50 AM, Tarlika Elisabeth Schmitz wrote: Hello, I would like to GROUP the result by one column and ORDER it by another: SELECT no, name, similarity(name, 'Tooneyvara') AS s FROM vtown WHEREsimilarity(name, 'Tooneyvara')> 0.4 ORDER BY s DESC Result: 1787"Toomyvara" 0.5 1787"Toomevara" 0.4 1188"Toonybara" 0.4 Desired result: 1787"Toomyvara" 0.5 1188"Toonybara" 0.4 Gets rid of the duplicate "no" keeping the spelling with the greater similarity and presents the remaining result ordered by similarity. My solution: SELECT * FROM ( SELECT DISTINCT ON (no) no, name, similarity(name, 'Tooneyvara') AS sim FROM vtown WHERE similarity(name, 'Tooneyvara')> 0.4 ORDER BY no, sim DESC ) AS x ORDER BY sim Is that the best way to achieve this result? -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] the use of $$string$$
On mån, 2011-11-07 at 08:44 +, Richard Huxton wrote: > > myvarString = "long string that contains single quotes" > > cusor.execute("insert into table (pkid, myfield) values (%s, $$%s > $$)",(123, > > myvarString)) > > > > When I execute the above I'm seeing: > > E'long string that contains single quotes' in the field. When I do > a "select > > * from table" I get E'long string that contains single quotes'. > > OK, so it seems psycopg is quoting your strings for you (as you'd > expect). It's presumably turning your query into: > ... values (E'123', $$E''$$) > So - the $$ quoting is unnecessary here - just use the % placeholders. > > Incidentally, should it be %s for the numeric argument? Yes. -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql