Re: [GENERAL] strange sql behavior

2016-02-01 Thread Yu Nie
Hi Melvin,

Many thanks for your help.  I just deleted the small table, along with
another large unused table (with about 1.2 billion rows) in the same
database.  That freed up about 100 GB of space on the disk.  I am reloading
the small table again and it will take a few more hours to complete. Once it
is done, I will let you know if the problem persists, and I will then report
the info you suggested.   By the way, I also did not do a disk
defragmentation after deleting the tables; not sure if that would help...

Best, Marco

On Mon, Feb 1, 2016 at 8:50 PM, Melvin Davidson 
wrote:

> OK, one more thing to try.
>
> Please send the output of the following query to a file and attach so I
> can review.
>
> SELECT c.relname AS idxname,
>        i.indisclustered,
>        i.indisvalid,
>        i.indcollation
>   FROM pg_index i
>   JOIN pg_class c ON i.indexrelid = c.oid
>  WHERE i.indrelid IN (SELECT oid
>                         FROM pg_class
>                        WHERE relkind = 'r'
>                          AND (relname = 'data2011_01'
>                           OR relname = 'data2013_01w'))
>  ORDER BY 1;
>
>
> On Mon, Feb 1, 2016 at 5:51 PM, Yu Nie  wrote:
>
>> Yeah, it is really puzzling. :(
>>
>> No, I have never attempted to cluster the large table - as I mentioned
>> before, the two tables were created almost exactly the same way (through a
>> C++ API), just with different data files. I only use them for queries, and
>> have never done any update/insert/delete since creation.
>>
>>
>>
>> On Mon, Feb 1, 2016 at 4:41 PM, Melvin Davidson 
>> wrote:
>>
>>> At this point I can see no possible explanation why this would occur.
>>>
>>> Just a thought though, is it possible that data2011_01 was clustered on
>>> the index at some point but data2013_01w has not been clustered?
>>> If you cluster data2013_01w on the index, does the performance change?
>>>
>>> On Mon, Feb 1, 2016 at 4:03 PM, Yu Nie  wrote:
>>>
 Melvin,

 Please see attached for the requested results. I ran two queries for each
 table (each with a different taxiid; the two ids are adjacent).  Note
 that for the large table one query is much faster than the other because
 the shared cache was used for the second query.  This does not work,
 however, for the small table.

 Many thanks for your willingness to help!

 Best, Marco

 On Mon, Feb 1, 2016 at 2:25 PM, melvin6925 
 wrote:

> Fine. Please rerun both explains and my queries and send ALL output to
> a file (F8 in PGADMIN Sql). Then attach the file.
>
>
>
> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>  Original message 
> From: Yu Nie 
> Date: 2/1/2016 15:17 (GMT-05:00)
> To: melvin6925 
> Subject: Re: [GENERAL] strange sql behavior
>
> Yes, absolutely.
>
> On Mon, Feb 1, 2016 at 2:12 PM, melvin6925 
> wrote:
>
>> And were _your_ queries run on the same day at the same time within a
>> few seconds of each other?
>>
>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>  Original message 
>> From: Yu Nie 
>> Date: 2/1/2016 15:01 (GMT-05:00)
>> To: melvin6925 
>> Cc: Bill Moran ,
>> pgsql-general@postgresql.org
>> Subject: Re: [GENERAL] strange sql behavior
>>
>> Thanks, Bill and Melvin!
>>
>> Just some quick note/answers before I absorb all the information
>> provided by Bill.
>>
>> 1. I don't expect many users running queries against the tables,
>> especially the small table - since I just created it this morning,
>> and nobody knows about it except myself.
>>
>> 2. The setting in the config:
>>
>>
>> shared_buffers = 512MB    # min 128kB
>> work_mem = 128MB    # min 64kB
>>
>> 3. I am running Windows 7 with 24 GB RAM, and my PostgreSQL is 9.4.
>>
>> 4.  here is the query I ran:
>>
>> SELECT n.nspname,
>>s.relname,
>>c.reltuples::bigint,
>>n_tup_ins,
>>n_tup_upd,
>>n_tup_del,
>>date_trunc('second', last_vacuum) as last_vacuum,
>>date_trunc('second', last_autovacuum) as last_autovacuum,
>>date_trunc('second', last_analyze) as last_analyze,
>>date_trunc('second', last_autoanalyze) as last_autoanalyze
>>,
>>round( current_setting('autovacuum_vacuum_threshold')::integer
>> + current_setting('autovacuum_vacuum_scale_factor')::numeric * 
>> C.reltuples)
>> AS av_threshold
>>   FROM pg_stat_all_tables s
>>   JOIN pg_class c ON c.oid = s.relid
>>   JOIN pg_namespace n ON (n.oid = c.relnamespace)
>>  WHERE s.relname NOT LIKE 'pg_%'
>>AND s.relname NOT LIKE 'sql_%'
>>AND s.rel

Re: [GENERAL] pg_dump - ERROR - PostgreSQL 9.2

2016-02-01 Thread drum.lu...@gmail.com
[SOLVED]

Just to let you guys know.. I've solved this issue by changing
max_standby_streaming_delay on my slave to "-1".
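For reference, that change is a one-liner in the standby's postgresql.conf
(a sketch; -1 disables the recovery-conflict cancel timeout entirely, so
long-running queries like this pg_dump are never cancelled, at the cost of
letting replay lag on that standby while they run):

    # postgresql.conf on the standby; picked up on reload
    max_standby_streaming_delay = -1

followed by a reload, e.g. SELECT pg_reload_conf();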

Cheers



Lucas Possamai

- kinghost.co.nz

- DigitalOcean 

On 2 February 2016 at 13:21, drum.lu...@gmail.com 
wrote:

> Hi all,
>
> *masterdb01 -> slave01 -> slave02 -> slave03*
>
> *testmaster01 (Full copy from masterdb01)*
>
> I'm trying to refresh my main DB, by running this command on my test
> server:
>
> ssh postgres@slave02 "/usr/pgsql-9.2/bin/pg_dump
>> --exclude-table-data='junk.*' --format=custom master_db_name" |
>> /usr/pgsql-9.2/bin/pg_restore --dbname=master_db_name_temp --exit-on-error
>
>
> But I got the following error after a few seconds:
>
> pg_dump: Dumping the contents of table "invoices" failed: PQgetResult()
>> failed.
>> pg_dump: Error message from server: ERROR:  canceling statement due to
>> conflict with recovery
>> DETAIL:  User was holding a relation lock for too long.
>> pg_dump: The command was: COPY dm.invoices (invoice_id, format_version,
>> ts_issue, ts_ack, customer_id, code, tag, account_data, customer_data,
>> invoice_data, invoice_items_data) TO stdout;
>
>
> If I run the pg_dump from the MASTER DB, it runs OK. But as the DB is 2
> TB, I just can't let it copy from production. It has to be from a slave
> server (I've also tried to copy from another slave; I get the same error).
>
> What am I doing wrong? How can I solve the problem?
>
>
> * I've sent the same e-mail to pgsql-admin
>
> Thanks;
> Lucas
>


[GENERAL] pg_dump - ERROR - PostgreSQL 9.2

2016-02-01 Thread drum.lu...@gmail.com
Hi all,

*masterdb01 -> slave01 -> slave02 -> slave03*

*testmaster01 (Full copy from masterdb01)*

I'm trying to refresh my main DB, by running this command on my test server:

ssh postgres@slave02 "/usr/pgsql-9.2/bin/pg_dump
> --exclude-table-data='junk.*' --format=custom master_db_name" |
> /usr/pgsql-9.2/bin/pg_restore --dbname=master_db_name_temp --exit-on-error


But I got the following error after a few seconds:

pg_dump: Dumping the contents of table "invoices" failed: PQgetResult()
> failed.
> pg_dump: Error message from server: ERROR:  canceling statement due to
> conflict with recovery
> DETAIL:  User was holding a relation lock for too long.
> pg_dump: The command was: COPY dm.invoices (invoice_id, format_version,
> ts_issue, ts_ack, customer_id, code, tag, account_data, customer_data,
> invoice_data, invoice_items_data) TO stdout;


If I run the pg_dump from the MASTER DB, it runs OK. But as the DB is 2 TB,
I just can't let it copy from production. It has to be from a slave
server (I've also tried to copy from another slave; I get the same error).

What am I doing wrong? How can I solve the problem?


* I've sent the same e-mail to pgsql-admin

Thanks;
Lucas


Re: [GENERAL] Is it possible to select index values ?

2016-02-01 Thread David G. Johnston
On Mon, Feb 1, 2016 at 4:29 PM, Jonathan Vanasco  wrote:

> Is it possible to select index values ?
>
> I haven't found any documentation that says "No", but I haven't found
> anything that says "Yes" either.
>
> The reason - I have a few function indexes that are working as partial
> indexes.  I'd like to run some analytics on them (to determine uniqueness
> of values, decide if I should change the function, etc.).  It would be
> easier to somehow access the index contents than to re-create the
> index data in a temporary table.
>

I'm not optimistic that such a capability exists, though...

If on 9.2 or more recent you might have a chance... I still haven't gotten
the hang of the index-only-scan planner optimization, but in theory when one
can be used it can also supply the value for the function's column.  The
problem is that it is an optimization and thus not guaranteed, and you also
do not get, as far as I am aware, a mix - either all column data comes from
the index or all column data comes from the heap.  In the latter case the
function's result is not stored in the heap, so a new value must be derived.

You can query the statistics portion of the database to get some basic
statistics of the form mentioned.
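A sketch of that (the index name is hypothetical): ANALYZE collects
statistics for the expression columns of a functional index, and they show
up in pg_stats under the index's own name:

    SELECT attname, n_distinct, null_frac, most_common_vals
      FROM pg_stats
     WHERE tablename = 'my_func_idx';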

David J.


[GENERAL] Is it possible to select index values ?

2016-02-01 Thread Jonathan Vanasco
Is it possible to select index values ?

I haven't found any documentation that says "No", but I haven't found anything 
that says "Yes" either.

The reason - I have a few function indexes that are working as partial
indexes.  I'd like to run some analytics on them (to determine uniqueness of
values, decide if I should change the function, etc.).  It would be easier to
somehow access the index contents than to re-create the index data in a
temporary table.
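(For what it's worth, the pageinspect contrib extension can dump raw index
tuples - a sketch with a hypothetical index name; note the data comes back
as hex rather than decoded values:

    CREATE EXTENSION pageinspect;
    SELECT itemoffset, data FROM bt_page_items('my_func_idx', 1);
)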



Re: [GENERAL] strange sql behavior

2016-02-01 Thread Melvin Davidson
At this point I can see no possible explanation why this would occur.

Just a thought though, is it possible that data2011_01 was clustered on the
index at some point but data2013_01w has not been clustered?
If you cluster data2013_01w on the index, does the performance change?
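(For reference, that would look roughly like the following - a sketch using
the index name that appears in the EXPLAIN output elsewhere in the thread;
note that CLUSTER takes an exclusive lock and rewrites the table:

    CLUSTER data2013_01w USING data2013_01w_ixtaxiid;
    ANALYZE data2013_01w;
)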

On Mon, Feb 1, 2016 at 4:03 PM, Yu Nie  wrote:

> Melvin,
>
> Please see attached for the requested results. I ran two queries for each
> table (each with a different taxiid; the two ids are adjacent).  Note that
> for the large table one query is much faster than the other because the
> shared cache was used for the second query.  This does not work, however,
> for the small table.
>
> Many thanks for your willingness to help!
>
> Best, Marco
>
> On Mon, Feb 1, 2016 at 2:25 PM, melvin6925  wrote:
>
>> Fine. Please rerun both explains and my queries and send ALL output to a
>> file (F8 in PGADMIN Sql). Then attach the file.
>>
>>
>>
>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>  Original message 
>> From: Yu Nie 
>> Date: 2/1/2016 15:17 (GMT-05:00)
>> To: melvin6925 
>> Subject: Re: [GENERAL] strange sql behavior
>>
>> Yes, absolutely.
>>
>> On Mon, Feb 1, 2016 at 2:12 PM, melvin6925  wrote:
>>
>>> And were _your_ queries run on the same day at the same time within a
>>> few seconds of each other?
>>>
>>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>>  Original message 
>>> From: Yu Nie 
>>> Date: 2/1/2016 15:01 (GMT-05:00)
>>> To: melvin6925 
>>> Cc: Bill Moran , pgsql-general@postgresql.org
>>> Subject: Re: [GENERAL] strange sql behavior
>>>
>>> Thanks, Bill and Melvin!
>>>
>>> Just some quick note/answers before I absorb all the information
>>> provided by Bill.
>>>
>>> 1. I don't expect many users running queries against the tables,
>>> especially the small table - since I just created it this morning, and
>>> nobody knows about it except myself.
>>>
>>> 2. The setting in the config:
>>>
>>>
>>> shared_buffers = 512MB    # min 128kB
>>> work_mem = 128MB    # min 64kB
>>>
>>> 3. I am running Windows 7 with 24 GB RAM, and my PostgreSQL is 9.4.
>>>
>>> 4.  here is the query I ran:
>>>
>>> SELECT n.nspname,
>>>s.relname,
>>>c.reltuples::bigint,
>>>n_tup_ins,
>>>n_tup_upd,
>>>n_tup_del,
>>>date_trunc('second', last_vacuum) as last_vacuum,
>>>date_trunc('second', last_autovacuum) as last_autovacuum,
>>>date_trunc('second', last_analyze) as last_analyze,
>>>date_trunc('second', last_autoanalyze) as last_autoanalyze
>>>,
>>>round( current_setting('autovacuum_vacuum_threshold')::integer +
>>> current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
>>> AS av_threshold
>>>   FROM pg_stat_all_tables s
>>>   JOIN pg_class c ON c.oid = s.relid
>>>   JOIN pg_namespace n ON (n.oid = c.relnamespace)
>>>  WHERE s.relname NOT LIKE 'pg_%'
>>>AND s.relname NOT LIKE 'sql_%'
>>>AND s.relname IN  ('data2013_01w', 'data2011_01')
>>>  ORDER by 1, 2;
>>>
>>> I copied the result from PGAdmin directly; here it is again:
>>>
>>> public;"data2011_01";784236864;784236885;0;0;"";"";"";"2016-01-19
>>> 17:31:08-06";156847423
>>> public;"data2013_01w";300786432;300786444;0;0;"";"";"2016-02-01
>>> 08:57:24-06";"2016-02-01 04:01:04-06";60157336
>>>
>>> On Mon, Feb 1, 2016 at 1:45 PM, melvin6925  wrote:
>>>
 Thanks Bill.
 Also, it's very important to include the headers with the queries!

 Marco,
 There is no top secret information that is requested, so please do not
 edit the output.



 Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
  Original message 
 From: Bill Moran 
 Date: 2/1/2016 14:41 (GMT-05:00)
 To: Yu Nie 
 Cc: Melvin Davidson ,
 pgsql-general@postgresql.org
 Subject: Re: [GENERAL] strange sql behavior


 Came a little late to the thread, see many comments inline below:

 On Mon, 1 Feb 2016 13:16:13 -0600
 Yu Nie  wrote:

 > Thanks a lot for your reply. I ran the query you suggested and here are
 > the results:
 >
 > Large table: "public";"data2011_01";784236885;0;0;"";"";"";"2016-01-19
 > 17:31:08-06";156847423
 > Small table:  "public";"data2013_01w";300786444;0;0;"";"";"2016-02-01
 > 08:57:24-06";"2016-02-01 04:01:04-06";60157336

 You didn't do Melvin's query correctly. He specified 11 columns to
 select, but you only show 10. Since you don't show the query you
 actually ran, we don't know which of the numeric columns is missing.

 More information inline below:

 >
 > On Mon, Feb 1, 2016 at 1:00 PM, Melvin Davidson >>> >
 > wrote:
 >
 > > One thing to look at is the last time both tables were
 vacuumed/analyzed.
 > >
 > > SELECT n.nspname,
 > >s.relname,
 > >c.reltuples::bigint,
 > >n_tup_ins,
 > 

Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Adrian Klaver

On 02/01/2016 01:23 PM, David G. Johnston wrote:

On Mon, Feb 1, 2016 at 2:11 PM, Adrian Klaver  wrote:

On 02/01/2016 12:52 PM, Dane Foster wrote:

On Mon, Feb 1, 2016 at 3:41 PM, Adrian Klaver
 wrote:



 As an example of where this leads see:

http://www.postgresql.org/message-id/7224.1452275...@sss.pgh.pa.us

Thanks for the heads up. The good news is all machine access to the
data will be via functions and views, so I can inline the constraint in
the right places. In other news, this sucks! I have no idea what it


I could see moving your constraint into a per row trigger.


You'd need to basically replicate the current FK constraint setup but
with custom queries...you need the insert/update trigger on the main
table and then an insert/update/delete trigger on the referenced table to
ensure that actions are simply rejected if the relevant detail on the main
table isn't changed.  Then decide whether you need something like "ON
UPDATE/DELETE CASCADE" instead of the default 'reject' behavior.

I take it you would need to ensure that these triggers are disabled
during dump/restore but am not certain on that point.


Well this brings up another part of Dane's post (that contained the
function definition):


"Unfortunately the "type" definition can't be expressed as a primary key 
so I can't use foreign keys to enforce consistency."


Not sure what exactly is meant by "type", though I suspect it is this:
"SELECT type FROM discount_codes WHERE code ..."

FYI, I know "type" is a non-reserved word, but I would avoid using it as a
column name. I went down that path and got myself confused in a hurry :)


In any case it should be pointed out that FKs do not necessarily have to 
point to PKs:


http://www.postgresql.org/docs/9.5/interactive/sql-createtable.html

"The referenced columns must be the columns of a non-deferrable unique 
or primary key constraint in the referenced table"




David J.




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread David G. Johnston
On Mon, Feb 1, 2016 at 2:11 PM, Adrian Klaver 
wrote:

> On 02/01/2016 12:52 PM, Dane Foster wrote:
>
>> On Mon, Feb 1, 2016 at 3:41 PM, Adrian Klaver  wrote:
>>
>>
>
>> As an example of where this leads see:
>>
>> http://www.postgresql.org/message-id/7224.1452275...@sss.pgh.pa.us
>>
>> Thanks for the heads up. The good news is all machine access to the
>> data will be via functions and views, so I can inline the constraint in
>> the right places. In other news, this sucks! I have no idea what it
>>
>
> I could see moving your constraint into a per row trigger.


You'd need to basically replicate the current FK constraint setup but with
custom queries...you need the insert/update trigger on the main table and
then an insert/update/delete trigger on the referenced table to ensure that
actions are simply rejected if the relevant detail on the main table isn't
changed.  Then decide whether you need something like "ON UPDATE/DELETE
CASCADE" instead of the default 'reject' behavior.

I take it you would need to ensure that these triggers are disabled during
dump/restore but am not certain on that point.

David J.
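A minimal sketch of that trigger-based replacement, reusing the function and
table names from Dane's later message (the trigger and its function are
hypothetical, and this shows only the main-table side; the referenced-table
triggers would follow the same pattern):

    CREATE FUNCTION group_codes_guard() RETURNS trigger AS $$
    BEGIN
      -- reject the row if the referenced detail does not allow it
      IF NOT discounts_enabled_for(NEW.cid, NEW.code) THEN
        RAISE EXCEPTION 'discounts not enabled for class event %', NEW.cid;
      END IF;
      RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER group_codes_guard
      BEFORE INSERT OR UPDATE ON group_codes
      FOR EACH ROW EXECUTE PROCEDURE group_codes_guard();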


Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Adrian Klaver

On 02/01/2016 12:52 PM, Dane Foster wrote:

On Mon, Feb 1, 2016 at 3:41 PM, Adrian Klaver  wrote:





As an example of where this leads see:

http://www.postgresql.org/message-id/7224.1452275...@sss.pgh.pa.us

Thanks for the heads up. The good news is all machine access to the
data will be via functions and views, so I can inline the constraint in
the right places. In other news, this sucks! I have no idea what it


I could see moving your constraint into a per row trigger.


would take to implement a more flexible constraint mechanism where these
types of dependencies can be expressed declaratively, but it would be
great if someone w/ the know-how did. As is evidenced by the fact that I
wasn't the only one to not realize the rabbit hole I was heading down,
it would be a useful feature.

As always, thanks for setting me straight,

Dane




--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] BDR error trying to replay a invalid statement

2016-02-01 Thread cchee-ob
I noticed that BDR replication is continually trying to replay a DDL
statement that has a syntax error.  Is there anything that can be done to
skip this statement, or do I need to rebuild the replicated node?  Here's
what I see in the logs:

t=2016-02-01 13:02:27 PST d= h= p=21795 a=LOCATION: 
bdr_establish_connection_and_slot, bdr.c:572
t=2016-02-01 13:02:27 PST d= h= p=21795 a=INFO:  0: starting up
replication from 1 at 125/360AF118
t=2016-02-01 13:02:27 PST d= h= p=21795 a=LOCATION:  bdr_apply_main,
bdr_apply.c:2550
t=2016-02-01 13:02:27 PST d= h= p=21795 a=DEBUG:  0: bdr_apply: BEGIN
origin(source, orig_lsn, timestamp): 125/360AF1B8, 2016-01-28
19:34:50.180915-08
t=2016-02-01 13:02:27 PST d= h= p=21795 a=LOCATION:  process_remote_begin,
bdr_apply.c:198
t=2016-02-01 13:02:27 PST d= h= p=21795 a=ERROR:  42601: syntax error at or
near "ON" at character 8
t=2016-02-01 13:02:27 PST d= h= p=21795 a=CONTEXT:  during DDL replay of ddl
statement: GRANT  ON TABLE table1 TO user2 WITH GRANT OPTION

Thanks in advance!
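(For reference, the replayed statement has lost its privilege list, which is
why the parser trips on "ON". A syntactically valid form would look like the
following - the original privileges are unknown, so ALL is only a guess:

    GRANT ALL ON TABLE table1 TO user2 WITH GRANT OPTION;
)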






Re: [GENERAL] strange sql behavior

2016-02-01 Thread Yu Nie
Melvin,

Please see attached for the requested results. I ran two queries for each
table (each with a different taxiid; the two ids are adjacent).  Note that
for the large table one query is much faster than the other because the
shared cache was used for the second query.  This does not work, however,
for the small table.

Many thanks for your willingness to help!

Best, Marco

On Mon, Feb 1, 2016 at 2:25 PM, melvin6925  wrote:

> Fine. Please rerun both explains and my queries and send ALL output to a
> file (F8 in PGADMIN Sql). Then attach the file.
>
>
>
> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>  Original message 
> From: Yu Nie 
> Date: 2/1/2016 15:17 (GMT-05:00)
> To: melvin6925 
> Subject: Re: [GENERAL] strange sql behavior
>
> Yes, absolutely.
>
> On Mon, Feb 1, 2016 at 2:12 PM, melvin6925  wrote:
>
>> And were _your_ queries run on the same day at the same time within a few
>> seconds of each other?
>>
>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>  Original message 
>> From: Yu Nie 
>> Date: 2/1/2016 15:01 (GMT-05:00)
>> To: melvin6925 
>> Cc: Bill Moran , pgsql-general@postgresql.org
>> Subject: Re: [GENERAL] strange sql behavior
>>
>> Thanks, Bill and Melvin!
>>
>> Just some quick note/answers before I absorb all the information provided
>> by Bill.
>>
>> 1. I don't expect many users running queries against the tables,
>> especially the small table - since I just created it this morning, and
>> nobody knows about it except myself.
>>
>> 2. The setting in the config:
>>
>>
>> shared_buffers = 512MB    # min 128kB
>> work_mem = 128MB    # min 64kB
>>
>> 3. I am running Windows 7 with 24 GB RAM, and my PostgreSQL is 9.4.
>>
>> 4.  here is the query I ran:
>>
>> SELECT n.nspname,
>>s.relname,
>>c.reltuples::bigint,
>>n_tup_ins,
>>n_tup_upd,
>>n_tup_del,
>>date_trunc('second', last_vacuum) as last_vacuum,
>>date_trunc('second', last_autovacuum) as last_autovacuum,
>>date_trunc('second', last_analyze) as last_analyze,
>>date_trunc('second', last_autoanalyze) as last_autoanalyze
>>,
>>round( current_setting('autovacuum_vacuum_threshold')::integer +
>> current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
>> AS av_threshold
>>   FROM pg_stat_all_tables s
>>   JOIN pg_class c ON c.oid = s.relid
>>   JOIN pg_namespace n ON (n.oid = c.relnamespace)
>>  WHERE s.relname NOT LIKE 'pg_%'
>>AND s.relname NOT LIKE 'sql_%'
>>AND s.relname IN  ('data2013_01w', 'data2011_01')
>>  ORDER by 1, 2;
>>
>> I copied the result from PGAdmin directly; here it is again:
>>
>> public;"data2011_01";784236864;784236885;0;0;"";"";"";"2016-01-19
>> 17:31:08-06";156847423
>> public;"data2013_01w";300786432;300786444;0;0;"";"";"2016-02-01
>> 08:57:24-06";"2016-02-01 04:01:04-06";60157336
>>
>> On Mon, Feb 1, 2016 at 1:45 PM, melvin6925  wrote:
>>
>>> Thanks Bill.
>>> Also, it's very important to include the headers with the queries!
>>>
>>> Marco,
>>> There is no top secret information that is requested, so please do not
>>> edit the output.
>>>
>>>
>>>
>>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>>  Original message 
>>> From: Bill Moran 
>>> Date: 2/1/2016 14:41 (GMT-05:00)
>>> To: Yu Nie 
>>> Cc: Melvin Davidson , pgsql-general@postgresql.org
>>> Subject: Re: [GENERAL] strange sql behavior
>>>
>>>
>>> Came a little late to the thread, see many comments inline below:
>>>
>>> On Mon, 1 Feb 2016 13:16:13 -0600
>>> Yu Nie  wrote:
>>>
>>> > Thanks a lot for your reply. I ran the query you suggested and here are
>>> > the results:
>>> >
>>> > Large table: "public";"data2011_01";784236885;0;0;"";"";"";"2016-01-19
>>> > 17:31:08-06";156847423
>>> > Small table:  "public";"data2013_01w";300786444;0;0;"";"";"2016-02-01
>>> > 08:57:24-06";"2016-02-01 04:01:04-06";60157336
>>>
>>> You didn't do Melvin's query correctly. He specified 11 columns to
>>> select, but you only show 10. Since you don't show the query you
>>> actually ran, we don't know which of the numeric columns is missing.
>>>
>>> More information inline below:
>>>
>>> >
>>> > On Mon, Feb 1, 2016 at 1:00 PM, Melvin Davidson 
>>> > wrote:
>>> >
>>> > > One thing to look at is the last time both tables were
>>> vacuumed/analyzed.
>>> > >
>>> > > SELECT n.nspname,
>>> > >s.relname,
>>> > >c.reltuples::bigint,
>>> > >n_tup_ins,
>>> > >n_tup_upd,
>>> > >n_tup_del,
>>> > >date_trunc('second', last_vacuum) as last_vacuum,
>>> > >date_trunc('second', last_autovacuum) as last_autovacuum,
>>> > >date_trunc('second', last_analyze) as last_analyze,
>>> > >date_trunc('second', last_autoanalyze) as last_autoanalyze
>>> > >,
>>> > >round(
>>> current_setting('autovacuum_vacuum_threshold')::integer +
>>> > > current_setting('autovacuum_vacuum_scale_factor')::num

Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Dane Foster
On Mon, Feb 1, 2016 at 3:41 PM, Adrian Klaver 
wrote:

> On 02/01/2016 12:36 PM, David G. Johnston wrote:
>
>> On Mon, Feb 1, 2016 at 1:30 PM, Dane Foster  wrote:
>>
>>
>> On Mon, Feb 1, 2016 at 2:56 PM, David G. Johnston
>>  wrote:
>>
>> On Mon, Feb 1, 2016 at 12:41 PM, Adrian Klaver
>>  wrote:
>>
>>
>> On 02/01/2016 11:17 AM, Dane Foster wrote:
>>
>> Hello,
>>
>> I'm discovering that I need to write quite a few
>> functions for use
>> strictly w/ check constraints and I'm wondering if
>> declaring the
>> volatility category for said functions will affect their
>> behavior when
>> invoked by PostgreSQL's check constraint mechanism.
>>
>>
>> Adrian's point is spot-on, but the important thing to consider
>> in this situation is that check constraints are assumed to be
>> immutable, and if you implement a check function that is not, you
>> don't get to complain when you see something broken.  The nature
>> and use of an immutable check constraint only has a single
>> dynamic - execute the function using the given values once for
>> every record INSERT or UPDATE.  There is no reason, and I
>> suspect no actual attempt, to even look at the
>> volatility category of said function before performing those
>> actions.  It is possible that two records inserted or updated in
>> the same query could make use of the caching possibilities
>> afforded by immutable functions, but if so assume it is being
>> done unconditionally.
>>
>> David J.
>>
>> Your point about ".. check constraints are assumed to be immutable
>> ..", is that in the manual? Because I don't remember reading it in
>> the constraints section, nor in the volatility categories section,
>> nor in the server programming sections. Granted, I haven't read the
>> whole manual yet, nor do I have what I've read so far memorized, but
>> I think that little fact would have struck a chord in my gray matter.
>> So if you can point me to the spot in the manual where this is
>> covered I would appreciate it.
>>
>>
>>
>> http://www.postgresql.org/docs/9.5/interactive/sql-createtable.html
>> Second Paragraph
>>
>> """
>>   CHECK ( expression ) [ NO INHERIT ]
>> The CHECK clause specifies an expression producing a Boolean result
>> which new or updated rows must satisfy for an insert or update operation
>> to succeed. Expressions evaluating to TRUE or UNKNOWN succeed. Should
>> any row of an insert or update operation produce a FALSE result, an
>> error exception is raised and the insert or update does not alter the
>> database. A check constraint specified as a column constraint should
>> reference that column's value only, while an expression appearing in a
>> table constraint can reference multiple columns.
>>
>> Currently, CHECK expressions cannot contain subqueries nor refer to
>> variables other than columns of the current row. The system column
>> tableoid may be referenced, but not any other system column.
>>
>> A constraint marked with NO INHERIT will not propagate to child tables.
>>
>> When a table has multiple CHECK constraints, they will be tested for
>> each row in alphabetical order by name, after checking NOT NULL
>> constraints. (PostgreSQL versions before 9.5 did not honor any
>> particular firing order for CHECK constraints.)
>> """
>>
>> While you've managed to fool the system by wrapping your query into a
>> function you've violated the documented restrictions and so any breakage
>> is on you - not the system.
>>
>
> As an example of where this leads see:
>
> http://www.postgresql.org/message-id/7224.1452275...@sss.pgh.pa.us
>
Thanks for the heads up. The good news is all machine access to the data
will be via functions and views, so I can inline the constraint in the right
places. In other news, this sucks! I have no idea what it would take to
implement a more flexible constraint mechanism where these types of
dependencies can be expressed declaratively, but it would be great if
someone w/ the know-how did. As is evidenced by the fact that I wasn't the
only one to not realize the rabbit hole I was heading down, it would be a
useful feature.

As always, thanks for setting me straight,

Dane


Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Karsten Hilbert
On Mon, Feb 01, 2016 at 12:41:30PM -0800, Adrian Klaver wrote:

>> While you've managed to fool the system by wrapping your query into a
>> function you've violated the documented restrictions and so any breakage
>> is on you - not the system.
> 
> As an example of where this leads see:
> 
> http://www.postgresql.org/message-id/7224.1452275...@sss.pgh.pa.us

For the record, with the help of this list I have managed to
work myself out of the corner I had painted myself into.

Karsten
-- 
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346




Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Dane Foster
On Mon, Feb 1, 2016 at 3:36 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Mon, Feb 1, 2016 at 1:30 PM, Dane Foster  wrote:
>
>>
>> On Mon, Feb 1, 2016 at 2:56 PM, David G. Johnston <
>> david.g.johns...@gmail.com> wrote:
>>
>>> On Mon, Feb 1, 2016 at 12:41 PM, Adrian Klaver <
>>> adrian.kla...@aklaver.com> wrote:
>>>
 On 02/01/2016 11:17 AM, Dane Foster wrote:

> Hello,
>
> I'm discovering that I need to write quite a few functions for use
> strictly w/ check constraints and I'm wondering if declaring the
> volatility category for said functions will affect their behavior when
> invoked by PostgreSQL's check constraint mechanism.
>

>>> Adrian's point is spot-on, but the important thing to consider in this
>>> situation is that check constraints are assumed to be immutable, and if you
>>> implement a check function that is not, you don't get to complain when you
>>> see something broken.  The nature and use of an immutable check constraint
>>> only has a single dynamic - execute the function using the given values
>>> once for every record INSERT or UPDATE.  There is no reason, and I suspect
>>> no actual attempt, to even look at the volatility category of said
>>> function before performing those actions.  It is possible that two records
>>> inserted or updated in the same query could make use of the caching
>>> possibilities afforded by immutable functions, but if so assume it is being
>>> done unconditionally.
>>>
>>> David J.
>>>
>> Your point about ".. check constraints are assumed to be immutable
>> ..", is that in the manual? Because I don't remember reading it in the
>> constraints section, nor in the volatility categories section, nor in the
>> server programming sections. Granted, I haven't read the whole manual yet,
>> nor do I have what I've read so far memorized, but I think that little fact
>> would have struck a chord in my gray matter. So if you can point me to the
>> spot in the manual where this is covered I would appreciate it.
>>
>>
>>
> http://www.postgresql.org/docs/9.5/interactive/sql-createtable.html
> Second Paragraph
>
> """
>  CHECK ( expression ) [ NO INHERIT ]
> The CHECK clause specifies an expression producing a Boolean result which
> new or updated rows must satisfy for an insert or update operation to
> succeed. Expressions evaluating to TRUE or UNKNOWN succeed. Should any row
> of an insert or update operation produce a FALSE result, an error exception
> is raised and the insert or update does not alter the database. A check
> constraint specified as a column constraint should reference that column's
> value only, while an expression appearing in a table constraint can
> reference multiple columns.
>
> Currently, CHECK expressions cannot contain subqueries nor refer to
> variables other than columns of the current row. The system column tableoid
> may be referenced, but not any other system column.
>
> A constraint marked with NO INHERIT will not propagate to child tables.
>
> When a table has multiple CHECK constraints, they will be tested for each
> row in alphabetical order by name, after checking NOT NULL constraints.
> (PostgreSQL versions before 9.5 did not honor any particular firing order
> for CHECK constraints.)
> """
>
> While you've managed to fool the system by wrapping your query into a
> function you've violated the documented restrictions and so any breakage is
> on you - not the system.
>
> Also, consider that at the time you insert a row the check constraint
> passes but then you alter the other table so that, if you tried to insert
> the row again it would fail.  Since check constraints are only evaluated
> upon INSERT/UPDATE of the data on the same table you would have a violation.
>
> So, while the documentation doesn't explicitly say that functions used in
> CHECK must be IMMUTABLE that is what it all boils down to when you put all
> of these things together.
>
> David J.
>
Though I understand the thinking you have applied to conclude that a
CHECK constraint is supposed to be IMMUTABLE, I don't necessarily agree w/
it, nor has the section you quoted made that expectation clear. When
I read it the first time, and even again now, it is not immediately apparent
that that assumption exists. But if it is true - as in, that is the intent
of the code - then it should be made explicit in the documentation.

Regards,

Dane


Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Adrian Klaver

On 02/01/2016 12:36 PM, David G. Johnston wrote:

On Mon, Feb 1, 2016 at 1:30 PM, Dane Foster  wrote:


On Mon, Feb 1, 2016 at 2:56 PM, David G. Johnston
 wrote:

On Mon, Feb 1, 2016 at 12:41 PM, Adrian Klaver
 wrote:

On 02/01/2016 11:17 AM, Dane Foster wrote:

Hello,

I'm discovering that I need to write quite a few
functions for use
strictly w/ check constraints and I'm wondering if
declaring the
volatility category for said functions will affect their
behavior when
invoked by PostgreSQL's check constraint mechanism.


Adrian's point is spot-on, but the important thing to consider
in this situation is that check constraints are assumed to be
immutable, and if you implement a check function that is not, you
don't get to complain when you see something broken.  The nature
and use of an immutable check constraint only has a single
dynamic - execute the function using the given values once for
every record INSERT or UPDATE.  There is no reason, and I
suspect no actual attempt, to even look at the
volatility category of said function before performing those
actions.  It is possible that two records inserted or updated in
the same query could make use of the caching possibilities
afforded by immutable functions, but if so assume it is being
done unconditionally.

David J.

Your point about ".. check constraints are assumed to be immutable
..", is that in the manual? Because I don't remember reading it in
the constraints section, nor in the volatility categories section,
nor in the server programming sections. Granted, I haven't read the
whole manual yet, nor do I have what I've read so far memorized, but
I think that little fact would have struck a chord in my gray matter.
So if you can point me to the spot in the manual where this is
covered I would appreciate it.



http://www.postgresql.org/docs/9.5/interactive/sql-createtable.html
Second Paragraph

"""
  CHECK ( expression ) [ NO INHERIT ]
The CHECK clause specifies an expression producing a Boolean result
which new or updated rows must satisfy for an insert or update operation
to succeed. Expressions evaluating to TRUE or UNKNOWN succeed. Should
any row of an insert or update operation produce a FALSE result, an
error exception is raised and the insert or update does not alter the
database. A check constraint specified as a column constraint should
reference that column's value only, while an expression appearing in a
table constraint can reference multiple columns.

Currently, CHECK expressions cannot contain subqueries nor refer to
variables other than columns of the current row. The system column
tableoid may be referenced, but not any other system column.

A constraint marked with NO INHERIT will not propagate to child tables.

When a table has multiple CHECK constraints, they will be tested for
each row in alphabetical order by name, after checking NOT NULL
constraints. (PostgreSQL versions before 9.5 did not honor any
particular firing order for CHECK constraints.)
"""

While you've managed to fool the system by wrapping your query into a
function you've violated the documented restrictions and so any breakage
is on you - not the system.


As an example of where this leads see:

http://www.postgresql.org/message-id/7224.1452275...@sss.pgh.pa.us



Also, consider that at the time you insert a row the check constraint
passes but then you alter the other table so that, if you tried to
insert the row again it would fail.  Since check constraints are only
evaluated upon INSERT/UPDATE of the data on the same table you would
have a violation.

So, while the documentation doesn't explicitly say that functions used
in CHECK must be IMMUTABLE that is what it all boils down to when you
put all of these things together.

David J.





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread David G. Johnston
On Mon, Feb 1, 2016 at 1:30 PM, Dane Foster  wrote:

>
> On Mon, Feb 1, 2016 at 2:56 PM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Mon, Feb 1, 2016 at 12:41 PM, Adrian Klaver  wrote:
>>
>>> On 02/01/2016 11:17 AM, Dane Foster wrote:
>>>
 Hello,

 I'm discovering that I need to write quite a few functions for use
 strictly w/ check constraints and I'm wondering if declaring the
 volatility category for said functions will affect their behavior when
 invoked by PostgreSQL's check constraint mechanism.

>>>
>> Adrian's point is spot-on, but the important thing to consider in this
>> situation is that check constraints are assumed to be immutable, and if you
>> implement a check function that is not, you don't get to complain when you
>> see something broken.  The nature and use of an immutable check constraint
>> only has a single dynamic - execute the function using the given values
>> once for every record INSERT or UPDATE.  There is no reason, and I suspect
>> no actual attempt, to even look at the volatility category of said
>> function before performing those actions.  It is possible that two records
>> inserted or updated in the same query could make use of the caching
>> possibilities afforded by immutable functions, but if so assume it is being
>> done unconditionally.
>>
>> David J.
>>
> Your point about ".. check constraints are assumed to be immutable ..",
> is that in the manual? Because I don't remember reading it in the
> constraints section, nor in the volatility categories section, nor in the
> server programming sections. Granted, I haven't read the whole manual yet,
> nor do I have what I've read so far memorized, but I think that little fact
> would have struck a chord in my gray matter. So if you can point me to the
> spot in the manual where this is covered I would appreciate it.
>
>
>
http://www.postgresql.org/docs/9.5/interactive/sql-createtable.html
Second Paragraph

"""
 CHECK ( expression ) [ NO INHERIT ]
The CHECK clause specifies an expression producing a Boolean result which
new or updated rows must satisfy for an insert or update operation to
succeed. Expressions evaluating to TRUE or UNKNOWN succeed. Should any row
of an insert or update operation produce a FALSE result, an error exception
is raised and the insert or update does not alter the database. A check
constraint specified as a column constraint should reference that column's
value only, while an expression appearing in a table constraint can
reference multiple columns.

Currently, CHECK expressions cannot contain subqueries nor refer to
variables other than columns of the current row. The system column tableoid
may be referenced, but not any other system column.

A constraint marked with NO INHERIT will not propagate to child tables.

When a table has multiple CHECK constraints, they will be tested for each
row in alphabetical order by name, after checking NOT NULL constraints.
(PostgreSQL versions before 9.5 did not honor any particular firing order
for CHECK constraints.)
"""

While you've managed to fool the system by wrapping your query into a
function you've violated the documented restrictions and so any breakage is
on you - not the system.

Also, consider that at the time you insert a row the check constraint
passes but then you alter the other table so that, if you tried to insert
the row again it would fail.  Since check constraints are only evaluated
upon INSERT/UPDATE of the data on the same table you would have a violation.

So, while the documentation doesn't explicitly say that functions used in
CHECK must be IMMUTABLE that is what it all boils down to when you put all
of these things together.

David J.


Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Dane Foster
On Mon, Feb 1, 2016 at 3:22 PM, Dane Foster  wrote:

>
> On Mon, Feb 1, 2016 at 2:41 PM, Adrian Klaver 
> wrote:
>
>> On 02/01/2016 11:17 AM, Dane Foster wrote:
>>
>>> Hello,
>>>
>>> I'm discovering that I need to write quite a few functions for use
>>> strictly w/ check constraints and I'm wondering if declaring the
>>> volatility category for said functions will affect their behavior when
>>> invoked by PostgreSQL's check constraint mechanism.
>>>
>>> Essentially what I'm trying to figure out is if volatility categories
>>> increase or decrease the latency of executing check constraints. I've
>>> done some micro benchmarks but I have no experience benchmarking
>>> anything in PostgreSQL to trust that I'm measuring the right thing. So
>>> I'm asking the experts.
>>>
>>
>> The above is sort of backwards. You need to ask what the function does
>> and from that determine what is the most appropriate volatility category.
>> For more detailed info see:
>>
>> http://www.postgresql.org/docs/9.5/interactive/xfunc-volatility.html
>>
>
> I did that already and all of the ones written so far would be STABLE.
> The gist of all of them is they check for the presence or absence of a
> particular type of thing in some other table. Unfortunately the
> "type" definition can't be expressed as a primary key, so I can't use
> foreign keys to enforce consistency.
>
>
>> It would help to see some samples of the actual functions.
>
> -- $1: The class event primary key
> -- $2: The discount code
> CREATE FUNCTION discounts_enabled_for(INT, CITXT70) RETURNS BOOLEAN AS $$
> SELECT NOT
>   CASE (SELECT type FROM discount_codes WHERE code = $2)
> WHEN 'giftcert'::DISC_CODE_TYPE THEN TRUE
> WHEN 'coupon'::DISC_CODE_TYPE
>   THEN (SELECT no_coupons FROM class_events WHERE cid = $1)
>   ELSE (SELECT no_vouchers FROM class_events WHERE cid = $1)
>   END;
> $$ LANGUAGE SQL STRICT LEAKPROOF;
> COMMENT ON FUNCTION discounts_enabled_for(INT, CITXT70) IS
> $$Determines if a class event accepts coupon or voucher discounts.$$;
>
> CREATE TABLE group_codes (
>   cid  INTEGER
>        PRIMARY KEY
>        REFERENCES class_events ON DELETE CASCADE ON UPDATE CASCADE,
>   code CITXT70
>        NOT NULL
>        REFERENCES discount_codes ON DELETE CASCADE ON UPDATE CASCADE,
>   CONSTRAINT discounts_enabled CHECK (discounts_enabled_for(cid, code))
> );
> CREATE INDEX ON group_codes USING GIN (code);
> COMMENT ON TABLE group_codes IS
> $$Discount codes that are exclusive to a set of class events.$$;
>

I just realized there is a little bug in the function body. So before anyone
gets distracted by it, I wanted to let you know that I know it exists and
has been fixed internally.

>


Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Dane Foster
On Mon, Feb 1, 2016 at 2:56 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Mon, Feb 1, 2016 at 12:41 PM, Adrian Klaver 
> wrote:
>
>> On 02/01/2016 11:17 AM, Dane Foster wrote:
>>
>>> Hello,
>>>
>>> I'm discovering that I need to write quite a few functions for use
>>> strictly w/ check constraints and I'm wondering if declaring the
>>> volatility category for said functions will affect their behavior when
>>> invoked by PostgreSQL's check constraint mechanism.
>>>
>>
> Adrian's point is spot-on, but the important thing to consider in this
> situation is that check constraints are assumed to be immutable, and if you
> implement a check function that is not, you don't get to complain when you
> see something broken.  The nature and use of an immutable check constraint
> only has a single dynamic - execute the function using the given values
> once for every record INSERT or UPDATE.  There is no reason, and I suspect
> no actual attempt, to even look at the volatility category of said
> function before performing those actions.  It is possible that two records
> inserted or updated in the same query could make use of the caching
> possibilities afforded by immutable functions, but if so assume it is being
> done unconditionally.
>
> David J.
>
Your point about ".. check constraints are assumed to be immutable ..",
is that in the manual? Because I don't remember reading it in the
constraints section, nor in the volatility categories section, nor in the
server programming sections. Granted, I haven't read the whole manual yet,
nor do I have what I've read so far memorized, but I think that little fact
would have struck a chord in my gray matter. So if you can point me to the
spot in the manual where this is covered I would appreciate it.

Thanks,

Dane


Re: [GENERAL] strange sql behavior

2016-02-01 Thread Yu Nie
Hi Klaver,

Thanks.

1. I don't see that "order by time" makes a difference - in fact, the
"analyze" output seems to indicate the sort is faster for the small table
because it uses less memory.

2. No, the large table has not been clustered.  Both tables were created
exactly the same way, loading 5-minute blocks of GIS points for all taxis
through a "copy" command (see the sketch after this list).  Once all the
data are loaded, two indexes are created: one on the taxi id and the other
on the time stamp.

3. I could not run the last test you suggested since I don't have the table
- it would take several hours to create a table with 10 days of data from
the same month.
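For reference, a minimal sketch of the load-and-index pattern described in
point 2 (the file path and the time-index name are assumptions; the taxiid
index name comes from the plan output):

    COPY data2013_01w FROM 'C:/data/block_0001.csv' WITH (FORMAT csv);
    -- ... repeated for each 5-minute block, then:
    CREATE INDEX data2013_01w_ixtaxiid ON data2013_01w (taxiid);
    CREATE INDEX data2013_01w_ixtime ON data2013_01w ("time");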


On Mon, Feb 1, 2016 at 2:21 PM, Adrian Klaver 
wrote:

> On 02/01/2016 10:35 AM, Yu Nie wrote:
>
>> Hi there,
>>
>> Recently I have been working with a large amount of taxi GIS data and have
>> encountered some weird performance issues.  I am hoping someone in this
>> community can help me figure it out.
>>
>> The taxi data were loaded in 5-minute blocks into a table.  I have two
>> separate such tables: one stores a month of data with about 700 million
>> rows, the other stores about 10 days of data with about 300 million rows.
>> The two tables have exactly the same schema and indexes. There are two
>> indexes: one on taxiid (text), and the other on the time stamp (date
>> time).  In order to process the data, I need to get all points for a
>> single taxi; to do that, I use something like:
>>   select * from table1 where taxiid = 'SZB00S41' order by time;
>> What puzzled me greatly is that this query runs consistently much faster
>> for the large table than for the small table, which seems to contradict
>> intuition.   At the end of this message you may find EXPLAIN (ANALYZE,
>> BUFFERS) results of two particular queries for the same taxiid (one for
>> each table). You can see that it took much longer (more than 20 times)
>> to get 20k rows from the small table than to get 44k rows from the
>> large table.   Interestingly it seems that the planner does expect about
>> 1/3 the work for the small table query - yet for some reason, it took much
>> longer to fetch the rows from the small table.   Why is there such a
>> huge performance difference between two seemingly identical queries
>> executed on two different tables?
>>
>> Is it because the data in the second table is on some mysteriously
>> "broken part" of the disk?  What else could explain such bizarre
>> behavior? Your help is greatly appreciated.
>>
>> The above behavior is consistent across all queries.   Another issue I
>> identified is that for the large table, the query can use the shared
>> buffers more effectively.  For example, if after you query one taxiid you
>> immediately run the same query for another taxi whose id ranks right
>> behind the first id, the shared hit buffers would be quite high (and the
>> query would run much faster); this however never works for the small
>> table.
>>
>
>
> Looks to me like the time is being taken up by the 'order by time' portion
> of the query.
>
> What happens if you run the queries without 'order by time'?
>
> What is the history of the large table, has it been CLUSTERed in the past
> for instance?
>
> If I am following the table names correctly, it looks like the data in the
> large and small tables is two years apart. Do you see the same issues when
> you run the query on a table with a month of data and then on a table with
> 10 days of data from the same month?
>
>
>
>> Thanks   a lot!
>>
>> Best, Marco
>>
>>
>> Results for the small table: it took 141 seconds to finish.  The
>> estimated total cost is 85256.31.
>>
>> "Sort  (cost=85201.05..85256.31 rows=22101 width=55) (actual
>> time=141419.499..141420.025 rows=20288 loops=1)"
>> "  Sort Key: "time""
>> "  Sort Method: quicksort  Memory: 3622kB"
>> "  Buffers: shared hit=92 read=19816"
>> " ->  Bitmap Heap Scan on data2013_01w  (cost=515.86..83606.27
>> rows=22101 width=55) (actual time=50.762..141374.777 rows=20288 loops=1)"
>> "Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
>> "Heap Blocks: exact=19826"
>> "Buffers: shared hit=92 read=19816"
>> " ->  Bitmap Index Scan on data2013_01w_ixtaxiid  (cost=0.00..510.33
>> rows=22101 width=0) (actual time=26.053..26.053 rows=20288 loops=1)"
>> "  Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
>> "  Buffers: shared hit=4 read=78"
>> "Planning time: 0.144 ms"
>> "Execution time: 141421.154 ms"
>>
>> Results for the large table: it took 5 seconds to finish.  The
>> estimated total cost is 252077.10.
>> "Sort  (cost=251913.32..252077.10 rows=65512 width=55) (actual
>> time=5038.571..5039.765 rows=44204 loops=1)"
>> "  Sort Key: "time""
>> "  Sort Method: quicksort  Memory: 7753kB"
>> "  Buffers: shared hit=2 read=7543"
>> " ->  Bitmap Heap Scan on data2011_01  (cost=1520.29..246672.53
>> rows=65512 width=55) (actual time=36.935..5017.463 rows=44204 loops=1)"
>> "Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
>> "Heap Blocks: ex

Re: [GENERAL] strange sql behavior

2016-02-01 Thread Adrian Klaver

On 02/01/2016 10:35 AM, Yu Nie wrote:

Hi there,

Recently I have been working with a large amount of taxi GIS data and have
encountered some weird performance issues.  I am hoping someone in this
community can help me figure it out.

The taxi data were loaded in 5-minute blocks into a table.  I have two
separate such tables: one stores a month of data with about 700 million
rows, the other stores about 10 days of data with about 300 million rows.
The two tables have exactly the same schema and indexes. There are two
indexes: one on taxiid (text), and the other on the time stamp (date
time).  In order to process the data, I need to get all points for a
single taxi; to do that, I use something like:
  select * from table1 where taxiid = 'SZB00S41' order by time;
What puzzled me greatly is that this query runs consistently much faster
for the large table than for the small table, which seems to contradict
intuition.   At the end of this message you may find EXPLAIN (ANALYZE,
BUFFERS) results of two particular queries for the same taxiid (one for
each table). You can see that it took much longer (more than 20 times)
to get 20k rows from the small table than to get 44k rows from the
large table.   Interestingly it seems that the planner does expect about
1/3 the work for the small table query - yet for some reason, it took much
longer to fetch the rows from the small table.   Why is there such a
huge performance difference between two seemingly identical queries
executed on two different tables?
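(For reference, the exact invocation isn't shown in the thread; it would
have been something along the lines of:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM data2013_01w WHERE taxiid = 'SZB00S41' ORDER BY "time";
)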

Is it because the data in the second table is on some mysteriously
"broken part" of the disk?  What else could explain such bizarre
behavior? Your help is greatly appreciated.

The above behavior is consistent across all queries.   Another issue I
identified is that for the large table, the query can use the shared
buffers more effectively.  For example, if after you query one taxiid you
immediately run the same query for another taxi whose id ranks right
behind the first id, the shared hit buffers would be quite high (and the
query would run much faster); this however never works for the small table.



Looks to me like the time is being taken up by the 'order by time' portion
of the query.

What happens if you run the queries without 'order by time'?

What is the history of the large table, has it been CLUSTERed in the 
past for instance?


If I am following the table names correctly, it looks like the data in the
large and small tables is two years apart. Do you see the same issues
when you run the query on a table with a month of data and then on a table
with 10 days of data from the same month?




Thanks   a lot!

Best, Marco


Results for the small table: it took 141 seconds to finish.  The
estimated total cost is 85256.31.

"Sort  (cost=85201.05..85256.31 rows=22101 width=55) (actual
time=141419.499..141420.025 rows=20288 loops=1)"
"  Sort Key: "time""
"  Sort Method: quicksort  Memory: 3622kB"
"  Buffers: shared hit=92 read=19816"
" ->  Bitmap Heap Scan on data2013_01w  (cost=515.86..83606.27
rows=22101 width=55) (actual time=50.762..141374.777 rows=20288 loops=1)"
"Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
"Heap Blocks: exact=19826"
"Buffers: shared hit=92 read=19816"
" ->  Bitmap Index Scan on data2013_01w_ixtaxiid  (cost=0.00..510.33
rows=22101 width=0) (actual time=26.053..26.053 rows=20288 loops=1)"
"  Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
"  Buffers: shared hit=4 read=78"
"Planning time: 0.144 ms"
"Execution time: 141421.154 ms"

Results for the large table: it took 5 seconds to finish.  The
estimated total cost is 252077.10.
"Sort  (cost=251913.32..252077.10 rows=65512 width=55) (actual
time=5038.571..5039.765 rows=44204 loops=1)"
"  Sort Key: "time""
"  Sort Method: quicksort  Memory: 7753kB"
"  Buffers: shared hit=2 read=7543"
" ->  Bitmap Heap Scan on data2011_01  (cost=1520.29..246672.53
rows=65512 width=55) (actual time=36.935..5017.463 rows=44204 loops=1)"
"Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
"Heap Blocks: exact=7372"
"Buffers: shared hit=2 read=7543"
" ->  Bitmap Index Scan on data2011_01_ixtaxiid  (cost=0.00..1503.92
rows=65512 width=0) (actual time=35.792..35.792 rows=44204 loops=1)"
"  Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
"  Buffers: shared hit=2 read=171"
"Planning time: 0.127 ms"
"Execution time: 5042.134 ms"



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Dane Foster
On Mon, Feb 1, 2016 at 2:41 PM, Adrian Klaver 
wrote:

> On 02/01/2016 11:17 AM, Dane Foster wrote:
>
>> Hello,
>>
>> I'm discovering that I need to write quite a few functions for use
>> strictly w/ check constraints and I'm wondering if declaring the
>> volatility category for said functions will affect their behavior when
>> invoked by PostgreSQL's check constraint mechanism.
>>
>> Essentially what I'm trying to figure out is if volatility categories
>> increase or decrease the latency of executing check constraints. I've
>> done some micro benchmarks but I have no experience benchmarking
>> anything in PostgreSQL to trust that I'm measuring the right thing. So
>> I'm asking the experts.
>>
>
> The above is sort of backwards. You need to ask what the function does and
> from that determine the most appropriate volatility category. For
> more detailed info see:
>
> http://www.postgresql.org/docs/9.5/interactive/xfunc-volatility.html
>

I did that already, and all of the ones written so far would be STABLE.
The gist of all of them is that they check for the presence or absence of
a particular type of thing in some other table. Unfortunately the "type"
definition can't be expressed as a primary key, so I can't use foreign
keys to enforce consistency.


> It would help to see some samples of the actual functions.

-- $1: The class event primary key
-- $2: The discount code
CREATE FUNCTION discounts_enabled_for(INT, CITXT70) RETURNS BOOLEAN AS $$
  SELECT NOT
    CASE (SELECT type FROM discount_codes WHERE code = $2)
      WHEN 'giftcert'::DISC_CODE_TYPE THEN TRUE
      WHEN 'coupon'::DISC_CODE_TYPE
        THEN (SELECT no_coupons FROM class_events WHERE cid = $1)
      ELSE (SELECT no_vouchers FROM class_events WHERE cid = $1)
    END;
$$ LANGUAGE SQL STRICT LEAKPROOF;
COMMENT ON FUNCTION discounts_enabled_for(INT, CITXT70) IS
$$Determines if a class event accepts coupon or voucher discounts.$$;

CREATE TABLE group_codes (
  cid  INTEGER
       PRIMARY KEY
       REFERENCES class_events ON DELETE CASCADE ON UPDATE CASCADE,
  code CITXT70
       NOT NULL
       REFERENCES discount_codes ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT discounts_enabled CHECK (discounts_enabled_for(cid, code))
);
CREATE INDEX ON group_codes USING GIN (code);
COMMENT ON TABLE group_codes IS
$$Discount codes that are exclusive to a set of class events.$$;

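A quick way to exercise the constraint (hypothetical values; this assumes
matching rows already exist in class_events and discount_codes):

INSERT INTO group_codes (cid, code) VALUES (1, 'WELCOME5');
-- raises a check_violation error if discounts_enabled_for(1, 'WELCOME5')
-- returns false
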
>
>
>> I'm running PostgreSQL 9.5 on Ubuntu Linux 15.10. I don't know if this
>> matters but this is my workstation which is a pretty zippy AlienWare X51
>> w/ 16GB RAM on a Core i7-4770 processor.
>>
>> Thanks,
>>
>> Dane
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: [GENERAL] strange sql behavior

2016-02-01 Thread melvin6925
And were _your_ queries run on the same day at the same time within a few 
seconds of each other?
Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
 Original message 
From: Yu Nie 
Date: 2/1/2016 15:01 (GMT-05:00)
To: melvin6925 
Cc: Bill Moran , pgsql-general@postgresql.org
Subject: Re: [GENERAL] strange sql behavior
Thanks, Bill and Melvin!

Just some quick notes/answers before I absorb all the information provided by
Bill.

1. I don't expect many users running queries against the tables, especially for
the small table - since I just created it this morning, and nobody knows about
it except me.

2. The setting in the config:


shared_buffers = 512MB            # min 128kB
work_mem = 128MB                # min 64kB

3. I am running Windows 7 with 24 GB RAM, and my PostgreSQL is 9.4. 

4.  here is the query I ran:

SELECT n.nspname,
   s.relname,
   c.reltuples::bigint,
   n_tup_ins,
   n_tup_upd,
   n_tup_del,
   date_trunc('second', last_vacuum) as last_vacuum,
   date_trunc('second', last_autovacuum) as last_autovacuum,
   date_trunc('second', last_analyze) as last_analyze,
   date_trunc('second', last_autoanalyze) as last_autoanalyze
   ,
   round( current_setting('autovacuum_vacuum_threshold')::integer + 
current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples) AS 
av_threshold
  FROM pg_stat_all_tables s
  JOIN pg_class c ON c.oid = s.relid
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
 WHERE s.relname NOT LIKE 'pg_%'
   AND s.relname NOT LIKE 'sql_%'
   AND s.relname IN  ('data2013_01w', 'data2011_01')
 ORDER by 1, 2;

I copied the result from PGAdmin directly; here it is again:


 
 
  public;"data2011_01";784236864;784236885;0;0;"";"";"";"2016-01-19
  17:31:08-06";156847423
  
  
 
 
  public;"data2013_01w";300786432;300786444;0;0;"";"";"2016-02-01
  08:57:24-06";"2016-02-01 04:01:04-06";60157336
 


On Mon, Feb 1, 2016 at 1:45 PM, melvin6925  wrote:
Thanks Bill. Also, it's very important to include the headers with the queries!
Marco, there is no top secret information being requested, so please do not
edit the output.


Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
 Original message 
From: Bill Moran 
Date: 2/1/2016 14:41 (GMT-05:00)
To: Yu Nie 
Cc: Melvin Davidson , pgsql-general@postgresql.org
Subject: Re: [GENERAL] strange sql behavior

Came a little late to the thread; see many comments inline below:

On Mon, 1 Feb 2016 13:16:13 -0600
Yu Nie  wrote:

> Thanks  a lot for your reply. I ran the query you suggested and here are
> the results
> 
> Large table: "public";"data2011_01";784236885;0;0;"";"";"";"2016-01-19
> 17:31:08-06";156847423
> Small table:  "public";"data2013_01w";300786444;0;0;"";"";"2016-02-01
> 08:57:24-06";"2016-02-01 04:01:04-06";60157336

You didn't do Melvin's query correctly. He specified 11 columns to
select, but you only show 10. Since you don't show the query you
actually ran, we don't know which of the numeric columns is missing.

More information inline below:

> 
> On Mon, Feb 1, 2016 at 1:00 PM, Melvin Davidson 
> wrote:
> 
> > One thing to look at is the last time both tables were vacuumed/analyzed.
> >
> > SELECT n.nspname,
> >    s.relname,
> >    c.reltuples::bigint,
> >    n_tup_ins,
> >    n_tup_upd,
> >    n_tup_del,
> >    date_trunc('second', last_vacuum) as last_vacuum,
> >    date_trunc('second', last_autovacuum) as last_autovacuum,
> >    date_trunc('second', last_analyze) as last_analyze,
> >    date_trunc('second', last_autoanalyze) as last_autoanalyze
> >    ,
> >    round( current_setting('autovacuum_vacuum_threshold')::integer +
> > current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
> > AS av_threshold
> >   FROM pg_stat_all_tables s
> >   JOIN pg_class c ON c.oid = s.relid
> >   JOIN pg_namespace n ON (n.oid = c.relnamespace)
> >  WHERE s.relname NOT LIKE 'pg_%'
> >    AND s.relname NOT LIKE 'sql_%'
> >    AND s.relname IN ('your_small_table', 'your_large_table')
> >  ORDER by 1, 2;
> >
> >
> > Also, please confirm the indexes for both tables are using the same method
> > (btree?).
> >
> > On Mon, Feb 1, 2016 at 1:35 PM, Yu Nie  wrote:
> >
> >> Hi there,
> >>
> >> Recently I am working with a large amount of taxi GIS data and have
> >> encountered some weird performance issues.  I am hoping someone in this
> >> community can help me figure it out.
> >>
> >> The taxi data were loaded in 5-minute blocks into a table.  I have two
> >> separate such tables, one stores a month of data with about 700 million
> >> rows, another stores about 10 days of data with about 300 million rows.
> >> The two tables have exactly the same schema and indexes. There are two
> >> indexes: one on taxiid (text), and the other on the time stamp (datetime).
> >> In order to process the data, I need to get all points for a single
> >> taxi; to do that, I use somet

Re: [GENERAL] strange sql behavior

2016-02-01 Thread Yu Nie
Thanks, Bill and Melvin!

Just some quick notes/answers before I absorb all the information provided
by Bill.

1. I don't expect many users running queries against the tables, especially
for the small table - since I just created it this morning, and nobody knows
about it except me.

2. The setting in the config:


shared_buffers = 512MB# min 128kB
work_mem = 128MB# min 64kB

3. I am running Windows 7 with 24 GB RAM, and my PostgreSQL is 9.4.

4.  here is the query I ran:

SELECT n.nspname,
   s.relname,
   c.reltuples::bigint,
   n_tup_ins,
   n_tup_upd,
   n_tup_del,
   date_trunc('second', last_vacuum) as last_vacuum,
   date_trunc('second', last_autovacuum) as last_autovacuum,
   date_trunc('second', last_analyze) as last_analyze,
   date_trunc('second', last_autoanalyze) as last_autoanalyze
   ,
   round( current_setting('autovacuum_vacuum_threshold')::integer +
current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
AS av_threshold
  FROM pg_stat_all_tables s
  JOIN pg_class c ON c.oid = s.relid
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
 WHERE s.relname NOT LIKE 'pg_%'
   AND s.relname NOT LIKE 'sql_%'
   AND s.relname IN  ('data2013_01w', 'data2011_01')
 ORDER by 1, 2;

I copied the result from PGAdmin directly; here it is again:

public;"data2011_01";784236864;784236885;0;0;"";"";"";"2016-01-19
17:31:08-06";156847423
public;"data2013_01w";300786432;300786444;0;0;"";"";"2016-02-01
08:57:24-06";"2016-02-01 04:01:04-06";60157336

On Mon, Feb 1, 2016 at 1:45 PM, melvin6925  wrote:

> Thanks Bill.
> Also, it's very important to include the headers with the queries!
>
> Marco,
> There is no top secret information that is requested, so please do not
> edit the output.
>
>
>
> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>  Original message 
> From: Bill Moran 
> Date: 2/1/2016 14:41 (GMT-05:00)
> To: Yu Nie 
> Cc: Melvin Davidson , pgsql-general@postgresql.org
> Subject: Re: [GENERAL] strange sql behavior
>
>
> Came a little late to the thread, see many comments inline below:
>
> On Mon, 1 Feb 2016 13:16:13 -0600
> Yu Nie  wrote:
>
> > Thanks  a lot for your reply. I ran the query you suggested and here are
> > the results
> >
> > Large table: "public";"data2011_01";784236885;0;0;"";"";"";"2016-01-19
> > 17:31:08-06";156847423
> > Small table:  "public";"data2013_01w";300786444;0;0;"";"";"2016-02-01
> > 08:57:24-06";"2016-02-01 04:01:04-06";60157336
>
> You didn't do Melvin's query correctly. He specified 11 columns to
> select, but you only show 10. Since you don't show the query you
> actually ran, we don't know which of the numeric columns is missing.
>
> More information inline below:
>
> >
> > On Mon, Feb 1, 2016 at 1:00 PM, Melvin Davidson 
> > wrote:
> >
> > > One thing to look at is the last time both tables were
> vacuumed/analyzed.
> > >
> > > SELECT n.nspname,
> > >s.relname,
> > >c.reltuples::bigint,
> > >n_tup_ins,
> > >n_tup_upd,
> > >n_tup_del,
> > >date_trunc('second', last_vacuum) as last_vacuum,
> > >date_trunc('second', last_autovacuum) as last_autovacuum,
> > >date_trunc('second', last_analyze) as last_analyze,
> > >date_trunc('second', last_autoanalyze) as last_autoanalyze
> > >,
> > >round( current_setting('autovacuum_vacuum_threshold')::integer +
> > > current_setting('autovacuum_vacuum_scale_factor')::numeric *
> C.reltuples)
> > > AS av_threshold
> > >   FROM pg_stat_all_tables s
> > >   JOIN pg_class c ON c.oid = s.relid
> > >   JOIN pg_namespace n ON (n.oid = c.relnamespace)
> > >  WHERE s.relname NOT LIKE 'pg_%'
> > >AND s.relname NOT LIKE 'sql_%'
> > >AND s.relname IN ('your_small_table', 'your_large_table')
> > >  ORDER by 1, 2;
> > >
> > >
> > > Also, please confirm the indexes for both tables are using the same
> method
> > > (btree?).
> > >
> > > On Mon, Feb 1, 2016 at 1:35 PM, Yu Nie  wrote:
> > >
> > >> Hi there,
> > >>
> > >> Recently I am working with a large amount of taxi GIS data and have
> > >> encountered some weird performance issues.  I am hoping someone in this
> > >> community can help me figure it out.
> > >>
> > >> The taxi data were loaded in 5-minute blocks into a table.  I have two
> > >> separate such tables, one stores a month of data with about 700 million
> > >> rows, another stores about 10 days of data with about 300 million rows.
> > >> The two tables have exactly the same schema and indexes. There are two
> > >> indexes: one on taxiid (text), and the other on the time stamp (datetime).
> > >> In order to process the data, I need to get all points for a single
> > >> taxi; to do that, I use something like:
> > >>  select * from table1 where taxiid = 'SZB00S41' order by time;
> > >> What puzzled me greatly is that this query runs consistently much
> > >> faster for the large table than for the small table, wh

Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread David G. Johnston
On Mon, Feb 1, 2016 at 12:41 PM, Adrian Klaver 
wrote:

> On 02/01/2016 11:17 AM, Dane Foster wrote:
>
>> Hello,
>>
>> I'm discovering that I need to write quite a few functions for use
>> strictly w/ check constraints and I'm wondering if declaring the
>> volatility category for said functions will affect their behavior when
>> invoked by PostgreSQL's check constraint mechanism.
>>
>
Adrian's point is spot-on, but the important thing to consider in this
situation is that check constraints are assumed to be immutable, and if you
implement a check function that is not, you don't get to complain when you
see something broken.  The nature and use of an immutable check constraint
has only a single dynamic: execute the function using the given values
once for every record INSERT or UPDATE.  There is no reason to, and I
suspect no actual attempt made to, even look at the volatility category of
said function before performing those actions.  It is possible that two
records inserted or updated in the same query could make use of the caching
possibilities afforded by immutable functions, but if so, assume it is
being done unconditionally.
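
A tiny illustration of the hazard, using the group_codes example from
earlier in the thread (hypothetical values; assumes matching parent rows
exist in class_events and discount_codes):

INSERT INTO group_codes (cid, code) VALUES (42, 'SPRING10');  -- passes now
UPDATE discount_codes SET type = 'coupon' WHERE code = 'SPRING10';
-- the existing group_codes row is never re-checked, so it may now
-- violate the intent of the constraint without any error being raised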

David J.


Re: [GENERAL] strange sql behavior

2016-02-01 Thread melvin6925
Thanks Bill. Also, it's very important to include the headers with the queries!
Marco, there is no top secret information being requested, so please do not
edit the output.


Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
 Original message 
From: Bill Moran 
Date: 2/1/2016 14:41 (GMT-05:00)
To: Yu Nie 
Cc: Melvin Davidson , pgsql-general@postgresql.org
Subject: Re: [GENERAL] strange sql behavior

Came a little late to the thread; see many comments inline below:

On Mon, 1 Feb 2016 13:16:13 -0600
Yu Nie  wrote:

> Thanks  a lot for your reply. I ran the query you suggested and here are
> the results
> 
> Large table: "public";"data2011_01";784236885;0;0;"";"";"";"2016-01-19
> 17:31:08-06";156847423
> Small table:  "public";"data2013_01w";300786444;0;0;"";"";"2016-02-01
> 08:57:24-06";"2016-02-01 04:01:04-06";60157336

You didn't do Melvin's query correctly. He specified 11 columns to
select, but you only show 10. Since you don't show the query you
actually ran, we don't know which of the numeric columns is missing.

More information inline below:

> 
> On Mon, Feb 1, 2016 at 1:00 PM, Melvin Davidson 
> wrote:
> 
> > One thing to look at is the last time both tables were vacuumed/analyzed.
> >
> > SELECT n.nspname,
> >    s.relname,
> >    c.reltuples::bigint,
> >    n_tup_ins,
> >    n_tup_upd,
> >    n_tup_del,
> >    date_trunc('second', last_vacuum) as last_vacuum,
> >    date_trunc('second', last_autovacuum) as last_autovacuum,
> >    date_trunc('second', last_analyze) as last_analyze,
> >    date_trunc('second', last_autoanalyze) as last_autoanalyze
> >    ,
> >    round( current_setting('autovacuum_vacuum_threshold')::integer +
> > current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
> > AS av_threshold
> >   FROM pg_stat_all_tables s
> >   JOIN pg_class c ON c.oid = s.relid
> >   JOIN pg_namespace n ON (n.oid = c.relnamespace)
> >  WHERE s.relname NOT LIKE 'pg_%'
> >    AND s.relname NOT LIKE 'sql_%'
> >    AND s.relname IN ('your_small_table', 'your_large_table')
> >  ORDER by 1, 2;
> >
> >
> > Also, please confirm the indexes for both tables are using the same method
> > (btree?).
> >
> > On Mon, Feb 1, 2016 at 1:35 PM, Yu Nie  wrote:
> >
> >> Hi there,
> >>
> >> Recently I am working with a large amount of taxi GIS data and have
> >> encountered some weird performance issues.  I am hoping someone in this
> >> community can help me figure it out.
> >>
> >> The taxi data were loaded in 5-minute blocks into a table.  I have two
> >> separate such tables, one stores a month of data with about 700 million
> >> rows, another stores about 10 days of data with about 300 million rows.
> >> The two tables have exactly the same schema and indexes. There are two
> >> indexes: one on taxiid (text), and the other on the time stamp (datetime).
> >> In order to process the data, I need to get all points for a single
> >> taxi; to do that, I use something like:
> >>  select * from table1 where taxiid = 'SZB00S41' order by time;
> >> What puzzled me greatly is that this query runs consistently much faster
> >> for the large table than for the small table, which seems to contradict
> >> intuition.  At the end of the message you may find explain (analyze,
> >> buffers) results of two particular queries for the same taxiid (one for
> >> each table). You can see that it took much longer (more than 20 times) to
> >> get 20k rows from the small table than to get 44k rows from the large
> >> table.  Interestingly, the planner does expect about 1/3 of the work for
> >> the small table query - yet for some reason, it took much longer to fetch
> >> the rows from the small table.  Why is there such a huge performance gap
> >> between the two seemingly identical queries executed on two different
> >> tables?
> >>
> >> Is it because the data in the second table is on some mysteriously
> >> "broken part" of the disk?  What else could explain such bizarre
> >> behavior?  Your help is greatly appreciated.
> >>
> >> The above behavior is consistent across all queries.  Another issue I
> >> identified is that the large table can use the shared buffers much more
> >> effectively.  For example, if you query one taxiid and immediately
> >> afterwards run the same query for another taxi whose id sorts right
> >> after the first, the shared buffer hit count is quite high (and the
> >> query runs much faster); this, however, never works for the small
> >> table.
> >>
> >> Thanks   a lot!
> >>
> >> Best, Marco
> >>
> >>
> >> Results for the small table: it took 141 seconds to finish.  The
> >> estimated total cost is 85256.31.
> >>
> >> "Sort  (cost=85201.05..85256.31 rows=22101 width=55) (actual
> >> time=141419.499..141420.025 rows=20288 loops=1)"
> >> "  Sort Key: "time""
> >> "  Sort Method: quicksort  Memory: 3622kB"
> >> "  Buffers: shared hit=92 read=19816"
> >> "  ->  Bitmap Heap Scan on data2013_01w  (co

Re: [GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Adrian Klaver

On 02/01/2016 11:17 AM, Dane Foster wrote:

Hello,

I'm discovering that I need to write quite a few functions for use
strictly w/ check constraints and I'm wondering if declaring the
volatility category for said functions will affect their behavior when
invoked by PostgreSQL's check constraint mechanism.

Essentially what I'm trying to figure out is if volatility categories
increase or decrease the latency of executing check constraints. I've
done some micro benchmarks but I have no experience benchmarking
anything in PostgreSQL to trust that I'm measuring the right thing. So
I'm asking the experts.


The above is sort of backwards. You need to ask what the function does
and from that determine the most appropriate volatility
category. For more detailed info see:


http://www.postgresql.org/docs/9.5/interactive/xfunc-volatility.html

It would help to see some samples of the actual functions.
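
For reference, the category is declared when the function is created; a
minimal sketch with a hypothetical lookup function:

CREATE FUNCTION code_exists(c TEXT) RETURNS BOOLEAN AS $$
  SELECT EXISTS (SELECT 1 FROM discount_codes WHERE code = c);
$$ LANGUAGE SQL STABLE;  -- STABLE: reads tables but does not modify them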



I'm running PostgreSQL 9.5 on Ubuntu Linux 15.10. I don't know if this
matters but this is my workstation which is a pretty zippy AlienWare X51
w/ 16GB RAM on a Core i7-4770 processor.

Thanks,

Dane



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] strange sql behavior

2016-02-01 Thread Bill Moran

Came a little late to the thread; see many comments inline below:

On Mon, 1 Feb 2016 13:16:13 -0600
Yu Nie  wrote:

> Thanks  a lot for your reply. I ran the query you suggested and here are
> the results
> 
> Large table: "public";"data2011_01";784236885;0;0;"";"";"";"2016-01-19
> 17:31:08-06";156847423
> Small table:  "public";"data2013_01w";300786444;0;0;"";"";"2016-02-01
> 08:57:24-06";"2016-02-01 04:01:04-06";60157336

You didn't do Melvin's query correctly. He specified 11 columns to
select, but you only show 10. Since you don't show the query you
actually ran, we don't know which of the numeric columns is missing.

More information inline below:

> 
> On Mon, Feb 1, 2016 at 1:00 PM, Melvin Davidson 
> wrote:
> 
> > One thing to look at is the last time both tables were vacuumed/analyzed.
> >
> > SELECT n.nspname,
> >s.relname,
> >c.reltuples::bigint,
> >n_tup_ins,
> >n_tup_upd,
> >n_tup_del,
> >date_trunc('second', last_vacuum) as last_vacuum,
> >date_trunc('second', last_autovacuum) as last_autovacuum,
> >date_trunc('second', last_analyze) as last_analyze,
> >date_trunc('second', last_autoanalyze) as last_autoanalyze
> >,
> >round( current_setting('autovacuum_vacuum_threshold')::integer +
> > current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
> > AS av_threshold
> >   FROM pg_stat_all_tables s
> >   JOIN pg_class c ON c.oid = s.relid
> >   JOIN pg_namespace n ON (n.oid = c.relnamespace)
> >  WHERE s.relname NOT LIKE 'pg_%'
> >AND s.relname NOT LIKE 'sql_%'
> >AND s.relname IN ('your_small_table', 'your_large_table')
> >  ORDER by 1, 2;
> >
> >
> > Also, please confirm the indexes for both tables are using the same method
> > (btree?).
> >
> > On Mon, Feb 1, 2016 at 1:35 PM, Yu Nie  wrote:
> >
> >> Hi there,
> >>
> >> Recently I am working with a large amount of taxi GIS data and have
> >> encountered some weird performance issues.  I am hoping someone in this
> >> community can help me figure it out.
> >>
> >> The taxi data were loaded in 5-minute blocks into a table.  I have two
> >> separate such tables, one stores a month of data with about 700 million
> >> rows, another stores about 10 days of data with about 300 million rows.
> >> The two tables have exactly the same schema and indexes. There are two
> >> indexes: one on taxiid (text), and the other on the time stamp (datetime).
> >> In order to process the data, I need to get all points for a single
> >> taxi; to do that, I use something like:
> >>  select * from table1 where taxiid = 'SZB00S41' order by time;
> >> What puzzled me greatly is that this query runs consistently much faster
> >> for the large table than for the small table, which seems to contradict
> >> intuition.  At the end of the message you may find explain (analyze,
> >> buffers) results of two particular queries for the same taxiid (one for
> >> each table). You can see that it took much longer (more than 20 times) to
> >> get 20k rows from the small table than to get 44k rows from the large
> >> table.  Interestingly, the planner does expect about 1/3 of the work for
> >> the small table query - yet for some reason, it took much longer to fetch
> >> the rows from the small table.  Why is there such a huge performance gap
> >> between the two seemingly identical queries executed on two different
> >> tables?
> >>
> >> Is it because the data in the second table is on some mysteriously
> >> "broken part" of the disk?  What else could explain such bizarre
> >> behavior?  Your help is greatly appreciated.
> >>
> >> The above behavior is consistent across all queries.  Another issue I
> >> identified is that the large table can use the shared buffers much more
> >> effectively.  For example, if you query one taxiid and immediately
> >> afterwards run the same query for another taxi whose id sorts right
> >> after the first, the shared buffer hit count is quite high (and the
> >> query runs much faster); this, however, never works for the small
> >> table.
> >>
> >> Thanks   a lot!
> >>
> >> Best, Marco
> >>
> >>
> >> Results for the small table: it took 141 seconds to finish.  The
> >> estimated total cost is 85256.31.
> >>
> >> "Sort  (cost=85201.05..85256.31 rows=22101 width=55) (actual
> >> time=141419.499..141420.025 rows=20288 loops=1)"
> >> "  Sort Key: "time""
> >> "  Sort Method: quicksort  Memory: 3622kB"
> >> "  Buffers: shared hit=92 read=19816"
> >> "  ->  Bitmap Heap Scan on data2013_01w  (cost=515.86..83606.27
> >> rows=22101 width=55) (actual time=50.762..141374.777 rows=20288 loops=1)"
> >> "Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
> >> "Heap Blocks: exact=19826"
> >> "Buffers: shared hit=92 read=19816"

^^
Note that despite this table being smaller, Postgres had to read 19816
blocks from disk, which is 2.5x more than for the larger table.
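
That pattern usually points at physical row ordering: the small table's
rows for one taxi are scattered over far more heap pages. One quick way
to compare the two tables (a sketch; pg_stats.correlation is near 1 when
the heap is physically ordered by the column and near 0 when rows are
scattered):

SELECT tablename, attname, correlation
  FROM pg_stats
 WHERE tablename IN ('data2011_01', 'data2013_01w')
   AND attname = 'taxiid';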

> >> "->  Bitmap Index Scan on data20

[GENERAL] Check constraints and function volatility categories

2016-02-01 Thread Dane Foster
Hello,

I'm discovering that I need to write quite a few functions for use strictly
w/ check constraints and I'm wondering if declaring the volatility category
for said functions will affect their behavior when invoked by PostgreSQL's
check constraint mechanism.

Essentially what I'm trying to figure out is if volatility categories
increase or decrease the latency of executing check constraints. I've done
some micro benchmarks but I have no experience benchmarking anything in
PostgreSQL to trust that I'm measuring the right thing. So I'm asking the
experts.

I'm running PostgreSQL 9.5 on Ubuntu Linux 15.10. I don't know if this
matters but this is my workstation which is a pretty zippy AlienWare X51 w/
16GB RAM on a Core i7-4770 processor.

Thanks,

Dane


Re: [GENERAL] strange sql behavior

2016-02-01 Thread Yu Nie
Hi Melvin,

Thanks a lot for your reply. I ran the query you suggested, and here are
the results:

Large table: "public";"data2011_01";784236885;0;0;"";"";"";"2016-01-19
17:31:08-06";156847423
Small table:  "public";"data2013_01w";300786444;0;0;"";"";"2016-02-01
08:57:24-06";"2016-02-01 04:01:04-06";60157336

I did the "analyze" on the small table once it was created and indexed. The
table was not subject to any "update/delete/insert" operations since it was
created this morning.

Also, yes, the indexes for both tables were created using exactly the same
method: btree. (They were created using the same piece of code, through a
C++ API.)

Best, Marco



On Mon, Feb 1, 2016 at 1:00 PM, Melvin Davidson 
wrote:

> One thing to look at is the last time both tables were vacuumed/analyzed.
>
> SELECT n.nspname,
>s.relname,
>c.reltuples::bigint,
>n_tup_ins,
>n_tup_upd,
>n_tup_del,
>date_trunc('second', last_vacuum) as last_vacuum,
>date_trunc('second', last_autovacuum) as last_autovacuum,
>date_trunc('second', last_analyze) as last_analyze,
>date_trunc('second', last_autoanalyze) as last_autoanalyze
>,
>round( current_setting('autovacuum_vacuum_threshold')::integer +
> current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
> AS av_threshold
>   FROM pg_stat_all_tables s
>   JOIN pg_class c ON c.oid = s.relid
>   JOIN pg_namespace n ON (n.oid = c.relnamespace)
>  WHERE s.relname NOT LIKE 'pg_%'
>AND s.relname NOT LIKE 'sql_%'
>AND s.relname IN ('your_small_table', 'your_large_table')
>  ORDER by 1, 2;
>
>
> Also, please confirm the indexes for both tables are using the same method
> (btree?).
>
> On Mon, Feb 1, 2016 at 1:35 PM, Yu Nie  wrote:
>
>> Hi there,
>>
>> Recently I am working with a large amount of taxi GIS data and have
>> encountered some weird performance issues.  I am hoping someone in this
>> community can help me figure it out.
>>
>> The taxi data were loaded in 5-minute blocks into a table.  I have two
>> separate such tables, one stores a month of data with about 700 million
>> rows, another stores about 10 days of data with about 300 million rows.
>> The two tables have exactly the same schema and indexes. There are two
>> indexes: one on taxiid (text), and the other on the time stamp (datetime).
>> In order to process the data, I need to get all points for a single
>> taxi; to do that, I use something like:
>>  select * from table1 where taxiid = 'SZB00S41' order by time;
>> What puzzled me greatly is that this query runs consistently much faster
>> for the large table than for the small table, which seems to contradict
>> intuition.  At the end of the message you may find explain (analyze,
>> buffers) results of two particular queries for the same taxiid (one for
>> each table). You can see that it took much longer (more than 20 times) to
>> get 20k rows from the small table than to get 44k rows from the large
>> table.  Interestingly, the planner does expect about 1/3 of the work for
>> the small table query - yet for some reason, it took much longer to fetch
>> the rows from the small table.  Why is there such a huge performance gap
>> between the two seemingly identical queries executed on two different
>> tables?
>>
>> Is it because the data in the second table is on some mysteriously
>> "broken part" of the disk?  What else could explain such bizarre
>> behavior?  Your help is greatly appreciated.
>>
>> The above behavior is consistent across all queries.  Another issue I
>> identified is that the large table can use the shared buffers much more
>> effectively.  For example, if you query one taxiid and immediately
>> afterwards run the same query for another taxi whose id sorts right
>> after the first, the shared buffer hit count is quite high (and the
>> query runs much faster); this, however, never works for the small
>> table.
>>
>> Thanks   a lot!
>>
>> Best, Marco
>>
>>
>> Results for the small table: it took 141 seconds to finish.  The
>> estimated total cost is 85256.31.
>>
>> "Sort  (cost=85201.05..85256.31 rows=22101 width=55) (actual
>> time=141419.499..141420.025 rows=20288 loops=1)"
>> "  Sort Key: "time""
>> "  Sort Method: quicksort  Memory: 3622kB"
>> "  Buffers: shared hit=92 read=19816"
>> "  ->  Bitmap Heap Scan on data2013_01w  (cost=515.86..83606.27
>> rows=22101 width=55) (actual time=50.762..141374.777 rows=20288 loops=1)"
>> "Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
>> "Heap Blocks: exact=19826"
>> "Buffers: shared hit=92 read=19816"
>> "->  Bitmap Index Scan on data2013_01w_ixtaxiid
>> (cost=0.00..510.33 rows=22101 width=0) (actual time=26.053..26.053
>> rows=20288 loops=1)"
>> "  Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
>> "  Buffers: shared hit=4 read=78"
>> "Planning time: 0.144 ms"
>> "Execution time: 141421.154 ms"
>>
>> Results for the large table: i

Re: [GENERAL] strange sql behavior

2016-02-01 Thread Melvin Davidson
One thing to look at is the last time both tables were vacuumed/analyzed.

SELECT n.nspname,
   s.relname,
   c.reltuples::bigint,
   n_tup_ins,
   n_tup_upd,
   n_tup_del,
   date_trunc('second', last_vacuum) as last_vacuum,
   date_trunc('second', last_autovacuum) as last_autovacuum,
   date_trunc('second', last_analyze) as last_analyze,
   date_trunc('second', last_autoanalyze) as last_autoanalyze
   ,
   round( current_setting('autovacuum_vacuum_threshold')::integer +
current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
AS av_threshold
  FROM pg_stat_all_tables s
  JOIN pg_class c ON c.oid = s.relid
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
 WHERE s.relname NOT LIKE 'pg_%'
   AND s.relname NOT LIKE 'sql_%'
   AND s.relname IN ('your_small_table', 'your_large_table')
 ORDER by 1, 2;
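
With default settings (autovacuum_vacuum_threshold = 50,
autovacuum_vacuum_scale_factor = 0.2) and the row counts Marco later
reported, the av_threshold column works out to, for example:

round(50 + 0.2 * 784236864) = 156847423  -- rows that must change before
                                         -- autovacuum kicks in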


Also, please confirm the indexes for both tables are using the same method
(btree?).

On Mon, Feb 1, 2016 at 1:35 PM, Yu Nie  wrote:

> Hi there,
>
> Recently I am working with a large amount of taxi GIS data and have
> encountered some weird performance issues.  I am hoping someone in this
> community can help me figure it out.
>
> The taxi data were loaded in 5-minute blocks into a table.  I have two
> separate such tables, one stores a month of data with about 700 million
> rows, another stores about 10 days of data with about 300 million rows.
> The two tables have exactly the same schema and indexes. There are two
> indexes: one on taxiid (text), and the other on the time stamp (datetime).
> In order to process the data, I need to get all points for a single
> taxi; to do that, I use something like:
>  select * from table1 where taxiid = 'SZB00S41' order by time;
> What puzzled me greatly is that this query runs consistently much faster
> for the large table than for the small table, which seems to contradict
> intuition.  At the end of the message you may find explain (analyze,
> buffers) results of two particular queries for the same taxiid (one for
> each table). You can see that it took much longer (more than 20 times) to
> get 20k rows from the small table than to get 44k rows from the large
> table.  Interestingly, the planner does expect about 1/3 of the work for
> the small table query - yet for some reason, it took much longer to fetch
> the rows from the small table.  Why is there such a huge performance gap
> between the two seemingly identical queries executed on two different
> tables?
>
> Is it because the data in the second table is on some mysteriously
> "broken part" of the disk?  What else could explain such bizarre
> behavior?  Your help is greatly appreciated.
>
> The above behavior is consistent across all queries.  Another issue I
> identified is that the large table can use the shared buffers much more
> effectively.  For example, if you query one taxiid and immediately
> afterwards run the same query for another taxi whose id sorts right
> after the first, the shared buffer hit count is quite high (and the
> query runs much faster); this, however, never works for the small
> table.
>
> Thanks   a lot!
>
> Best, Marco
>
>
> Results for the small table: it took 141 seconds to finish.  The
> estimated total cost is 85256.31.
>
> "Sort  (cost=85201.05..85256.31 rows=22101 width=55) (actual
> time=141419.499..141420.025 rows=20288 loops=1)"
> "  Sort Key: "time""
> "  Sort Method: quicksort  Memory: 3622kB"
> "  Buffers: shared hit=92 read=19816"
> "  ->  Bitmap Heap Scan on data2013_01w  (cost=515.86..83606.27 rows=22101
> width=55) (actual time=50.762..141374.777 rows=20288 loops=1)"
> "Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
> "Heap Blocks: exact=19826"
> "Buffers: shared hit=92 read=19816"
> "->  Bitmap Index Scan on data2013_01w_ixtaxiid
> (cost=0.00..510.33 rows=22101 width=0) (actual time=26.053..26.053
> rows=20288 loops=1)"
> "  Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
> "  Buffers: shared hit=4 read=78"
> "Planning time: 0.144 ms"
> "Execution time: 141421.154 ms"
>
> Results for the large table: it took 5 seconds to finish.  The
> estimated total cost is 252077.10.
> "Sort  (cost=251913.32..252077.10 rows=65512 width=55) (actual
> time=5038.571..5039.765 rows=44204 loops=1)"
> "  Sort Key: "time""
> "  Sort Method: quicksort  Memory: 7753kB"
> "  Buffers: shared hit=2 read=7543"
> "  ->  Bitmap Heap Scan on data2011_01  (cost=1520.29..246672.53
> rows=65512 width=55) (actual time=36.935..5017.463 rows=44204 loops=1)"
> "Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
> "Heap Blocks: exact=7372"
> "Buffers: shared hit=2 read=7543"
> "->  Bitmap Index Scan on data2011_01_ixtaxiid
> (cost=0.00..1503.92 rows=65512 width=0) (actual time=35.792..35.792
> rows=44204 loops=1)"
> "  Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
> "  Buffers: shared hit=2 read=171"
> "

[GENERAL] strange sql behavior

2016-02-01 Thread Yu Nie
Hi there,

Recently I am working with a large amount of taxi GIS data and have
encountered some weird performance issues.  I am hoping someone in this
community can help me figure it out.

The taxi data were loaded in 5-minute blocks into a table.  I have two
separate such tables, one stores a month of data with about 700 million
rows, another stores about 10 days of data with about 300 million rows.
The two tables have exactly the same schema and indexes. There are two
indexes: one on taxiid (text), and the other on the time stamp (datetime).
In order to process the data, I need to get all points for a single
taxi; to do that, I use something like:
 select * from table1 where taxiid = 'SZB00S41' order by time;
What puzzled me greatly is that this query runs consistently much faster
for the large table than for the small table, which seems to contradict
intuition.  At the end of the message you may find explain (analyze,
buffers) results of two particular queries for the same taxiid (one for
each table). You can see that it took much longer (more than 20 times) to
get 20k rows from the small table than to get 44k rows from the large
table.  Interestingly, the planner does expect about 1/3 of the work for
the small table query - yet for some reason, it took much longer to fetch
the rows from the small table.  Why is there such a huge performance gap
between the two seemingly identical queries executed on two different
tables?

Is it because the data in the second table is on some mysteriously "broken
part" of the disk?  What else could explain such bizarre behavior?  Your
help is greatly appreciated.

The above behavior is consistent across all queries.  Another issue I
identified is that the large table can use the shared buffers much more
effectively.  For example, if you query one taxiid and immediately
afterwards run the same query for another taxi whose id sorts right after
the first, the shared buffer hit count is quite high (and the query runs
much faster); this, however, never works for the small table.

Thanks a lot!

Best, Marco


Results for the small table: it took 141 seconds to finish.  The
estimated total cost is 85256.31.

"Sort  (cost=85201.05..85256.31 rows=22101 width=55) (actual
time=141419.499..141420.025 rows=20288 loops=1)"
"  Sort Key: "time""
"  Sort Method: quicksort  Memory: 3622kB"
"  Buffers: shared hit=92 read=19816"
"  ->  Bitmap Heap Scan on data2013_01w  (cost=515.86..83606.27 rows=22101
width=55) (actual time=50.762..141374.777 rows=20288 loops=1)"
"Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
"Heap Blocks: exact=19826"
"Buffers: shared hit=92 read=19816"
"->  Bitmap Index Scan on data2013_01w_ixtaxiid  (cost=0.00..510.33
rows=22101 width=0) (actual time=26.053..26.053 rows=20288 loops=1)"
"  Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
"  Buffers: shared hit=4 read=78"
"Planning time: 0.144 ms"
"Execution time: 141421.154 ms"

Results for the large table: it took 5 seconds to finish.  The
estimated total cost is 252077.10.
"Sort  (cost=251913.32..252077.10 rows=65512 width=55) (actual
time=5038.571..5039.765 rows=44204 loops=1)"
"  Sort Key: "time""
"  Sort Method: quicksort  Memory: 7753kB"
"  Buffers: shared hit=2 read=7543"
"  ->  Bitmap Heap Scan on data2011_01  (cost=1520.29..246672.53 rows=65512
width=55) (actual time=36.935..5017.463 rows=44204 loops=1)"
"Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
"Heap Blocks: exact=7372"
"Buffers: shared hit=2 read=7543"
"->  Bitmap Index Scan on data2011_01_ixtaxiid  (cost=0.00..1503.92
rows=65512 width=0) (actual time=35.792..35.792 rows=44204 loops=1)"
"  Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
"  Buffers: shared hit=2 read=171"
"Planning time: 0.127 ms"
"Execution time: 5042.134 ms"


Re: [GENERAL] BDR replication on Postgresql 9.5.0

2016-02-01 Thread Kaushal Shriyan
On Mon, Feb 1, 2016 at 12:24 AM, John R Pierce  wrote:

> On 1/29/2016 3:00 AM, Kaushal Shriyan wrote:
>
>>
>> Do I need to install any BDR-specific package to enable it in PostgreSQL
>> 9.5? While reading
>> http://bdr-project.org/docs/0.9.0/install-requirements.html I assumed
>> that it is available in 9.5 by default, without using any patches.
>> Please comment.
>>
>
> "As of the time of writing, the upcoming PostgreSQL 9.5 release is not yet
> supported. Neither is Microsoft Windows. Support for both will be added in
> later releases; please check the BDR website for the latest information."
>
>
>
Thanks, Pierce, for the explanation.


Re: [GENERAL] Is PRIMARY KEY the same as UNIQUE NOT NULL?

2016-02-01 Thread Geoff Winkless
On 31 January 2016 at 19:53, David G. Johnston
 wrote:
> A PRIMARY KEY enforces a UNIQUE, NOT NULL constraint and additionally allows
[snip]

I would just remove the whole paragraph. A primary key does what it
does, a unique constraint does what it does. I'm not really sure why
you need to link them.

I would just start with

"A primary key indicates that a column or group of columns can be used
as a unique identifier for rows in the table."

before the examples, then remove the "Technically" and the whole
parenthesised comment about unique keys and nulls, so the next line
after the examples becomes

"Primary keys are useful both for documentation purposes and for
client applications. For example, a GUI application that allows
modifying row values probably needs to know the primary key of a table
to be able to identify rows uniquely."

I just think it's unnecessarily confusing to start suggesting that
there's some equivalency when you then need to clarify that actually
they're not really equivalent.
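
For what it's worth, the two forms being contrasted, as bare DDL (toy
table names):

CREATE TABLE t1 (id INTEGER PRIMARY KEY);
CREATE TABLE t2 (id INTEGER UNIQUE NOT NULL);
-- both enforce uniqueness and non-nullness; only t1.id is the default
-- target of a column-less REFERENCES t1 clause, and it is what GUI tools
-- typically look up to identify rows uniquely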

Geoff




Re: [GENERAL] PL/pgSQL debugger

2016-02-01 Thread Albe Laurenz
Dane Foster wrote:
> I googled "plpgsql debugger" and eventually ended up at 
> http://pgfoundry.org/projects/edb-debugger/,
> where the latest release is almost 8 years old. I am aware that this is not 
> an EDB forum but given
> that I sent them an e-mail at 9AM EST and I haven't received a response yet I 
> figured it can't hurt to
> try my question here. Is anyone using it on recent (e.g., 9.5) versions of 
> PostgreSQL. If not, is
> there any consensus on what one should be using to debug PL/pgSQL functions?

The current version lives here:
http://git.postgresql.org/gitweb/?p=pldebugger.git
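
For anyone finding this later, the usual setup, sketched from the
pldebugger README (details may vary by version):

shared_preload_libraries = 'plugin_debugger'  # postgresql.conf, then restart
CREATE EXTENSION pldbgapi;                    -- in the target database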

Yours,
Laurenz Albe
