Re: [GENERAL] Increase in max_connections

2014-03-14 Thread Anand Kumar, Karthik
For anyone that's still following - we tried upgrading to postgres 9.3.3 -
that hasn't helped.

Running an strace on the pid that was consuming the highest CPU at the
time of the outage shows:

semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91881569, {{12, 1, 0}}, 1)= 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(89325587, {{14, 1, 0}}, 1)= 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(90964037, {{4, 1, 0}}, 1) = 0
semop(90308657, {{5, 1, 0}}, 1) = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(88866821, {{12, 1, 0}}, 1)= 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(90439733, {{13, 1, 0}}, 1)= 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(90341426, {{2, 1, 0}}, 1) = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(90308657, {{5, 1, 0}}, 1) = 0
semop(91881569, {{12, 1, 0}}, 1)= 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(88866821, {{12, 1, 0}}, 1)= 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91881569, {{12, 1, 0}}, 1)= 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(90865730, {{5, 1, 0}}, 1) = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(91521110, {{12, -1, 0}}, 1)   = 0
semop(90865730, {{5, 1, 0}}, 1) = 0

I've seen other people mention this problem of a flood of semop calls, but
I haven't yet found a clear solution.
Anyone have any ideas?

I've also downloaded the perf tool based on
http://rhaas.blogspot.com/2012/06/perf-good-bad-ugly.html - will see what
that has to show.

Thanks,
Karthik


On 3/11/14 1:06 PM, "John R Pierce"  wrote:

>On 3/11/2014 10:20 AM, Anand Kumar, Karthik wrote:
>> We typically see about 500-700 active queries at a time
>
>if these are primarily small/fast queries, like OLTP operations, and you
>DON'T have 200-400 CPU cores on this server, you will likely find that if
>you use a queueing mechanism to only execute about 2X your CPU core
>count concurrently, you will get MORE total transactions/second than
>trying to do 500-700 at once.
>
>if your apps are using persistent connections, then the session pooling
>model won't do any good; you should use transaction pooling.  You want
>the actual active query count to be tunable, probably down around 2X the
>CPU core count, depending on various things.  Some folks say CPU
>cores/threads plus disk spindles is the optimal number.
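A transaction-pooling setup of the kind described above could be sketched
with PgBouncer; this is illustrative only, and the database name and pool
sizes are assumptions, not recommendations for this particular server:

```ini
; pgbouncer.ini - rough sketch; "mydb" and the sizes below are placeholders
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
pool_mode = transaction      ; transaction pooling, as suggested above
default_pool_size = 32       ; roughly 2x CPU cores on a 16-core box
max_client_conn = 1000       ; the 500-700 app connections queue here
```

The apps then connect to port 6432 instead of 5432, and only
default_pool_size queries actually run concurrently on the server.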
>
>
>
>-- 
>john r pierce  37N 122W
>somewhere on the middle of the left coast
>
>
>
>-- 
>Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>To make changes to your subscription:
>http://www.postgresql.org/mailpref/pgsql-general





[GENERAL] Preventing GIN fastupdate from slowing down normal queries

2014-03-14 Thread Zev Benjamin

Hi all,

I'm running PostgreSQL 9.1 on a fairly beefy server with a lot of RAM, 
so I generally want work_mem set pretty high.  One of my tables has a 
GIN index, and, as a consequence of the high work_mem setting, its 
fastupdate pending list can grow very large.  This leads to the 
occasional INSERT or UPDATE taking inordinately long as the pending list 
is flushed.


My thinking was that I could prevent the list from growing too large by 
setting the autovacuum storage parameters on the table such that the 
autovacuum process will run after a reasonable number of INSERTs or 
UPDATEs.  However, the table is mostly INSERT-only.  Therefore, only the 
autovacuum_analyze_* parameters will actually do anything to affect when 
the autovacuumer will run, but when it does, it will do a VACUUM ANALYZE 
instead of a plain VACUUM.  I don't particularly need the table to be 
re-analyzed that often.


Are there any other good solutions for preventing the fastupdate 
mechanism from impacting normal queries?  I would also consider just 
turning fastupdate off, but my understanding is that can lead to the 
index becoming less efficient.  Is there a way to set work_mem low just 
for the purpose of sizing the pending list?
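For reference, two workarounds sometimes suggested for this; the index and
table names below are placeholders, and this is a sketch rather than a
recommendation:

```sql
-- Sketch only; "my_gin_idx" and "my_table" are hypothetical names.
-- Option 1: turn fastupdate off for just this index, so new entries go
-- straight into the index. Individual INSERTs get somewhat slower, but
-- predictably so, with no large pending-list flush.
ALTER INDEX my_gin_idx SET (fastupdate = off);

-- Option 2: keep fastupdate on, but flush the pending list on a schedule
-- with a plain VACUUM (e.g. from cron), instead of waiting for autovacuum
-- or an unlucky INSERT to trigger the flush.
VACUUM my_table;
```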



Thanks,
Zev




Re: [GENERAL] High Level Committers Wanted

2014-03-14 Thread Andy Colson

On 3/14/2014 6:08 AM, Antman, Jason (CMG-Atlanta) wrote:

I'm not a "high level committer", nor am I even a regular poster to this
list.

not saying this post is true, but... If I'm reading between the lines
correctly, this could make for quite a striking headline in the tech
news - "NSA dumps Oracle proprietary database in favor of PostgreSQL."



No no.   bobJobS is from Oracle.  They are trying 
to lure all the PG developers to a secret meeting  ...  where they will 
all disappear.  Problem solved. :-)


-Andy





Re: [GENERAL] Can't restart statistics collection and autovacuum

2014-03-14 Thread Ignacio Colmenero

Thanks Tom.
The database had been up for 6 months with no problems at all, so, 
going by your comment, it must have been a change that took effect when the 
server was rebooted. I need someone to help me go through the server rules 
to find out what's wrong (or different) with them.
Regarding the server reboot: our server is hosted by Rackspace and, for 
some reason, it ran out of memory and they rebooted it; that's something 
I also need to check out, but that's for another post.


Thanks.

On 14/03/2014 1:13 PM, Tom Lane wrote:

Ignacio Colmenero  writes:

Yesterday, the database server was rebooted and database wasn't shut
down properly.
When it went back online, the following messages showed up:
2014-03-13 08:46:08 UTC   LOG:  08006: test message did not get through
on socket for statistics collector

The only known reason for that to happen is kernel packet filtering rules
that prevent passing traffic through the stats-collector socket.  It seems
very unlikely that this "just happened" without somebody/something outside
Postgres actively breaking it.  I'd take a close look at recent system
configuration changes, particularly in the firewall area.  It wouldn't
hurt to know exactly *why* the system was rebooted, too.

regards, tom lane



--

/Ignacio Colmenero/
Database Analyst
Abra Controls Inc.
P# (403) 775-2237
ignacio.colmen...@abracontrols.com 





Re: [GENERAL] Can't restart statistics collection and autovacuum

2014-03-14 Thread Tom Lane
Ignacio Colmenero  writes:
> Yesterday, the database server was rebooted and database wasn't shut 
> down properly.
> When it went back online, the following messages showed up:
> 2014-03-13 08:46:08 UTC   LOG:  08006: test message did not get through 
> on socket for statistics collector

The only known reason for that to happen is kernel packet filtering rules
that prevent passing traffic through the stats-collector socket.  It seems
very unlikely that this "just happened" without somebody/something outside
Postgres actively breaking it.  I'd take a close look at recent system
configuration changes, particularly in the firewall area.  It wouldn't
hurt to know exactly *why* the system was rebooted, too.

regards, tom lane




[GENERAL] Can't restart statistics collection and autovacuum

2014-03-14 Thread Ignacio Colmenero

Hi Guys.
Yesterday, the database server was rebooted and database wasn't shut 
down properly.

When it went back online, the following messages showed up:
2014-03-13 08:46:08 UTC   LOG:  08006: test message did not get through 
on socket for statistics collector

2014-03-13 08:46:08 UTC   LOCATION:  pgstat_init, pgstat.c:463
2014-03-13 08:46:08 UTC   LOG:  0: disabling statistics collector 
for lack of working socket

2014-03-13 08:46:08 UTC   LOCATION:  pgstat_init, pgstat.c:521
2014-03-13 08:46:08 UTC   WARNING:  01000: autovacuum not started 
because of misconfiguration

2014-03-13 08:46:08 UTC   HINT:  Enable the "track_counts" option.
2014-03-13 08:46:08 UTC   LOCATION:  autovac_init, autovacuum.c:2809
2014-03-13 08:46:08 UTC   LOG:  0: database system was interrupted; 
last known up at 2014-03-13 08:41:49 UTC


I enabled the track_counts option, but when I checked it, it still shows 
as "off".

I have 2 questions:
1. Any ideas how to fix this? I've googled it, checked this list, and 
tried everything so far with no results.
2. Any link to a script I could put in rc6.d to avoid this mishap in the 
future?


PostgreSQL version is 9.1.3
OS:  Ubuntu 11.10 (GNU/Linux 3.0.0-12-virtual x86_64)
--

/Ignacio Colmenero/
Database Analyst
Abra Controls Inc.
P# (403) 775-2237
ignacio.colmen...@abracontrols.com 





Re: [GENERAL] Cannot insert to 'path' field using EclipseLink

2014-03-14 Thread Daryl Foster
Turns out that it was a JBoss problem. I had to remove the driver jar from
the deployed EAR file and set a dependency to the module in the JBoss
server.

This link has the details: https://community.jboss.org/message/862434

Thanks for pointing me in the right direction.


On Wed, Mar 12, 2014 at 1:23 PM, Rob Sargent  wrote:

> On 03/12/2014 11:12 AM, Daryl Foster wrote:
>
>> I've manually entered some data into table to see if I could successfully
>> pull data from the table using EclipseLink. Now the
>> convertDataValueToObjectValue method in my Converter throws the following
>> exception:
>>
>> java.lang.ClassCastException: org.postgresql.geometric.PGpath cannot be
>> cast to org.postgresql.geometric.PGpath
>>
>>  recompile everything with the same version of java? double check
> dependencies?
>
>
>



-- 

Sincerely,

*Daryl Foster*
*Software Engineer*

On Center Software, Inc.

*P:* 281.210.0177
*F:* 281.297.9001
*W:* *www.oncenter.com *


Re: [GENERAL] pg_dump fails pg_rewrite entry not found

2014-03-14 Thread Jakub Can
I am terribly sorry. I had been sunk in this for about 2 days, and just a
few minutes after I posted this I was told by a third person that I had been
searching for the offending OID in pg_rewrite in the postgres database
instead of eb3_nz. So the usual delete from pg_rewrite where oid = 1001837 worked.
Thanks a lot.
Thanks a lot.

On 03/14/2014 03:11 PM, Adrian Klaver wrote:
> On 03/14/2014 06:56 AM, Jakub Can wrote:
>> Hello, our database suddenly broke somehow. We still don't know whether
>> it is because of a hardware failure etc.; anyway, when I try to make a
>> dump using pg_dump or pg_dumpall I get an error message like this:
>>
>> pg_dump: failed sanity check, parent table OID 1001834 of pg_rewrite
>> entry OID 1001837 not found
>> pg_dumpall: pg_dump failed on database "eb3_nz", exiting
>>
>> I have tried to search for such an error and have found a lot of
>> suggested solutions, but nothing worked for me, because I did not find
>> any record in pg_class, pg_rewrite, etc. referencing those OIDs
>> 1001834 or 1001837. The version of postgres is 9.2.2.
>>
>> Please is there any other place, where to look for the error?
>
> So 'select * from pg_rewrite where oid = 1001837' finds nothing?
>
>>
>> Thank you in advance,
>> Jakub.
>>
>>
>
>





[GENERAL] XML validation of whitespace values

2014-03-14 Thread Tim Kane
HI all,

I’ve had an interesting problem trying to perform an UPDATE based on the
results of processing an xpath against a field of type xml.
Specifically, my query would fail with:

> ERROR:  could not parse XML document
> DETAIL:  line 1: Start tag expected, '<' not found


I thought this strange, since the xml type is (I believe) supposed to
guarantee that the value is well formed (whether it's a document or content).

After some amount of digging to locate the rogue entry, it turns out the
field value is actually a single whitespace character, which apparently is
well formed.

Thus I'm left with the following situation:



clone=# select xml_is_well_formed(' ');
 xml_is_well_formed
--------------------
 t
(1 row)


clone=# select xpath_exists ('//test', ' ');
ERROR:  could not parse XML document
DETAIL:  line 1: Start tag expected, '<' not found



clone=# create temp table xml_test (document xml);
CREATE TABLE
clone=# insert into xml_test values (' ');
INSERT 0 1
clone=# select xml_is_well_formed(document::text) from xml_test;
 xml_is_well_formed
--------------------
 t
(1 row)

clone=# select xpath_exists ('test', document) from xml_test;
ERROR:  could not parse XML document
DETAIL:  line 1: Start tag expected, '<' not found



This is confusing, since, if the xml is well formed, it probably shouldn't
cause the parser to error.  And if it isn't well formed, then the
instantiation of the value as an xml type should fail.
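As a stopgap, one could filter out values that are well-formed "content"
but not well-formed *documents* (such as a lone space) before applying the
xpath; a sketch using the test table from above:

```sql
-- Sketch: skip rows whose xml value is mere content (like a single space)
-- rather than a document, so xpath_exists never sees them and cannot fail.
SELECT xpath_exists('//test', document)
  FROM xml_test
 WHERE xml_is_well_formed_document(document::text);
```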





Re: [GENERAL] named queries and the wire protocol

2014-03-14 Thread Tom Lane
David Welton  writes:
> We tracked down the commit that introduced the automatically generated
> prepared statement names:

> https://github.com/epgsql/epgsql/commit/dabf972f74735d2

> The author wrote "Usage of unnamed prepared statement and portals
> leads to unpredictable results in case of concurrent access to same
> connection."

Um ... in general, concurrent use of a single PG connection by multiple
threads will not work, period.  Neither the server nor the wire protocol
definition is designed for that.

Now, you can have multiple portals open and fetch a few rows at a time
from each one, if you're careful to serialize the fetch operations;
but TBH such usage is a niche case.  It's possible this is of use
depending on the details of the API your driver exposes to applications,
but I don't have those details.

If there's not locking in your driver that restricts concurrent access to
the connection, then use of named rather than unnamed statements and
portals isn't going to fix that.

> For my own clarification, going by
> http://www.postgresql.org/docs/devel/static/protocol-overview.html -
> the named statement has no parameters - it's just a parsed statement,
> whereas a portal is a statement subsequently bound to some parameters?

A prepared statement is a static object created by a Parse protocol
command.  The Bind command creates a portal (a query execution state) from
a statement, and must supply values for any parameters the statement has.
(Too lazy to double-check, but I think either a named or an unnamed
statement can have parameters.)  Then you say Execute to run the portal.

The main difference between named and unnamed statements/portals is that
the protocol details are designed to simplify one-shot use of the unnamed
statement and portal, for instance the provision for implicit Close of the
old unnamed statement or portal if you just go and create a new one.  Also
the server is optimized to expect a single use of an unnamed statement vs
multiple uses of named statements.
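At the SQL level, the named-statement flow corresponds roughly to
PREPARE/EXECUTE/DEALLOCATE; a sketch (the statement name and query here are
made up for illustration):

```sql
-- Rough SQL-level analogue of the protocol flow described above.
-- "get_user" is a hypothetical statement name.
PREPARE get_user (int) AS          -- Parse: create a named prepared statement
  SELECT * FROM users WHERE id = $1;
EXECUTE get_user(42);              -- Bind + Execute: portal with $1 = 42
EXECUTE get_user(43);              -- named statements are meant for repeated use
DEALLOCATE get_user;               -- Close: drop the prepared statement
```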

regards, tom lane




Re: [GENERAL] pg_dump fails pg_rewrite entry not found

2014-03-14 Thread Adrian Klaver

On 03/14/2014 06:56 AM, Jakub Can wrote:

Hello, our database suddenly broke somehow. We still don't know whether
it is because of a hardware failure etc.; anyway, when I try to make a
dump using pg_dump or pg_dumpall I get an error message like this:

pg_dump: failed sanity check, parent table OID 1001834 of pg_rewrite
entry OID 1001837 not found
pg_dumpall: pg_dump failed on database "eb3_nz", exiting

I have tried to search for such an error and have found a lot of
suggested solutions, but nothing worked for me, because I did not find
any record in pg_class, pg_rewrite, etc. referencing those OIDs
1001834 or 1001837. The version of postgres is 9.2.2.

Please is there any other place, where to look for the error?


So 'select * from pg_rewrite where oid = 1001837' finds nothing?



Thank you in advance,
Jakub.





--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] pg_dump fails pg_rewrite entry not found

2014-03-14 Thread Jakub Can
Hello, our database suddenly broke somehow. We still don't know whether
it is because of a hardware failure etc.; anyway, when I try to make a
dump using pg_dump or pg_dumpall I get an error message like this:

pg_dump: failed sanity check, parent table OID 1001834 of pg_rewrite
entry OID 1001837 not found
pg_dumpall: pg_dump failed on database "eb3_nz", exiting

I have tried to search for such an error and have found a lot of
suggested solutions, but nothing worked for me, because I did not find
any record in pg_class, pg_rewrite, etc. referencing those OIDs
1001834 or 1001837. The version of postgres is 9.2.2.

Please is there any other place, where to look for the error?

Thank you in advance,
Jakub.




Re: [GENERAL] puzzling perl DBI vs psql problem

2014-03-14 Thread MOLINA BRAVO FELIPE DE JESUS

From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on 
behalf of Susan Cassidy [susan.cass...@decisionsciencescorp.com]
Sent: Thursday, March 13, 2014, 02:48 p.m.
To: Rodrigo Gonzalez
CC: Steve Atkins; pgsql-general@postgresql.org
Subject: Re: [GENERAL] puzzling perl DBI vs psql problem

No, I don't have 2 instances running.  I let the port default on the psql command 
line, and the perl program is using 5432, as normal.

Now, I'm discovering that syslog is no longer logging anything.  I bounced it, 
but to no avail.

Susan


Check the log from the webserver, and check whether the user is the same for 
psql and perl ...

see you





[GENERAL] Re: Is PostgreSQL 9.3 using indexes for pipelined top-N window function queries?

2014-03-14 Thread Thomas Kellerer
Markus,

thanks for your reply.

> The pipelined top-n query has two very important properties: 
> (1) it utilizes the index order to avoid the sort operation required to 
> satisfy ORDER BY clause 
> (2) it realizes that it can stop processing as soon as it has delivered 
> enough rows.
> 
> The execution plan from Thomas Kellerer seems to fulfill requirement
> (1) but definitively not (2).
> 
> Even with 9.3.2, I was not able to reproduce the result of Thomas
> (not showing any sort operation in the execution plan) with the test
> data I also published at my website: 
> http://use-the-index-luke.com/sql/example-schema/postgresql/partial-results
>
> Then I started fiddling around with the planner's cost settings and
> finally managed to get a plan similar to Thomas' when setting
> random_page_cost to 0.1 (setting it to 1, equal to seq_page_cost was
> not enough). However, that proves the point that PostgreSQL can use
> an index to avoid the sort operation caused by order by (even for
> window functions). I'd be curious what settings caused Thomas to get
> this result.

Good point, I should get into the habit of mentioning config settings for this 
kind of thing:

shared_buffers = 2048MB
temp_buffers = 16MB
work_mem = 32MB
seq_page_cost = 1.0
random_page_cost = 1.5
cpu_tuple_cost = 0.001
cpu_index_tuple_cost = 0.001
effective_cache_size = 2048MB

> The second requirement for pipelined top-n queries is not satisfied
> in Thomas' execution plan: it does read the full index (actual
> rows=100), and applies the window function over all rows. Only in
> the end it throws away all non-confirming rows (Rows Removed by
> Filter: 999899). A pipelined top-n execution would not cause more
> than 300 rows read from the index, and only 200 rows removed by
> filter. That's what the Oracle DB and SQL Server manage to do
> (Thomas: ping me to track down why Oracle didn't for you).
> Considering that PG can use the index order to avoid the sort, it
> still doesn't make very much sense if it cannot abort the index scan
> after fetching enough rows. So, not using the index might even be
> right choice in unfiltered cases like this.

I probably should have read the definition of "index usage" more carefully, 
thanks for the clarification.


I created the testdata from your webpage locally and then ran the window 
function statement from
http://use-the-index-luke.com/sql/example-schema/postgresql/partial-results

SELECT *
  FROM ( SELECT sales.*,
                ROW_NUMBER() OVER (ORDER BY sale_date DESC,
                                            sale_id   DESC) rn
           FROM sales
       ) tmp
 WHERE rn BETWEEN 11 AND 20
 ORDER BY sale_date DESC, sale_id DESC;

This does an "Index Scan Backward"  on an index defined as (sale_date, sale_id) 
but takes nearly 2.5 seconds on my computer. 

The version using OFFSET .. LIMIT took about 0.05 seconds, and its runtime 
increases as the offset increases, which is to be expected, whereas the window 
function version is pretty much constant even with a startpoint of 11; still, 
the offset version is *much* faster.
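For comparison, the keyset ("row-value") pagination that Markus recommends on
the linked page would look roughly like this against the sales test table;
the two bind values are assumptions standing in for the sale_date and sale_id
of the last row of the previous page:

```sql
-- Sketch of keyset pagination on the sales table from the test data.
-- $1/$2 are the sale_date and sale_id of the last row already shown.
SELECT *
  FROM sales
 WHERE (sale_date, sale_id) < ($1, $2)
 ORDER BY sale_date DESC, sale_id DESC
 LIMIT 10;
```

Because the WHERE clause matches the index order (sale_date, sale_id), the
scan can start at the right place and stop after 10 rows, with no row
numbering over the whole index.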

Regards
Thomas
 
P.S.: Btw: thanks for your book and your webpage, both are very inspiring.





Re: [GENERAL] Is PostgreSQL 9.3 using indexes for pipelined top-N window function queries?

2014-03-14 Thread Pavel Stehule
2014-03-14 13:02 GMT+01:00 Markus Winand :

> Hi!
>
> I'd like to clarify this as the original author of the page in question.
>
> First of all, I also recommend the row-value syntax as you can see on the
> "previous" page:
> http://use-the-index-luke.com/sql/partial-results/fetch-next-page
>
> I've also explained this procedure at conferences. Here are the slides:
>
> http://use-the-index-luke.com/blog/2013-07/pagination-done-the-postgresql-way
>
>
> And now about this quote:
> > However, only SQL Server and the Oracle database can use them for a
> pipelined top-N query. PostgreSQL does not use indexes for those queries
> and therefore executes them very inefficiently.
>
> The very important thing here is "pipelined top-N query". This term is
> introduced two pages earlier:
> http://use-the-index-luke.com/sql/partial-results/top-n-queries
>
> The pipelined top-n query has two very important properties:
> (1) it utilizes the index order to avoid the sort operation required to
> satisfy ORDER BY clause
> (2) it realizes that it can stop processing as soon as it has delivered
> enough rows.
>
> The execution plan from Thomas Kellerer seems to fulfill requirement (1)
> but definitively not (2).
>
> Even with 9.3.2, I was not able to reproduce the result of Thomas (not
> showing any sort operation in the execution plan) with the test data I also
> published at my website:
>
> http://use-the-index-luke.com/sql/example-schema/postgresql/partial-results
>
> Then I started fiddling around with the planner's cost settings and
> finally managed to get a plan similar to Thomas' when setting
> random_page_cost to 0.1 (setting it to 1, equal to seq_page_cost was not
> enough). However, that proves the point that PostgreSQL can use an index to
> avoid the sort operation caused by order by (even for window functions).
> I'd be curious what settings caused Thomas to get this result.
>
> The second requirement for pipelined top-n queries is not satisfied in
> Thomas' execution plan: it does read the full index (actual rows=100),
> and applies the window function over all rows. Only in the end it throws
> away all non-confirming rows (Rows Removed by Filter: 999899). A pipelined
> top-n execution would not cause more than 300 rows read from the index, and
> only 200 rows removed by filter. That's what the Oracle DB and SQL Server
> manage to do (Thomas: ping me to track down why Oracle didn't for you).
> Considering that PG can use the index order to avoid the sort, it still
> doesn't make very much sense if it cannot abort the index scan after
> fetching enough rows. So, not using the index might even be right choice in
> unfiltered cases like this.
>
> Interestingly, I did not even get an Index Only Scan when just selecting
> column from the index with random_page_cost=1 and seq_page_cost=1. Again I
> had to reduce random_page_cost further down (0.1) to get an Index Only
> Scan. That does not make sense to me. When both _page_costs are same, the
> Index Only Scan should get lower costs because the index is smaller (338mb
> vs. 31 mb in my case). On top of that, the Index Only Scan avoids the sort.
> However, with 9.3.2 I get the same cost for Index Only Scan as for Index
> Scan (had to enable_seqscan=off and enable_bitmapscan=off to get that).
>
> So, I have to change my page  (&book) to say something like this:
>
> > PostgreSQL does not abort the index scan after fetching enough rows for
> those queries and therefore executes them very inefficiently.
>

I think the LIMIT is not propagated down, so the window function
has to be calculated over the full range.

Regards

Pavel


>
>
> Thanks for the hint, and always feel free to put me on CC regarding
> questions about stuff on Use The Index, Luke!
>
> -markus
>
> ps.: It's perfectly possible that PG could use indexes for
> window-functions before 9.3. I did definitively not fiddle around with cost
> settings at that time to force it into this plan.
> pps.: sorry for the delay, I'm not subscribed (just too much) but somebody
> was nice enough to ping me about this.
> ppps: then I wondered how to properly reply without having the original
> messages. So I downloaded the .mbox from the archive and pushed reply
> there. Hope it ends up in the right thread :)
>
> Markus Winand
> markus.win...@winand.at
> T +43 1 9444047
>
> > "A wonderful book…I highly recommend it." -Anders Janmyr
> > http://sql-performance-explained.com/
>
> Maderspergerstr. 1-3/9/11
> 1160 Wien
> AUSTRIA
>
>
>


Re: [GENERAL] Is PostgreSQL 9.3 using indexes for pipelined top-N window function queries?

2014-03-14 Thread Markus Winand
Hi!

I'd like to clarify this as the original author of the page in question.

First of all, I also recommend the row-value syntax as you can see on the 
"previous" page:
http://use-the-index-luke.com/sql/partial-results/fetch-next-page

I've also explained this procedure at conferences. Here are the slides:
http://use-the-index-luke.com/blog/2013-07/pagination-done-the-postgresql-way


And now about this quote:
> However, only SQL Server and the Oracle database can use them for a pipelined 
> top-N query. PostgreSQL does not use indexes for those queries and therefore 
> executes them very inefficiently.

The very important thing here is "pipelined top-N query". This term is 
introduced two pages earlier:
http://use-the-index-luke.com/sql/partial-results/top-n-queries

The pipelined top-n query has two very important properties:
(1) it utilizes the index order to avoid the sort operation required to satisfy 
ORDER BY clause
(2) it realizes that it can stop processing as soon as it has delivered enough 
rows.

The execution plan from Thomas Kellerer seems to fulfill requirement (1) but 
definitively not (2).

Even with 9.3.2, I was not able to reproduce the result of Thomas (not showing 
any sort operation in the execution plan) with the test data I also published 
at my website:
   http://use-the-index-luke.com/sql/example-schema/postgresql/partial-results

Then I started fiddling around with the planner's cost settings and finally 
managed to get a plan similar to Thomas' when setting random_page_cost to 0.1 
(setting it to 1, equal to seq_page_cost was not enough). However, that proves 
the point that PostgreSQL can use an index to avoid the sort operation caused 
by order by (even for window functions). I'd be curious what settings caused 
Thomas to get this result.

The second requirement for pipelined top-n queries is not satisfied in Thomas' 
execution plan: it does read the full index (actual rows=100), and applies 
the window function over all rows. Only in the end it throws away all 
non-confirming rows (Rows Removed by Filter: 999899). A pipelined top-n 
execution would not cause more than 300 rows read from the index, and only 200 
rows removed by filter. That's what the Oracle DB and SQL Server manage to do 
(Thomas: ping me to track down why Oracle didn't for you). Considering that PG 
can use the index order to avoid the sort, it still doesn't make very much 
sense if it cannot abort the index scan after fetching enough rows. So, not 
using the index might even be right choice in unfiltered cases like this.

Interestingly, I did not even get an Index Only Scan when just selecting column 
from the index with random_page_cost=1 and seq_page_cost=1. Again I had to 
reduce random_page_cost further down (0.1) to get an Index Only Scan. That does 
not make sense to me. When both _page_costs are same, the Index Only Scan 
should get lower costs because the index is smaller (338mb vs. 31 mb in my 
case). On top of that, the Index Only Scan avoids the sort. However, with 9.3.2 
I get the same cost for Index Only Scan as for Index Scan (had to 
enable_seqscan=off and enable_bitmapscan=off to get that).

So, I have to change my page  (&book) to say something like this:

> PostgreSQL does not abort the index scan after fetching enough rows for those 
> queries and therefore executes them very inefficiently.


Thanks for the hint, and always feel free to put me on CC regarding questions 
about stuff on Use The Index, Luke!

-markus

ps.: It's perfectly possible that PG could use indexes for window-functions 
before 9.3. I did definitively not fiddle around with cost settings at that 
time to force it into this plan.
pps.: sorry for the delay, I'm not subscribed (just too much) but somebody was 
nice enough to ping me about this.
ppps: then I wondered how to properly reply without having the original 
messages. So I downloaded the .mbox from the archive and pushed reply there. 
Hope it ends up in the right thread :)

Markus Winand
markus.win...@winand.at
T +43 1 9444047

> "A wonderful book…I highly recommend it." -Anders Janmyr
> http://sql-performance-explained.com/

Maderspergerstr. 1-3/9/11
1160 Wien
AUSTRIA




Re: [GENERAL] High Level Committers Wanted

2014-03-14 Thread Antman, Jason (CMG-Atlanta)
I'm not a "high level committer", nor am I even a regular poster to this 
list.

not saying this post is true, but... If I'm reading between the lines 
correctly, this could make for quite a striking headline in the tech 
news - "NSA dumps Oracle proprietary database in favor of PostgreSQL."

I'd venture a guess that, whether he's legitimate or not, "bobJobS" 
mentioned "the Columbia area" referring to Columbia, MD, USA. The two 
biggest employers in the area that I can think of are the NSA and a 
bunch of DoD facilities... once again, not saying this is legitimate, 
but they are the types of operations that go for cloak-and-dagger "we 
have something interesting to talk to you about, why don't you come down 
and meet some of our people in a hotel and sign reams of NDAs before we 
tell you who we are" offers.

On 03/12/2014 01:13 PM, bobJobS wrote:
> Understood and I agree.
>
> I will check to see how much contact detail I can divulge in an open forum.
>
>
>
> --
> View this message in context: 
> http://postgresql.1045698.n5.nabble.com/High-Level-Committers-Wanted-tp5795695p5795729.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>


-- 

Jason Antman | Systems Engineer | CMGdigital
jason.ant...@coxinc.com | p: 678-645-4155




Re: [GENERAL] named queries and the wire protocol

2014-03-14 Thread David Welton
Hi,

On Thu, Mar 13, 2014 at 1:51 AM, Tom Lane  wrote:
> David Welton  writes:
>>> send(State, ?BIND, ["", 0, StatementName, 0, Bin1, Bin2]),
>>> send(State, ?EXECUTE, ["", 0, <<0:?int32>>]),
>>> send(State, ?CLOSE, [?PREPARED_STATEMENT, StatementName, 0]),
>>> send(State, ?SYNC, []),
>
>> And then the code above.  So it's generating a name itself and then
>> destroying it once the query is done.
>> Perhaps this behavior is not a good idea and using the unnamed portal
>> would be a better idea?

> My point is that it *is* using the unnamed portal, AFAICS --- the ""s
> in the Bind and Execute commands appear to correspond to the empty
> strings that would select that portal.

Ok, yes, that makes sense.

> The Close on the other hand is specifying closing a prepared statement,
> not a portal.  If you're right about the control flow around this
> function, then the code is generating a prepared statement, using it
> once, and destroying it.  Which is dumb; you should instead use the
> unnamed-statement protocol flow, which is better optimized for that
> usage pattern.

We tracked down the commit that introduced the automatically generated
prepared statement names:

https://github.com/epgsql/epgsql/commit/dabf972f74735d2

The author wrote "Usage of unnamed prepared statement and portals
leads to unpredictable results in case of concurrent access to same
connection."

For my own clarification, going by
http://www.postgresql.org/docs/devel/static/protocol-overview.html -
the named statement has no parameters - it's just a parsed statement,
whereas a portal is a statement subsequently bound to some parameters?

Can you or someone speak to the concurrency issues?

A big thanks for taking the time to go over this with me,
-- 
David N. Welton

http://www.dedasys.com/

