Re: [PERFORM] Correct use of cursors for very large result sets in Postgres

2017-02-22 Thread Vitalii Tymchyshyn
For JDBC there are certain prerequisites for setFetchSize to work, e.g.
using forward-only result sets and running inside a transaction (autocommit off).

Tue, 21 Feb 2017 at 09:06, John Gorman wrote:

> My experience with cursors in PostgreSQL with Java has been to stay away
> from them. We support 2 databases with our product, PostgreSQL (default)
> and SQL Server. While re-encrypting data in a database the application used
> cursors with a fetch size of 1000.
>
>
>
> Worked perfectly on SQL Server and on PostgreSQL until we got to a
> PostgreSQL table with more than 11 million rows. After spending weeks
> trying to figure out what was happening, I realized that when it gets to a
> table with more than 10 million rows for some reason, the cursor
> functionality just silently stopped working and it was reading the entire
> table. I asked another very senior architect to look at it and he came to
> the same conclusion. Because of limited time, I ended up working around it
> using limit/offset.
>
>
>
> Again we are using Java, so the problem could just be in the PostgreSQL
> JDBC driver. Also we were on 9.1 at the time.
>
>
>
> Regards
>
> John
>
>
>
> *From:* pgsql-performance-ow...@postgresql.org [mailto:
> pgsql-performance-ow...@postgresql.org] *On Behalf Of *Mike Beaton
> *Sent:* Tuesday, February 21, 2017 6:49 AM
> *To:* pgsql-performance@postgresql.org
> *Subject:* Re: [PERFORM] Correct use of cursors for very large result
> sets in Postgres
>
>
>
> Thanks, Tom.
>
> Wouldn't this mean that cursors are noticeably non-optimal even for normal
> data sizes, since the entire data to be streamed from the table is always
> duplicated into another buffer and then streamed?
>
>
>
> > if you want the whole query result at once, why are you bothering with
> a cursor?
>
>
>
> The PostgreSQL docs (
> https://www.postgresql.org/docs/9.6/static/plpgsql-cursors.html#AEN66551) 
> clearly
> recommend cursors as a way to return a reference to a large result set from
> a function (as I understood, this is recommended precisely as a way to
> avoid tuple-based buffering of the data).
>
>
>
> So following that advice, it's not unreasonable that I would actually have
> a cursor to a large dataset.
>
>
>
> Then, I would ideally want to be able to fetch the data from that cursor
> without the entire data getting duplicated (even if only a bit at a time
> instead of all at once, which seems to be the best case behaviour) as I go.
>
>
>
> Additionally, I thought that if I had a streaming use-case (which I do),
> and a streaming data-access layer (which I do), then since `SELECT * FROM
> large` is absolutely fine, end-to-end, in that situation, then by symmetry
> and the principle of least astonishment `FETCH ALL FROM cursor` might be
> fine too.
>
>
>


Re: [PERFORM] pgsql connection timeone

2017-02-04 Thread Vitalii Tymchyshyn
Well, you can try switching to urandom; see the second answer at
http://stackoverflow.com/questions/137212/how-to-solve-performance-problem-with-java-securerandom.

Sat, 4 Feb 2017 at 11:45, Vucomir Ianculov <vuko...@os-ux.com> wrote:

> Hi Vitalii,
>
> no, how can I check it? I searched but did not find any useful information.
>
> Thanks,
>
> Br,
> Vuko
> ----------
> *From: *"Vitalii Tymchyshyn" <v...@tym.im>
> *To: *"Vucomir Ianculov" <vuko...@os-ux.com>, "Tom Lane" <
> t...@sss.pgh.pa.us>
> *Cc: *pgsql-performance@postgresql.org
> *Sent: *Wednesday, February 1, 2017 7:11:12 PM
>
> *Subject: *Re: [PERFORM] pgsql connection timeone
>
> Just a wild guess, but did you check your random source? We had similar
> problems in Oracle and had to switch to /dev/urandom. It can be done with a
> system variable setting.
>
> On Wed, Feb 1, 2017, 7:52 AM Vucomir Ianculov <vuko...@os-ux.com> wrote:
>
> can anyone help me with my problem?
> I really don't know where the problem could be coming from.
>
>
>
> --
> *From: *"Vucomir Ianculov" <vuko...@os-ux.com>
> *To: *"Tom Lane" <t...@sss.pgh.pa.us>
> *Cc: *pgsql-performance@postgresql.org
> *Sent: *Saturday, January 28, 2017 12:03:55 PM
>
> *Subject: *Re: [PERFORM] pgsql connection timeone
>
> Hi Tom,
>
> this is the entry from pg_hba.conf
> host all all 0.0.0.0/0   md5
>
> I needed to restart the Postgres service for it to accept new connections,
> which is strange because there was no load on the server and it had a lot
> of free RAM.
>
>
>
>
> --
> *From: *"Tom Lane" <t...@sss.pgh.pa.us>
> *To: *"Vucomir Ianculov" <vuko...@os-ux.com>
> *Cc: *pgsql-performance@postgresql.org
> *Sent: *Wednesday, January 25, 2017 3:15:28 PM
> *Subject: *Re: [PERFORM] pgsql connection timeone
>
> Vucomir Ianculov <vuko...@os-ux.com> writes:
> > i'm seeing a lot of connection timeouts in the postgresql log
>
> > 2017-01-25 11:09:47 EET [6897-1] XXX@YYY FATAL: canceling
> authentication due to timeout
> > 2017-01-25 11:10:15 EET [6901-1] XXX@YYY FATAL: canceling
> authentication due to timeout
> > 2017-01-25 11:10:17 EET [6902-1] xxx@YYY FATAL: canceling
> authentication due to timeout
>
> So ... what authentication method are you using?
>
> regards, tom lane
>
>


Re: [PERFORM] pgsql connection timeone

2017-02-01 Thread Vitalii Tymchyshyn
Just a wild guess, but did you check your random source? We had similar
problems in Oracle and had to switch to /dev/urandom. It can be done with a
system variable setting.

On Wed, Feb 1, 2017, 7:52 AM Vucomir Ianculov  wrote:

> can anyone help me with my problem?
> I really don't know where the problem could be coming from.
>
>
>
> --
> *From: *"Vucomir Ianculov" 
> *To: *"Tom Lane" 
> *Cc: *pgsql-performance@postgresql.org
> *Sent: *Saturday, January 28, 2017 12:03:55 PM
>
> *Subject: *Re: [PERFORM] pgsql connection timeone
>
> Hi Tom,
>
> this is the entry from pg_hba.conf
> host all all 0.0.0.0/0   md5
>
> I needed to restart the Postgres service for it to accept new connections,
> which is strange because there was no load on the server and it had a lot
> of free RAM.
>
>
>
>
> --
> *From: *"Tom Lane" 
> *To: *"Vucomir Ianculov" 
> *Cc: *pgsql-performance@postgresql.org
> *Sent: *Wednesday, January 25, 2017 3:15:28 PM
> *Subject: *Re: [PERFORM] pgsql connection timeone
>
> Vucomir Ianculov  writes:
> i'm seeing a lot of connection timeouts in the postgresql log
>
> > 2017-01-25 11:09:47 EET [6897-1] XXX@YYY FATAL: canceling
> authentication due to timeout
> > 2017-01-25 11:10:15 EET [6901-1] XXX@YYY FATAL: canceling
> authentication due to timeout
> > 2017-01-25 11:10:17 EET [6902-1] xxx@YYY FATAL: canceling
> authentication due to timeout
>
> So ... what authentication method are you using?
>
> regards, tom lane
>
>


Re: [PERFORM] Optimization inner join

2017-01-19 Thread Vitalii Tymchyshyn
Hi.

In SQL "null == any value" resolves to false, so optimizer can safely skip
nulls from either side if any for the inner join.
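
A tiny self-contained illustration of that behaviour (toy tables, invented names):

create temp table a (a int);
create temp table b (b int);
insert into a values (1), (null);
insert into b values (1), (null);
-- only the (1, 1) pair comes back; the NULL keys never satisfy a.a = b.b
select * from a join b on a.a = b.b;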

Best regards, Vitalii Tymchyshyn

NULL is still a value that may be paired with a NULL in a.a
>
> The only optimization I could see is if the a.a column has NOT NULL
> defined while b.b does not have NOT NULL defined.
>
> Not sure if it is all that common. Curious what if you put b.b IS NOT NULL
> in the WHERE statement?
>
> -
> Phillip Couto
>
>
>
>
>


Re: [PERFORM] Millions of tables

2016-09-28 Thread Vitalii Tymchyshyn
Have you considered having many databases (e.g. 100) and possibly many
PostgreSQL servers (e.g. 10) started on different ports?
This would give you 1000x fewer tables per DB.


Re: [PERFORM] PostgreSQL seems to create inefficient plans in simple conditional joins

2016-01-30 Thread Vitalii Tymchyshyn
Well, as far as I can see it was just a few phrases, unless I missed something.
Maybe it's worth bringing it to -hackers for a wider discussion?

Best regards, Vitalii Tymchyshyn

Sat, 30 Jan 2016 12:31, David Rowley <david.row...@2ndquadrant.com> wrote:

> On 31 January 2016 at 06:14, Vitalii Tymchyshyn <v...@tym.im> wrote:
> > It may be more for -hackers, but I often hear "this won't be used because of
> > the planning time increase". As far as I know, we now have statistics on real
> > query time after a few runs that are used to decide whether the plan should
> > be switched. Can these statistics be used to apply advanced planning features
> > to relatively long-running queries? E.g. a parameter like
> > sophisticated_planning_l1_threshold=500ms: if a query runs over this
> > threshold, replan it with more sophisticated features taking a few more
> > millis. Possibly different levels can be introduced. Also allow setting the
> > threshold to 0, saying "apply to all queries right away".
> > Another good option is to threshold against cumulative query time. E.g. if
> > there were 1 runs of 0.5 millis each, it may be beneficial to spend a few
> > millis to get 0.2 millis each.
>
> I agree with you. I recently was working with long running queries on
> a large 3TB database. I discovered a new optimisation was possible,
> and wrote a patch to implement. On testing the extra work which the
> optimiser performed took 7 microseconds, and this saved 6 hours of
> execution time. Now, I've never been much of an investor in my life,
> but a 3 billion times return on an investment seems quite favourable.
> Of course, that's quite an extreme case, but it's hard to ignore the
> benefit is still significant in less extreme cases.
>
> The idea you've mentioned here is very similar to what I bought up at
> the developer meeting a few days ago, see AOB section in [1]
>
> Unfortunately I didn't really get many of the correct people on my
> side with it, and some wanted examples of specific patches, which is
> completely not what I wanted to talk about. I was more aiming for some
> agreement for generic infrastructure to do exactly as you describe.
>
> [1]  https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2016_Developer_Meeting
>
>
> --
>  David Rowley   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] PostgreSQL seems to create inefficient plans in simple conditional joins

2016-01-30 Thread Vitalii Tymchyshyn
It may be more for -hackers, but I often hear "this won't be used because of
the planning time increase". As far as I know, we now have statistics on real
query time after a few runs that are used to decide whether the plan should be
switched. Can these statistics be used to apply advanced planning features to
relatively long-running queries? E.g. a parameter like
sophisticated_planning_l1_threshold=500ms: if a query runs over this
threshold, replan it with more sophisticated features taking a few more
millis. Possibly different levels can be introduced. Also allow setting the
threshold to 0, saying "apply to all queries right away".
Another good option is to threshold against cumulative query time. E.g. if
there were 1 runs of 0.5 millis each, it may be beneficial to spend a few
millis to get 0.2 millis each.

Best regards, Vitalii Tymchyshyn

Sat, 30 Jan 2016 10:57, David Rowley <david.row...@2ndquadrant.com> wrote:

> On 31 January 2016 at 01:30, Hedayat Vatankhah <hedayat@gmail.com>
> wrote:
> > Personally, I expect both queries below to perform exactly the same:
> >
> > SELECT
> > t1.id, *
> > FROM
> > t1
> > INNER JOIN
> > t2 ON t1.id = t2.id
> > where t1.id > -9223372036513411363;
> >
> > And:
> >
> > SELECT
> > t1.id, *
> > FROM
> > t1
> > INNER JOIN
> > t2 ON t1.id = t2.id
> > where t1.id > -9223372036513411363 and t2.id > -9223372036513411363;
> >
> > Unfortunately, they do not. PostgreSQL creates different plans for these
> > queries, which results in very poor performance for the first one
> compared
> > to the second (What I'm testing against is a DB with around 350 million
> > rows in t1, and slightly less in t2).
> >
> > EXPLAIN output:
> > First query: http://explain.depesz.com/s/uauk
> > Second query: link: http://explain.depesz.com/s/uQd
>
> Yes, unfortunately you've done about the only thing that you can do,
> and that's just include both conditions in the query. Is there some
> special reason why you can't just write the t2.id > ... condition in
> the query too? or is the query generated dynamically by some software
> that you have no control over?
>
> I'd personally quite like to see improvements in this area, and even
> wrote a patch [1] which fixes this problem too. The problem I had when
> proposing the fix for this was that I was unable to report details
> about how many people are hit by this planner limitation. The patch I
> proposed caused a very small impact on planning time for many queries,
> and was thought by many not to apply in enough cases for it to be
> worth slowing down queries which cannot possibly benefit. Of course I
> agree with this, I've no interest in slowing down planning on queries,
> but at the same time understand the annoying poor optimisation in this
> area.
>
> Although please remember the patch I proposed was merely a first draft
> proposal. Not for production use.
>
> [1]
> http://www.postgresql.org/message-id/flat/cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com#cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com
>
> --
>  David Rowley   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] High Planning Time

2016-01-22 Thread Vitalii Tymchyshyn
It can be caused by a catalog stats file leak. I hit it on some older
PostgreSQL version when tests were rolling back schema changes often. The
file grew to an enormous size, and as soon as it fell out of cache, accessing
the catalog became very slow. I just had to delete the file.

Best regards, Vitalii Tymchyshyn

Thu, 21 Jan 2016 19:30, Phil S <pjsand...@gmail.com> wrote:

> I am running Postgresql on a Windows Server 2008 server. I have noticed
> that queries have very high planning times now and then. Planning times go
> down for the same query immediately after the query runs the first time,
> but then go up again after if the query is not re-run for 5 minutes or so.
>
> I am not able to find any specific information in the documentation that
> would explain the issue or explains how to address it, so am asking for
> advice here.
>
> Here is an example.
>
> explain analyze
> select * from message
> limit 1
>
> "Limit  (cost=0.00..0.44 rows=1 width=1517) (actual time=0.009..0.009
> rows=1 loops=1)"
> "  ->  Seq Scan on message  (cost=0.00..28205.48 rows=64448 width=1517)
> (actual time=0.007..0.007 rows=1 loops=1)"
> "Planning time: 3667.361 ms"
> "Execution time: 1.652 ms"
>
> As you can see the query is simple and does not justify 3 seconds of
> planning time. It would appear that there is an issue with my configuration
> but I am not able to find anything that looks out of sorts in the query
> planning configuration variables. Any advice about what I should be looking
> for to fix this would be appreciated.
>
>


Re: [PERFORM] Partition Constraint Exclusion Limits

2015-10-27 Thread Vitalii Tymchyshyn
BTW: maybe it would be feasible in the future to perform partition exclusion
during execution? That would be a very neat feature.

Regards, Vitalii Tymchyshyn

Tue, 27 Oct 2015 15:03, David G. Johnston <david.g.johns...@gmail.com> wrote:

> On Tue, Oct 27, 2015 at 2:29 PM, GMail <mfwil...@gmail.com> wrote:
>
>> I have partitioned a large table in my PG database (6.7 billion rows!) by
>> a date column and in general constraint exclusion works well but only in
>> relatively simple case when the partition key is specified exactly as
>> created in the CHECK constraint.  I'm curious if there is a way to get it
>> to work a little more generally though.
>>
>> For example my CHECK constraint (see code below) specifying a hard-coded
>> field value works well (#1 and #2).  Specifying a function that returns a
>> value even though it is the appropriate type scans all of the partitions
>> (#3) unfortunately.  Likewise any join, CTE, or sub-query expression, even
>> for a single row that returns the correct type also results in a scan of
>> all of the partitions.
>>
>> I was curious if there was a way specifically to get #3 to work as the
>> WHERE predicate in this case is stored as an integer but the table itself
>> is partitioned by the appropriate date type.  I believe I could work around
>> this issue with dynamic sql in a function but there are lots of cases of
>> this type of simple conversion and I wanted to avoid the maintenance of
>> creating a function per query.
>>
>
> ​Short answer, no.
>
> The planner has the responsibility for performing constraint exclusion and
> it only has access to constants during its evaluation.  It has no clue what
> kind of transformations a function might do.  Various other optimizations
> are indeed possible but are not presently performed.
>
> ​So, #3 (
> to_date(201406::text||01::text, 'MMDD');
> ​) ​
> is down-right impossible given the present architecture
> ​; and likely any future architecture.
>
> With #4 (
> explain analyze select count(1) from ptest.tbl where dt = (select
> '2014-06-01'::date);
> ​) ​
> in theory the re-write module could recognize and re-write this remove the
> sub-select.
> ​  But likely real-life is not so simple otherwise the query writer likely
> would have simply done is directly themself.
>
> ​
> ​
> ​
> ​In a partitioning scheme the partitioning data has to be injected into
> the query explicitly so that it is already in place before the planner
> receives the query.  Anything within the query requiring "execution" is
> handled by the executor and at that point the chance to exclude partitions
> has come and gone.
>
> David J.
>


Re: [PERFORM] New server: SSD/RAID recommendations?

2015-07-07 Thread Vitalii Tymchyshyn
Hi.

How would a BBU cache help you if the drive lies about fsync? I suppose any RAID
controller removes data from the BBU cache after it has been fsynced by the drive.
As far as I know, there is no other magic command for the drive to tell the
controller that the data is now safe and can be removed from the BBU cache.

Tue, 7 Jul 2015 11:59, Graeme B. Bell graeme.b...@nibio.no wrote:


 Yikes. I would not be able to sleep tonight if it were not for the BBU
 cache in front of these disks...

 diskchecker.pl consistently reported several examples of corruption
 post-power-loss (usually 10 - 30 ) on unprotected M500s/M550s, so I think
 it's pretty much open to debate what types of madness and corruption you'll
 find if you look close enough.

 G


 On 07 Jul 2015, at 16:59, Heikki Linnakangas hlinn...@iki.fi wrote:

 
  So it lies about fsync()... The next question is, does it nevertheless
 enforce the correct ordering of persisting fsync'd data? If you write to
 file A and fsync it, then write to another file B and fsync it too, is it
 guaranteed that if B is persisted, A is as well? Because if it isn't, you
 can end up with filesystem (or database) corruption anyway.
 
  - Heikki



 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] union all and filter / index scan - seq scan

2015-05-21 Thread Vitalii Tymchyshyn
It looks pretty much like partitioning. You should check partitioning
recipes.
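
A rough sketch of what that recipe could look like with 9.1-era inheritance
partitioning (table and column names are invented stand-ins for the quoted
big_table/small_table, so treat this as an illustration only):

create table t_all (a int, b text);
create table t_big   () inherits (t_all);
create table t_small () inherits (t_all);
create index t_big_a_idx   on t_big (a);
create index t_small_a_idx on t_small (a);

-- query the parent instead of a UNION ALL view; whether the IN (...) filter
-- then reaches the per-child index scans still depends on the planner
-- version, so check with EXPLAIN
select * from t_all where a in ( select 42 );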

Thu, 21 May 2015 06:41, Florian Lohoff f...@zz.de wrote:


 Hi,
 i stumbled over something i cant seem to find a workaround. I create a
 view like

 create view v_test as
 select  a,b
 frombig_table
 union all
 select  a,b
 fromsmall_table;

 When i now use the view like

 select * from v_test where a = 42;

 I can see an index scan happening on big_table. When i issue
 something like

 select * from v_test where a in ( select 42 );

 or joining to another table i see that there will be seq scan on big
 table. First the union will be executed and later the filter e.g. a in (
 select 42 ) will be done on the huge result. My use case is that
 big_table is 70mio entries growing fast and small_table is like 4
 entries, growing little.  The filter e.g. a in ( select 42 ) will
 typically select 50-1000 entries of the 70mio. So i now create a union
 with 70mio + 4 entries to then filter all with a = 42.

 It seems the planner is not able to rewrite a union all e.g. the above
 statement could be rewritten from:

 select  *
 from(
 select  a,b
 frombig_table
 union all
 select  a,b
 fromsmall_table;
 ) foo
 where   a in ( select 42 );

 to

 select  *
 from(
 select  a,b
 frombig_table
 where a in ( select 42 )
 union all
 select  a,b
 fromsmall_table
 where a in ( select 42 )
 ) foo

 which would then use an index scan not a seq scan and execution times
 would be acceptable.

 I have now tried to wrap my head around the problem for 2 days and i am
 unable to find a workaround to using a union but the filter optimisation
 is impossible with a view construct.

 Flo
 PS: Postgres 9.1 - I tried 9.4 on Debian/jessie with IIRC same results.
 --
 Florian Lohoff f...@zz.de
  We need to self-defense - GnuPG/PGP enable your email today!



Re: [PERFORM] PostgreSQL disk fragmentation causes performance problems on Windows

2015-05-21 Thread Vitalii Tymchyshyn
It may be even easier. AFAIR, it's possible to just tell the OS the expected
allocation without actually performing it. This way nothing changes for the
general code path; you only need to specify the expected file size on creation.

Please see FILE_ALLOCATION_INFO:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364214(v=vs.85).aspx

Thu, 21 May 2015 16:39, Andres Freund and...@anarazel.de wrote:

 On 2015-05-21 11:54:40 -0700, Josh Berkus wrote:
  This has been talked about as a feature, but would require major work on
  PostgreSQL to make it possible.  You'd be looking at several months of
  effort by a really good hacker, and then a whole bunch of performance
  testing.  If you have the budget for this, then please let's talk about
  it because right now nobody is working on it.

 I think this is overestimating the required effort quite a bit. While
 not trivial, it's also not that complex to make this work.

 Greetings,

 Andres Freund


 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Performance issues

2015-03-18 Thread Vitalii Tymchyshyn
You can set it for the DB user or use a stored procedure.
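
A hedged sketch of both options (the role name and function are invented; note
the parameter is spelled enable_nestloop, and the table comes from the plan
quoted further down the thread):

alter role jasper_app set enable_nestloop = off;   -- every session of that role

create or replace function report_3_rowcount() returns bigint
language sql
set enable_nestloop = off                          -- in effect only inside this function
as $$
    select count(*) from s_f_touchpoint_execution_status_history;
$$;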

Best regards, Vitalii Tymchyshyn

Wed, 18 Mar 2015 14:48, Vivekanand Joshi vjo...@zetainteractive.com wrote:

 The issue here is that the queries are running inside a Jasper Reports. So
 we cannot set this only for a one single query.

 We are accessing our reports from a web-browser, which in turn runs the
 report from Application Server (Jasper). This server connects to
 PostgreSQL server.

 Inside a JRXML(Jasper report file) file we cannot set this parameter.

 I am attaching a JRXML file for a feel.  You can open this file in
 notepad. I don't think we can set server level property in this file. So
 how about a workaround?

 Vivek



 -Original Message-
 From: Jerry Sievers [mailto:gsiever...@comcast.net]
 Sent: Thursday, March 19, 2015 12:06 AM
 To: vjo...@zetainteractive.com
 Cc: Tomas Vondra; pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Performance issues

 Vivekanand Joshi vjo...@zetainteractive.com writes:

  So, here is the first taste of success and which gives me the
  confidence that if properly worked out with a good hardware and proper
  tuning, PostgreSQL could be a good replacement.
 
  Out of the 9 reports which needs to be migrated in PostgreSQL, 3 are
  now running.
 
  Report 4 was giving an issue and I will see it tomorrow.
 
  Just to inform you guys that, the thing that helped most is setting
  enable_nestloops to false worked. Plans are now not miscalculated.
 
  But this is not a production-suitable setting. So what do you think
  how to get a work around this?

 Consider just disabling that setting for 1 or a few odd queries you have
 for which they are known  to plan badly.

 begin;
 set local enable_nestloops to false;
 select ...;
 commit/abort;

 I'd say never make that sort of setting DB or cluster-wide.


 
 
  Regards,
  Vivek
 
  -Original Message-
  From: pgsql-performance-ow...@postgresql.org
  [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Tomas
  Vondra
  Sent: Tuesday, March 17, 2015 9:00 PM
  To: pgsql-performance@postgresql.org
  Subject: Re: [PERFORM] Performance issues
 
  On 17.3.2015 16:24, Thomas Kellerer wrote:
  Tomas Vondra schrieb am 17.03.2015 um 15:43:
  On 17.3.2015 15:19, Thomas Kellerer wrote:
  Tomas Vondra schrieb am 17.03.2015 um 14:55:
   (2) using window functions, e.g. like this:
 
   SELECT * FROM (
 SELECT *,
  ROW_NUMBER() OVER (PARTITION BY touchpoint_execution_id
  ORDER BY max_creation_dt) AS rn
 FROM s_f_touchpoint_execution_status_history
   ) foo WHERE rn = 1
 
   But estimating this is also rather difficult ...
 
 
  From my experience rewriting something like the above using
  DISTINCT ON is usually faster.
 
  How do you get the last record (with respect to a timestamp column)
  using a DISTINCT ON?
 
  You need to use order by ... desc. See here:
  http://sqlfiddle.com/#!15/d4846/2
 
  Nice, thanks!
 
 
  Btw: your row_number() usage wouldn't return the latest row either.
  It would return the oldest row.
 
  Oh, right. I forgot the DESC in the window.
 
 
  --
  Tomas Vondrahttp://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services
 
 
  --
  Sent via pgsql-performance mailing list
  (pgsql-performance@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-performance

 --
 Jerry Sievers
 Postgres DBA/Development Consulting
 e: postgres.consult...@comcast.net
 p: 312.241.7800

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Cursor + upsert (astronomical data)

2014-07-27 Thread Vitalii Tymchyshyn
I am not sure I understand the problem fully, e.g. what to do if there are
observations A, B and C with A-to-B and B-to-C less than the threshold and
A-to-C over the threshold, but anyway.

Could you first apply a kind of grid to your observations? What I mean is
to round your coords to, say, 1/2 arcsec on each axis and group the results.
I think you will have most observations grouped this way and can then use your
regular algorithm to combine the results.
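
A hedged sketch of that gridding pass (table and column names are assumptions;
1 degree = 3600 arcsec, so half-arcsec cells correspond to a factor of 7200,
and the cos(decl) scaling of RA is ignored for simplicity):

select round(ra * 7200) / 7200.0    as ra_cell,
       round(decl * 7200) / 7200.0  as decl_cell,
       avg(ra)   as ra_avg,          -- candidate reference coordinates per cell
       avg(decl) as decl_avg,
       count(*)  as n_obs
from observations
group by 1, 2;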

Best regards, Vitalii Tymchyshyn


Re: [PERFORM] Cursor + upsert (astronomical data)

2014-07-27 Thread Vitalii Tymchyshyn
Well, that's why I said to apply the regular algorithm to deduplicate after
this step. Basically, what I expect is to have a first pass with GROUP BY
that does not require any joins and produces a dirty set of identifiers.

It should do the following:
1) Provide a working set of dirty identifiers with far lower cardinality
than the original observation set.
2) Most of the identifiers can be used as is; only for a small fraction do you
need to perform an additional merge. 22% is actually a very good number; it
means only about 1/5 of the identifiers need to be analyzed for merging.

Best regards, Vitalii Tymchyshyn
27 Jul 2014 10:35, Jiří Nádvorník nadvornik...@gmail.com
wrote:

 Hi Vitalii, thank you for your reply.



 The problem you suggested can in the most pathological way be, that these
 observations are on one line. As you suggested it, the B would be in the
 middle. So A and C are not in 1 arcsec range of each other, but they must
 be within 1 arcsec of their common average coordinates. If the distances
 between A,B,C are 1 arcsec for each, the right solution is to pick B as
 reference identifier and assign A and C to it.



 We already tried the approach you suggest with applying a grid based on
 the Q3C indexing of the database. We were not just rounding the results,
 but using the center of the Q3C “square” in which the observation took
 place. The result was poor however – 22% of the identifiers were closer to
 each other than 1 arcsec. That means that when you crossmatch the original
 observations to them, you don’t know which one to use and you have
 duplicates. The reason for this is that nearly all of the observations are
 from SMC (high density of observations), which causes that you have more
 than 2 “rounded” positions in a row and don’t know which ones to join
 together (compute average coordinates from it). If it is not clear enough I
 can draw it on an image for you.

 Maybe the simple round up would have better results because the squares
 are not each the same size and you can scale them only by 2 (2-times
 smaller, or larger square). We used a squre with the side cca 0.76 arcsec
 which approximately covers the 1 arcsec radius circle.



 Oh and one more important thing. The difficulty of our data is not that it
 is 3e8 rows. But in the highest density, there are cca 1000 images
 overlapping. Which kills you when you try to self-join the observations to
 find neighbours for each of them – the quadratic complexity is based on the
 overlapping on the image (e.g. 1 observations on one image with another
 999 images overlapping it means 1 *1000^2).



 Best regards,



 Jiri Nadvornik



 *From:* tiv...@gmail.com [mailto:tiv...@gmail.com] *On Behalf Of *Vitalii
 Tymchyshyn
 *Sent:* Sunday, July 27, 2014 8:06 AM
 *To:* Jiří Nádvorník
 *Cc:* pgsql-performance@postgresql.org
 *Subject:* Re: [PERFORM] Cursor + upsert (astronomical data)



 I am not sure I understand the problem fully, e.g. what to do if there are
 observations A,B and C with A to B and B to C less then treshold and A to C
 over treshold, but anyway.

 Could you first apply a kind of grid to your observations? What I mean is
 to round your coords to, say, 1/2 arcsec on each axe and group the results.
 I think you will have most observations grouped this way and then use your
 regular algorithm to combine the results.

 Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Optimize query for listing un-read messages

2014-05-02 Thread Vitalii Tymchyshyn
What statistics do you have on the data? I suppose most messages are read
by a low number of users, mostly zero or one.
I can see two options to consider:
1) Use arrays to store information on which users have already read the
message. You may need a GIN/GiST index to search fast.
2) Introduce some kind of special column(s) for the cases when the message
is unread by everybody or was read by at most one user. E.g. a read_by
column with a NULL value for unread, a special value for read by many, and the
real user id if read by only one.
In this case your condition would be roughly (read_by is null or read_by not in
(current_user, special_value) or (read_by = special_value and not
exists(...))). Note that the optimizer may have problems with such a complex
expression and you may need to use UNION ALL instead of OR. Partial
index(es) for the null/special value may help.
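
A hedged sketch of both ideas against the thread's schema (the added columns,
the -1 "read by many" marker and the person_id = 1 example are assumptions):

-- option 1: array of reader ids on the message, searchable via GIN
alter table message add column read_by_ids integer[] not null default '{}';
create index message_read_by_ids_gin on message using gin (read_by_ids);
-- unread by person 1 (the negated containment check itself is not index-assisted)
select id from message where not (read_by_ids @> array[1]);

-- option 2: denormalized column: NULL = unread by anybody,
--           -1 = read by more than one person, otherwise the single reader id
alter table message add column read_by integer;
create index message_unread_idx on message (id) where read_by is null;  -- partial index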

Best regards, Vitalii Tymchyshyn


2014-05-02 10:20 GMT+03:00 Andreas Joseph Krogh andr...@visena.com:

 On Friday, 2 May 2014 at 02:17:58, Craig James
 cja...@emolecules.com wrote:

 On Thu, May 1, 2014 at 4:26 AM, Andreas Joseph Krogh 
 andr...@visena.comwrote:

 I have a schema where I have lots of messages and some users who might
 have read some of them. When a message is read by a user I create an entry
 i a table message_property holding the property (is_read) for that user.

 The schema is as follows:

 [...]


 create table person(
 id serial primary key,
 username varchar not null unique
 );

 create table message(
 id serial primary key,
 subject varchar
 );

 create table message_property(
 message_id integer not null references message(id),
 person_id integer not null references person(id),
 is_read boolean not null default false,
 unique(message_id, person_id)
 );


 [...]

  So, for person 1 there are 10 unread messages, out of a total 1mill. 5
 of those unread does not have an entry in message_property and 5 have an
 entry and is_read set to FALSE.


 Here's a possible enhancement: add two columns, an indexed timestamp to
 the message table, and a timestamp of the oldest message this user has NOT
 read on the person table. If most users read messages in a timely fashion,
 this would (in most cases) narrow down the portion of the messages table to
 a tiny fraction of the total -- just those messages newer than the oldest
 message this user has not read.

 When you sign up a new user, you can set his timestamp to the time the
 account was created, since presumably messages before that time don't apply.

 Whether this will help depends a lot on actual use patterns, i.e. do users
 typically read all messages or do they leave a bunch of unread messages
 sitting around forever?


 Thanks fort the suggestion. A user must be able to read arbitrary old
 messages, and messages don't expire.

  --
 *Andreas Jospeh Krogh*
 CTO / Partner - Visena AS
 Mobile: +47 909 56 963
 andr...@visena.com
 www.visena.com
  https://www.visena.com




Re: [PERFORM] Extremely slow server?

2013-09-15 Thread Vitalii Tymchyshyn
Could it be a hardware problem with I/O? Try finding out which file the stuck
table is stored in and do a simple filesystem copy. Or simply copy the whole
PG data directory.
15 Sep 2013 04:54, Craig James cja...@emolecules.com wrote:

 On Sat, Sep 14, 2013 at 11:36 AM, bricklen brick...@gmail.com wrote:

 On Sat, Sep 14, 2013 at 11:28 AM, Craig James cja...@emolecules.comwrote:

 I'm trying to do a pg_dump of a database, and it more-or-less just sits
 there doing nothing.


 What is running in the db? Perhaps there is something blocking the
 pg_dump? What does the output of the following query look like?

 select * from pg_stat_activity where pid <> pg_backend_pid()


 =# select * from pg_stat_activity where pid <> pg_backend_pid();
  datid  |  datname   |  pid  | usesysid | usename  | application_name |
 client_addr | client_hostname | client_port | backend_start
  |  xact_start   | query_start  |
 state_change  | waiting | state  |
   query

 ++---+--+--+--+-+-+-+--

 -+---+--+---+-++---
 -
  231308 | emolecules | 13312 |   10 | postgres | pg_dump
 | | |  -1 | 2013-09-14
 18:37:08.752938-07
  | 2013-09-14 18:37:08.783782-07 | 2013-09-14 18:39:43.74618-07 |
 2013-09-14 18:39:43.746181-07 | f   | active | COPY
 orders.chmoogle_thesaurus
  (thesaurus_id, version_id, normalized, identifier, typecode) TO stdout;
 (1 row)

 And a bit later:

  231308 | emolecules | 13312 |   10 | postgres | pg_dump
 | | |  -1 | 2013-09-14
 18:37:08.752938-07
  | 2013-09-14 18:37:08.783782-07 | 2013-09-14 18:47:38.287109-07 |
 2013-09-14 18:47:38.287111-07 | f   | active | COPY
 orders.customer_order_items (customer_order_item_id, customer_order_id,
 orig_smiles, orig_sdf, orig_datatype, orig_catalog_num, orig_rownum,
 cansmiles, version_id, version_smiles, parent_id, match_type, catalogue_id,
 supplier_id, sample_id, catalog_num, price_code, reason, discount,
 format_ordered, amount_ordered, units_ordered, format_quoted,
 amount_quoted, units_quoted, price_quoted, wholesale, nitems_shipped,
 nitems_analytical, comment, salt_name, salt_ratio, original_order_id,
 price_quoted_usd, wholesale_usd, invoice_price, hazardous) TO stdout;
 (1 row)

 The Apache web server is shut off, and I upgraded to Postgres 9.2.4 since
 my first email.

 top(1) reports nothing interesting that I can see:

 top - 18:50:18 up 340 days,  3:28,  4 users,  load average: 1.46, 1.40,
 1.17
 Tasks: 282 total,   1 running, 281 sleeping,   0 stopped,   0 zombie
 Cpu(s):  0.2%us,  0.1%sy,  0.0%ni, 86.9%id, 12.7%wa,  0.0%hi,  0.0%si,
 0.0%st
 Mem:  12322340k total, 11465028k used,   857312k free,53028k buffers
 Swap: 19796984k total,50224k used, 19746760k free, 10724856k cached

   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 13311 emi   20   0 28476 8324 1344 S1  0.1   3:48.90 pg_dump
 13359 emi   20   0 19368 1576 1076 R0  0.0   0:00.09 top
 1 root  20   0 23840 1372  688 S0  0.0   0:03.13 init
 2 root  20   0 000 S0  0.0   0:00.04 kthreadd
 3 root  RT   0 000 S0  0.0   0:00.27 migration/0
 4 root  20   0 000 S0  0.0   0:05.85 ksoftirqd/0
 ... etc.


 Interestingly, it starts off going fairly fast according to vmstat 2.
 This was started almost immediately after pg_dump started.  Notice that it
 goes well for a couple minutes, slows for a bit, speeds up, then drops to
 almost nothing.  It stays that way forever, just not doing anything.  See
 below.

 Thanks,
 Craig

 procs  ---memory--  ---swap-- -io -system--
 cpu
  r  b   swpdfree   buff   cache   si   sobibo   in   cs us sy
 id wa
  1  0  50492 4129584  53672 745589200463700  0  0
 99  0
  1  0  50488 4091536  53672 749407200 15744 0  389  717 12  0
 87  0
  3  0  50480 4053868  53672 753183600 1568036  408  759 17  0
 83  0
  1  0  50480 4016252  53680 756963600 15580 6  379  700 12  0
 87  0
  2  0  50480 3979140  53680 760680000 15360 0  372  682 17  0
 82  0
  1  1  50480 3943536  53696 764274000 14720 68846 2076 1343 11  1
 81  7
  1  0  50448 3906040  53700 768004400 15488 2  456  760 18  0
 80  1
  1  0  50424 3866036  53704 772043600 16818 0  389  713 13  1
 86  0
  1  0  50424 3813564  53712 777265600 2195216  424  772 18  1
 81  0
  1  0  50424 3762460  53712 782384000 21376 0  414  761 13  0
 87  0
  1  0  50424 3710416  53720 787596800 

Re: [PERFORM] Varchar vs foreign key vs enumerator - table and index size

2013-09-03 Thread Vitalii Tymchyshyn
Well, in older versions of Hibernate it was a little tricky to handle
PostgreSQL enums. Dunno if it works out of the box now.
Also, adding a new value is an explicit operation (much like with a lookup
table). I've had quite complex code that opened a second connection to
support lookup-table filling without flooding the original transaction with
additional locks that could lead to deadlocks.
BTW: does adding a new value to an enum take any locks? Can checking whether a
value exists and adding a new value be done in an atomic fashion without
grabbing some global lock?
P.S. As I see it, this could be a topic for a good article on, say, DZone. The
problem can be quite tricky in an MVCC database and the choice must be made wisely.

Best regards, Vitalii Tymchyshyn


2013/9/2 Andrew Dunstan and...@dunslane.net


 On 09/02/2013 05:53 AM, Łukasz Walkowski wrote:

 On 1 wrz 2013, at 05:10, Vitalii Tymchyshyn tiv...@gmail.com wrote:


 Well, there are some more options:
 a) Store int keys and do mapping in the application (e.g. with java
 enums). This can save you a join, that is especially useful if you are
 going to do paged output with limit/offset scenario. Optimizer sometimes
 produce suboptimal plans for join in offset/limit queries.
 b) Store small varchar values as keys (up to char type if you really
 want to save space) and do user display mapping in application. It's
 different from (a) since it's harder to mess with the mapping and values
 are still more or less readable with simple select. But it can be less
 efficient than (a).
 c) Do mixed approach with mapping table, loaded on start into
 application memory. This would be an optimization in case you get into
 optimizer troubles.

 Best regards, Vitalii Tymchyshyn

 I'd like to leave database in readable form because before I add some new
 queries and rest endpoints to the application, I test them as ad-hoc
 queries using command line. So variant a) isn't good for me. Variant b) is
 worth trying and c) is easy to code, but I still prefer having all this
 data in database independent of application logic.



 I think the possible use of Postgres enums has been too easily written off
 in this thread. Looking at the original problem description they look like
 quite a good fit, despite the OP's skepticism. What exactly is wanted that
 can't be done with database enums? You can add new values to the type very
 simply.  You can change the values of existing labels in the type slightly
 less simply, but still without any great difficulty. Things that are hard
 to do include removing labels in the set and changing the sort order,
 because those things would require processing tables where the type is
 used, unlike the simple things. But neither of these is required for
 typical use cases. For most uses of this kind they are very efficient both
 in storage and processing.

 cheers

 andrew




-- 
Best regards,
 Vitalii Tymchyshyn


Re: [PERFORM] Varchar vs foreign key vs enumerator - table and index size

2013-08-31 Thread Vitalii Tymchyshyn
2013/8/31 Łukasz Walkowski lukasz.walkow...@homplex.pl


 3. And this part is most interesting for me. Columns browser, eventsource,
 eventtype, devicetype, operatingsystem contain a small pool of strings -
 for example for devicetype this is set to Computer, Mobile, Tablet or
 Unknown. Browser is set to normalized browser name. In every case I can
 store those data using one of 3 different methods:


Well, there are some more options:
a) Store int keys and do the mapping in the application (e.g. with Java enums).
This can save you a join, which is especially useful if you are going to do
paged output in a limit/offset scenario. The optimizer sometimes produces
suboptimal plans for joins in offset/limit queries.
b) Store small varchar values as keys (down to the "char" type if you really
want to save space) and do the user-display mapping in the application. It's
different from (a) since it's harder to mess with the mapping and the values
are still more or less readable with a simple SELECT. But it can be less
efficient than (a).
c) Take a mixed approach with a mapping table loaded on start into application
memory. This would be an optimization in case you get into optimizer
trouble.

Best regards, Vitalii Tymchyshyn


Re: [PERFORM] Trying to eliminate union and sort

2013-07-17 Thread Vitalii Tymchyshyn
I'd try to check why the counts are different. A join with 'OR' should work.
Build (one query) EXCEPT ALL (another query) and check some rows from the
result.
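
A small self-contained illustration of the EXCEPT ALL check (the two inline
subqueries stand in for the original UNION query and the rewritten
single-join query):

select * from (
    select 1 as id union all select 2 union all select 2
) as q_original
except all
select * from (
    select 1 as id union all select 2
) as q_rewritten;
-- returns the extra "2", i.e. a row the rewrite lost;
-- swap the operands to see rows the rewrite gained
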
13 Jul 2013 01:28, Brian Fehrle bri...@consistentstate.com wrote:

 On 07/11/2013 06:46 PM, Josh Berkus wrote:

 Brian,

  3. I'm trying to eliminate the union, however I have two problems.
 A) I can't figure out how to have an 'or' clause in a single join that
 would fetch all the correct rows. If I just do:
 LEFT OUTER JOIN table2 t2 ON (t2.real_id = t.id OR t2.real_id =
 t.backup_id), I end up with many less rows than the original query. B.

 I believe the issue with this is a row could have one of three
 possibilities:
 * part of the first query but not the second - results in 1 row after
 the union
 * part of the second query but not the first - results in 1 row after
 the union
 * part of the first query and the second - results in 2 rows after the
 union (see 'B)' for why)

 B) the third and fourth column in the SELECT will need to be different
 depending on what column the row is joined on in the LEFT OUTER JOIN to
 table2, so I may need some expensive case when logic to filter what is
 put there based on whether that row came from the first join clause, or
 the second.

 No, it doesn't:

 SELECT t.id,
 t.mycolumn1,
 table3.otherid as otherid1,
 table3a.otherid as otherid2,
 t.mycolumn2
 FROM t
 LEFT OUTER JOIN table2
ON ( t.id = t2.real_id OR t.backup_id = t2.real_id )
 LEFT OUTER JOIN table3
ON ( t.typeid = table3.id )
  LEFT OUTER JOIN table3 as table3a
 ON ( table2.third_id = table3.id )
 WHERE t.external_id IN ( ... )
 ORDER BY t.mycolumn2, t.id

 I tried this originally, however my resulting rowcount is different.

 The original query returns 9,955,729 rows
 This above one returns 7,213,906

 As for the counts on the tables:
 table1  3,653,472
 table2  2,191,314
 table325,676,589

 I think it's safe to assume right now that any resulting joins are not
 one-to-one

 - Brian F




 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Deleting Rows From Large Tables

2013-05-18 Thread Vitalii Tymchyshyn
Analyze your temp tables after filling and before using!
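
In the script quoted below, that means two extra statements, each placed right
after the INSERT ... SELECT that fills the corresponding temp table and before
the DELETE ... USING statements run:

analyze table_a_ids_to_delete;
analyze table_b_ids_to_delete;
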
17 May 2013 17:27, Sékine Coulibaly scoulib...@gmail.com wrote:

 Oh, sorry, overlooked that part.
 Maybe refreshing stats with VACUUM FULL ?


 2013/5/17 Robert Emery robertem...@codeweavers.net

 Hi Sékine,

 Unfortunately I'm not trying to empty the table completely, just
 delete about 10-15% of the data in it.

 Thanks,

 On 17 May 2013 14:11, Sékine Coulibaly scoulib...@gmail.com wrote:
  Rob,
 
  Did you tried TRUNCATE ?
  http://www.postgresql.org/docs/8.4/static/sql-truncate.html
 
  This is supposed to be quicker since it does not scan the table.
 
  Regards
 
 
  2013/5/17 Rob Emery re-pg...@codeweavers.net
 
  Hi All,
 
  We've got 3 quite large tables that due to an unexpected surge in
  usage (!) have grown to about 10GB each, with 72, 32 and 31 million
  rows in. I've been tasked with cleaning out about half of them, the
  problem I've got is that even deleting the first 1,000,000 rows seems
  to take an unreasonable amount of time. Unfortunately this is on quite
  an old server (Dell 2950 with a RAID-10 over 6 disks) running Postgres
  8.4; which serves other things like our logging systems.
 
  If I run a sustained (more than about 5 minutes) delete it'll have a
  detrimental effect on the other services. I'm trying to batch up the
  deletes into small chunks of approximately 1 month of data ; even this
  seems to take too long, I originally reduced this down to a single
  day's data and had the same problem. I can keep decreasing the size of
  the window I'm deleting but I feel I must be doing something either
  fundamentally wrong or over-complicating this enormously. I've
  switched over to retrieving a list of IDs to delete, storing them in
  temporary tables and deleting based on the primary keys on each of the
  tables with something similar to this:
 
  BEGIN TRANSACTION;
 
  CREATE TEMPORARY TABLE table_a_ids_to_delete (id INT);
  CREATE TEMPORARY TABLE table_b_ids_to_delete (id INT);
 
  INSERT INTO table_a_ids_to_delete
  SELECT id FROM table_a WHERE purchased = '-infinity' AND created_at
  < '2007-01-01T00:00:00';
 
  INSERT INTO table_b_ids_to_delete
  SELECT table_b_id FROM table_a_table_b_xref
  INNER JOIN table_a_ids_to_delete ON (table_a_ids_to_delete.id =
  table_a_table_b.quote_id);
 
  DELETE FROM table_a_table_b_xref USING table_a_ids_to_delete
  WHERE table_a_table_b_xref.table_a_id = table_a_ids_to_delete.id;
 
  DELETE FROM table_b USING table_b_ids_to_delete
  WHERE table_b.id = table_b_ids_to_delete.id;
 
  DELETE FROM table_a USING table_a_ids_to_delete
  WHERE table_a.id =  table_a_ids_to_delete.id;
 
  COMMIT;
 
  There're indices on table_a on the queried columns, table_b's primary
  key is it's id, and table_a_table_b_xref has an index on (table_a_id,
  table_b_id). There're FK defined on the xref table, hence why I'm
  deleting from it first.
 
  Does anyone have any ideas as to what I can do to make the deletes any
  faster? I'm running out of ideas!
 
  Thanks in advance,
 
  --
  Rob Emery
 
 
  --
  Sent via pgsql-performance mailing list (
 pgsql-performance@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-performance
 
 



 --
 Robert Emery
 Database Administrator

 | T: 0800 021 0888 | www.codeweavers.net |
 | Codeweavers Limited | Barn 4 | Dunston Business Village | Dunston |
 ST18 9AB |
 | Registered in England and Wales No. 04092394 | VAT registration no.
 974 9705 63 |






Re: [PERFORM] In progress INSERT wrecks plans on table

2013-05-10 Thread Vitalii Tymchyshyn
Well, could you write a trigger that would do what you need? AFAIR, analyze
data is stored regardless of transaction boundaries. You could store some
counters in session vars and issue an explicit ANALYZE when enough rows
have been added.
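
A rough, untested sketch of that idea (the table name, the session variable
myapp.rows_since_analyze and the 100000-row threshold are all invented, the
session must SET myapp.rows_since_analyze = '0' once before loading, and a
per-row trigger like this adds its own overhead to the bulk INSERT):

create or replace function count_and_analyze() returns trigger as $$
declare
    n bigint;
begin
    n := current_setting('myapp.rows_since_analyze')::bigint + 1;
    if n >= 100000 then
        execute 'analyze my_big_table';   -- refresh stats mid-load
        n := 0;
    end if;
    perform set_config('myapp.rows_since_analyze', n::text, false);
    return new;
end;
$$ language plpgsql;

create trigger my_big_table_count_and_analyze
after insert on my_big_table
for each row execute procedure count_and_analyze();
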
7 May 2013 08:33, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote:

 On 07/05/13 18:10, Simon Riggs wrote:

 On 7 May 2013 01:23, mark.kirkw...@catalyst.net.nz wrote:

  I'm thinking that a variant of (2) might be simpler to inplement:

 (I think Matt C essentially beat me to this suggestion - he originally
 discovered this issue). It is probably good enough for only *new* plans
 to
 react to the increased/increasing number of in progress rows. So this
 would require backends doing significant numbers of row changes to either
 directly update pg_statistic or report their in progress numbers to the
 stats collector. The key change here is the partial execution numbers
 would need to be sent. Clearly one would need to avoid doing this too
 often (!) - possibly only when the number of changed rows >
 autovacuum_analyze_scale_factor proportion of the relation concerned
 or
 similar.


 Are you loading using COPY? Why not break down the load into chunks?


 INSERT - but we could maybe workaround by chunking the INSERT. However
 that *really* breaks the idea that in SQL you just say what you want, not
 how the database engine should do it! And more practically means that the
 most obvious and clear way to add your new data has nasty side effects, and
 you have to tip toe around muttering secret incantations to make things
 work well :-)

 I'm still thinking that making postgres smarter about having current stats
 for getting the actual optimal plan is the best solution.

 Cheers

 Mark



 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Bad Execution Plan with OR Clauses Across Outer-Joined Tables

2013-04-30 Thread Vitalii Tymchyshyn
What I can say is that Hibernate has exists in both HQL and the Criteria API
(e.g. see
http://www.cereslogic.com/pages/2008/09/22/hibernate-criteria-subqueries-exists/
for Criteria). So maybe it's easier for you to tune your Hibernate query to
use exists.


2013/4/30 Mark Hampton m...@cleverdba.com

 I have a Hibernate-generated query (That's not going to change, so let's
 just focus on the Postgres side for now) like this:

 SELECT *
 from PERSON p
 where p.PERSON_ID in (
select distinct p2.PERSON_ID
from PERSON p2
 left outer join PERSON_ALIAS pa on
   p2.PERSON_ID = pa.PERSON_ID
where (lower(p1.SURNAME) = 'duck' or
  lower(pa.SURNAME) = 'duck') and
  (lower(p1.FORENAME) = 'donald' or
  lower(pa.FORENAME) = 'donald')
   )
 order by p.PERSON_ID asc;

 There are function-based indexes on PERSON and PERSON_ALIAS as follows:

 CREATE INDEX PERSON_FORENAME_LOWER_FBIDX ON PERSON (LOWER(FORENAME)
 VARCHAR_PATTERN_OPS);
 CREATE INDEX PERSON_SURNAME_LOWER_FBIDX ON PERSON (LOWER(SURNAME) VARCHAR
 _PATTERN_OPS);
 CREATE INDEX PERSON_ALIAS_FORENAME_LOWER_FBIDX ON PERSON_ALIAS
 (LOWER(FORENAME) VARCHAR_PATTERN_OPS);
 CREATE INDEX PERSON_ALIAS_SURNAME_LOWER_FBIDX ON PERSON_ALIAS
 (LOWER(SURNAME) VARCHAR_PATTERN_OPS);

 The problem is that the above query doesn't use the indexes.  The or
 clauses across the outer-join seem to be the culprit.  If I rewrite the
 query as follows, Postgres will use the index:

 SELECT *
 from PERSON p
 where (p.PERSON_ID in (
 select p2.PERSON_ID
 from TRAVELER.PERSON p2
  join TRAVELER.OTHER_NAME pa on p2.PERSON_ID =
pa.PERSON_ID
 where lower(p2.SURNAME) = 'duck' and
   lower(pa.FORENAME) = 'donald'
   ) or
   p.PERSON_ID in (
select p2.PERSON_ID
from TRAVELER.PERSON p2
 join TRAVELER.OTHER_NAME pa on p2.PERSON_ID =
   pa.PERSON_ID
where lower(pa.SURNAME) = 'duck' and
  lower(p2.FORENAME) = 'donald'
   ) or
   p.PERSON_ID in (
select p2.PERSON_ID
from TRAVELER.PERSON p2
where lower(p2.SURNAME) = 'duck' and
  lower(p2.FORENAME) = 'donald'
   ) or
   p.PERSON_ID in (
select p2.PERSON_ID
from TRAVELER.OTHER_NAME pa
where lower(pa.SURNAME) = 'duck' and
  lower(pa.FORENAME) = 'donald'
   ))
 order by p.PERSON_ID asc;

 So my question is this: Is there a way to get the Postgres optimizer
 rewrite the query execution plan to use the equivalent, but much more
 efficient latter form?

 And before you ask; yes, there are better ways of writing this query.  But
 we're dealing with Java developers and Hibernate here.  It's a legacy
 system, and the policy is to avoid hand-written SQL, so for the moment
 let's not go down that rabbit hole, and focus on the issue of what the
 optimizer can and cannot do.




-- 
Best regards,
 Vitalii Tymchyshyn


Re: [PERFORM] Avoiding Recheck Cond when using Select Distinct

2013-02-22 Thread Vitalii Tymchyshyn
2013/2/22 jackrg j...@groundbreakingsoftware.com

 Tuning Postgre is not an option, as the instance
 is provided by Heroku and as far as I know cannot be tuned by me.

Most tuning parameters can be set on a per-query basis, and you can issue
ALTER DATABASE ... SET param = value
to get the same effect as if it were set through postgresql.conf.
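
Hedged examples of both scopes (database name, parameter and query are
placeholders):

alter database mydb set enable_bitmapscan = off;   -- picked up by new connections to mydb

begin;
set local enable_bitmapscan = off;                 -- this transaction only
select distinct user_id from events where event_type = 'login';
commit;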


Re: [PERFORM] slow query plans caused by under-estimation of CTE cardinality

2013-02-18 Thread Vitalii Tymchyshyn
Since a CTE is already an optimization fence, you can go further and make it a
temporary table.
CREATE TABLE; ANALYZE; SELECT should make the optimizer's work much easier.
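
A minimal, generic sketch of the pattern (toy data from generate_series stands
in for the recursive CTE bodies; the names loosely follow the plan quoted
below):

create temp table winnum as
select g as subnet_id, g % 7 as option_code
from generate_series(1, 62904) as g;
analyze winnum;          -- real row count and column stats for the planner

create temp table binoptasc as
select g as subnet_id, g % 7 as option_code
from generate_series(1, 111308) as g;
analyze binoptasc;

-- the planner can now pick a hash or merge join instead of a nested loop
select count(*)
from winnum
join binoptasc using (subnet_id, option_code);
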
18 Feb 2013 18:45, John Lumby johnlu...@hotmail.com wrote:


 On 2012-10-09 23:09:21
 Tom Lane wrote:
  


  re subject Why am I getting great/terrible estimates with these CTE
 queries?
   You're assuming the case where the estimate is better is better for a
  reason ... but it's only better as a result of blind dumb luck.  The
  outer-level query planner doesn't know anything about the CTE's output
  except the estimated number of rows --- in particular, it doesn't drill
  down to find any statistics about the join column.
 

 I am also struggling with a problem involving CTEs although in my case
 it is caused by huge *under*-estimation of cardinality rather then
 *over*-estimation.
 The statement is quite complex and the problem arises because there is a
 chain of
 RECURSIVE CTEs each defined as a query involving an earlier CTE and more
 tables.
 Eventually there is no hope for making a good cardinality estimate.

 One CTE in particular has a cardinality estimate of 1  (I guess the actual
 estimate is nearer zero and rounded up) but actual count is over 10.
 The planner puts this CTE as inner of a nested loop accessed by simple
 linear CTE scan
 and the full query then takes over 20 minutes.

-  Nested Loop  (cost=0.00..0.06 rows=1 width=588) (actual
 time=2340.421..1201593.856 rows=105984 loops=1)
   Join Filter: ((winnum.subnet_id = binoptasc.subnet_id) AND
 (winnum.option_code = binoptasc.option_code) AND
 ((winnum.option_discriminator)::text =
 (binoptasc.option_discriminator)::text) AND (winnum.net_rel_level =
 binoptasc.net_rel_level))
   Rows Removed by Join Filter: 7001612448
   Buffers: shared hit=2290941
   -  CTE Scan on winning_option_nums winnum  (cost=0.00..0.02
 rows=1 width=536) (actual time=2338.422..2543.684 rows=62904 loops=1)
 Buffers: shared hit=2290941
   -  CTE Scan on subnet_inhrt_options_asc binoptasc
 (cost=0.00..0.02 rows=1 width=584) (actual time=0.000..9.728 rows=111308
 loops=62904)

 Whereas,  (by altering various statistics to be very wrong) the entire
 query runs in 21 seconds.

 There have been several debates about how to address situations like this
 where
 no practical non-query-specific statistics-gathering scheme can ever hope
 to
 gather enough statistics to model the later derived tables. E.g. the
 frowned-on
 SELECTIVITY clause and ideas for query-specific statistics.

 Meanwhile,   I have one other suggestion aimed specifically at problematic
 CTEs:
 Would it be reasonable to provide a new Planner Configuration option  :

   enable_nestloop_cte_inner (boolean)
   Enables or disables the query planner's use of nested-loop join plans in
 which a CTE is the inner.
   It is impossible to suppress such nested-loop joins entirely,
   but turning this variable off discourages the planner from using one
   if there are other methods available,  such as sorting the CTE for
 merge-join
   or hashing it for hash-join.
   The default is on.

 John






 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Performance on Bulk Insert to Partitioned Table

2012-12-28 Thread Vitalii Tymchyshyn
There is a switch-like SQL CASE:
39.6.2.4. Simple CASE

CASE search-expression
WHEN expression [, expression [ ... ]] THEN
  statements
  [ WHEN expression [, expression [ ... ]] THEN
  statements
... ]
  [ ELSE
  statements ]
END CASE;

It should work like a C switch statement.
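
For example, a rough sketch of a partition-routing trigger written with the simple CASE form (the quotes_* partition names are taken from the insert statements later in this message; the function, trigger and parent table names are made up):

create or replace function quotes_insert_router() returns trigger as $$
begin
  case cast(new.received_time as date)
    when date '2012-09-10' then
      insert into quotes_2012_09_10 values (new.*);
    when date '2012-09-11' then
      insert into quotes_2012_09_11 values (new.*);
    else
      raise exception 'no partition for %', new.received_time;
  end case;
  return null;   -- the row has been routed, do not insert into the parent table
end;
$$ language plpgsql;

create trigger quotes_insert_router_trg
  before insert on quotes
  for each row execute procedure quotes_insert_router();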

Also, for bulk insert, have you tried for each statement triggers instead
of for each row?
This would look like a lot of inserts and would not be fast in the
single-row-insert case, but it can give you a benefit for huge inserts.
It should look like
insert into quotes_2012_09_10 select * from new where
cast(new.received_time as date) = '2012-09-10' ;
insert into quotes_2012_09_11 select * from new where
cast(new.received_time as date) = '2012-09-11' ;
...

2012/12/27 Stephen Frost sfr...@snowman.net

 * Jeff Janes (jeff.ja...@gmail.com) wrote:
  If the main goal is to make it faster, I'd rather see all of plpgsql get
  faster, rather than just a special case of partitioning triggers.  For
  example, right now a CASE expression statement with 100 branches is
 about
  the same speed as an equivalent list of 100 elsif.  So it seems to be
 doing
  a linear search, when it could be doing a hash that should be a lot
 faster.

 That's a nice thought, but I'm not sure that it'd really be practical.
 CASE statements in plpgsql are completely general and really behave more
 like an if/elsif tree than a C-style switch() statement or similar.  For
 one thing, the expression need not use the same variables, could be
 complex multi-variable conditionals, etc.

 Figuring out that you could build a dispatch table for a given CASE
 statement and then building it, storing it, and remembering to use it,
 wouldn't be cheap.

 On the other hand, I've actually *wanted* a simpler syntax on occasion.
 I have no idea if there'd be a way to make it work, but this would be
 kind of nice:

 CASE OF x -- or whatever
   WHEN 1 THEN blah blah
   WHEN 2 THEN blah blah
   WHEN 3 THEN blah blah
 END

 which would be possible to build into a dispatch table by looking at the
 type of x and the literals used in the overall CASE statement.  Even so,
 there would likely be some number of WHEN conditions required before
 it'd actually be more efficient to use, though perhaps getting rid of
 the expression evaluation (if that'd be possible) would make up for it.

 Thanks,

 Stephen




-- 
Best regards,
 Vitalii Tymchyshyn


Re: [PERFORM] Performance on Bulk Insert to Partitioned Table

2012-12-28 Thread Vitalii Tymchyshyn
BTW: if select count(*) from new is fast, you can even choose the
strategy in the trigger depending on the insert size.


2012/12/28 Vitalii Tymchyshyn tiv...@gmail.com

 There is switch-like sql case:
 39.6.2.4. Simple CASE

 CASE search-expression
 WHEN expression [, expression [ ... ]] THEN
   statements
   [ WHEN expression [, expression [ ... ]] THEN
   statements
 ... ]
   [ ELSE
   statements ]
 END CASE;

 It should work like C switch statement.

 Also, for bulk insert, have you tried for each statement triggers
 instead of for each row?
 This would look like a lot of inserts and would not be fast in
 single-row-insert case, but can give you benefit for huge inserts.
 It should look like
 insert into quotes_2012_09_10 select * from new where
 cast(new.received_time as date) = '2012-09-10' ;
 insert into quotes_2012_09_11 select * from new where
 cast(new.received_time as date) = '2012-09-11' ;
 ...

 2012/12/27 Stephen Frost sfr...@snowman.net

 * Jeff Janes (jeff.ja...@gmail.com) wrote:
  If the main goal is to make it faster, I'd rather see all of plpgsql get
  faster, rather than just a special case of partitioning triggers.  For
  example, right now a CASE expression statement with 100 branches is
 about
  the same speed as an equivalent list of 100 elsif.  So it seems to be
 doing
  a linear search, when it could be doing a hash that should be a lot
 faster.

 That's a nice thought, but I'm not sure that it'd really be practical.
 CASE statements in plpgsql are completely general and really behave more
 like an if/elsif tree than a C-style switch() statement or similar.  For
 one thing, the expression need not use the same variables, could be
 complex multi-variable conditionals, etc.

 Figuring out that you could build a dispatch table for a given CASE
 statement and then building it, storing it, and remembering to use it,
 wouldn't be cheap.

  On the other hand, I've actually *wanted* a simpler syntax on occasion.
 I have no idea if there'd be a way to make it work, but this would be
 kind of nice:

 CASE OF x -- or whatever
   WHEN 1 THEN blah blah
   WHEN 2 THEN blah blah
   WHEN 3 THEN blah blah
 END

 which would be possible to build into a dispatch table by looking at the
 type of x and the literals used in the overall CASE statement.  Even so,
 there would likely be some number of WHEN conditions required before
 it'd actually be more efficient to use, though perhaps getting rid of
 the expression evaluation (if that'd be possible) would make up for it.

 Thanks,

 Stephen




 --
 Best regards,
  Vitalii Tymchyshyn




-- 
Best regards,
 Vitalii Tymchyshyn


Re: [PERFORM] Performance on Bulk Insert to Partitioned Table

2012-12-28 Thread Vitalii Tymchyshyn
Why so? The basic form case lvalue when rvalue then out ... end is much like
a switch.
The case when condition then out ... end form is a different, more complex
beast, but the first one is essentially a switch. If it is currently transformed into
case when lvalue = rvalue1 then out1 when lvalue = rvalue2 then out2 ...
end, then this can be optimized, and it would benefit many users, not only
the ones that use partitioning.


2012/12/28 Stephen Frost sfr...@snowman.net

 Vitalii,

 * Vitalii Tymchyshyn (tiv...@gmail.com) wrote:
  There is switch-like sql case:
 [...]
  It should work like C switch statement.

 It does and it doesn't.  It behaves generally like a C switch statement,
 but is much more flexible and therefore can't be optimized like a C
 switch statement can be.

 Thanks,

 Stephen




-- 
Best regards,
 Vitalii Tymchyshyn


Re: [PERFORM] Performance on Bulk Insert to Partitioned Table

2012-12-28 Thread Vitalii Tymchyshyn
It's a pity. Why is it not listed in the Compatibility section of the CREATE
TRIGGER documentation? I think this makes for each statement triggers
not compatible with SQL99.


2012/12/28 Pavel Stehule pavel.steh...@gmail.com

 Hello

 
  Also, for bulk insert, have you tried for each statement triggers
 instead
  of for each row?
  This would look like a lot of inserts and would not be fast in
  single-row-insert case, but can give you benefit for huge inserts.
  It should look like
  insert into quotes_2012_09_10 select * from new where
 cast(new.received_time
  as date) = '2012-09-10' ;
  insert into quotes_2012_09_11 select * from new where
 cast(new.received_time
  as date) = '2012-09-11' ;
  ...

 It has only one problem - PostgreSQL has not relations NEW and OLD for
 statements triggers.

 Regards

 Pavel




-- 
Best regards,
 Vitalii Tymchyshyn


Re: [PERFORM] Slow query: bitmap scan troubles

2012-12-04 Thread Vitalii Tymchyshyn
Well, you don't need to put anything down. Most settings that change
planner decisions can be tuned on a per-query basis by issuing SET commands in
the given session. This should not affect other queries more than is needed
to run the query in the way the planner chooses.
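
For example, a minimal sketch (enable_nestloop is just one of the planner settings that can be toggled this way; the table name is made up):

begin;
set local enable_nestloop = off;          -- affects only this transaction in this session
explain analyze select count(*) from some_table;
commit;                                   -- the setting reverts automatically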

Best regards, Vitalii Tymchyshyn


2012/12/4 postgre...@foo.me.uk


  But the row estimates are not precise at the top of the join/filter.
  It thinks there will 2120 rows, but there are only 11.

 Ah... I didn't spot that one...

 Yes, you are right there - this is probably a slightly atypical query of
 this sort actually, 2120 is a pretty good guess.

 On Claudio's suggestion I have found lots more things to read up on and am
 eagerly awaiting 6pm when I can bring the DB down and start tweaking. The
 effective_work_mem setting is going from 6Gb-88Gb which I think will make
 quite a difference.

 I still can't quite wrap around my head why accessing an index is expected
 to use more disk access than doing a bitmap scan of the table itself, but I
 guess it does make a bit of sense if postgres assumes the table is more
 likely to be cached.

 It's all quite, quite fascinating :)

 I'll let you know how it goes.

 - Phil







-- 
Best regards,
 Vitalii Tymchyshyn


Re: [PERFORM] Optimize update query

2012-12-02 Thread Vitalii Tymchyshyn
Well, it seems that my data may be outdated, sorry for that. I've just
checked performance numbers on Tom's Hardware and it seems that the best SSDs
really do 500 MB/s. Some others do 100. So, I'd say one must choose wisely
(as always :-) ).

Best regards,
Vitalii Tymchyshyn
1 Dec 2012 00:43, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote:

 Hmm - not strictly true as stated: 1 SSD will typically do 500MB/s
 sequential read/write. 1 HDD will be lucky to get a 1/3 that.

 We are looking at replacing 4 to 6 disk RAID10 arrays of HDD with a RAID1
 pair of SSD, as they perform about the same for sequential work and vastly
 better at random. Plus they only use 2x 2.5 slots (or, ahem 2x PCIe
 sockets), so allow smaller form factor servers and save on power and
 cooling.

 Cheers

 Mark

 On 30/11/12 23:07, Vitalii Tymchyshyn wrote:

 Oh, yes. I don't imagine DB server without RAID+BBU :)
 When there is no BBU, SSD can be handy.
 But you know, SSD is worse in linear read/write than HDD.

 Best regards, Vitalii Tymchyshyn


 2012/11/30 Mark Kirkwood mark.kirkw...@catalyst.net.nz

 Most modern SSD are much faster for fsync type operations than a
 spinning disk - similar performance to spinning disk + writeback
 raid controller + battery.

 However as you mention, they are great at random IO too, so Niels,
 it might be worth putting your postgres logs *and* data on the SSDs
 and retesting.





Re: [PERFORM] Optimize update query

2012-11-30 Thread Vitalii Tymchyshyn
Actually, what's the point in putting logs on SSD? SSDs are good for random
access and logs are accessed sequentially. I'd put tablespaces on SSD and
leave logs on HDD.
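
A rough sketch of that split (the mount point, table and index names are made up):

-- data files for the hot relations go to the SSD array
create tablespace ssd_space location '/ssd/pgdata';
alter table cars set tablespace ssd_space;
alter index cars_pkey set tablespace ssd_space;
-- pg_xlog stays on the HDDs, so the sequential WAL stream does not compete
-- with random data-file IO on the SSDs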
30 Nov 2012 04:33, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:

 Hmm I'm getting suspicious here. Maybe my new great setup with the SSD's
 is not really working as it should., and maybe new relic is not monitoring
 as It should.

 If I do a sudo iostat -k 1
 I get a lot of output like this:
 Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
 sda   0.00 0.00 0.00  0  0
 sdb   0.00 0.00 0.00  0  0
 sdc 546.00  2296.00  6808.00   2296   6808
 sdd 593.00  1040.00  7416.00   1040   7416
 md1   0.00 0.00 0.00  0  0
 md0   0.00 0.00 0.00  0  0
 md21398.00  3328.00 13064.00   3328  13064
 md3   0.00 0.00 0.00  0  0

 The storage thing is, that the sda and sdb is the SSD drives and the sdc
 and sdd is the HDD drives. The md0, md1 and md2 is the raid arrays on the
 HDD's and the md3 is the raid on the SSD's. Neither of the md3 or the SSD's
 are getting utilized - and I should expect that since they are serving my
 pg_xlog right? - so maybe I did something wrong in the setup. Here is the
 path I followed:

 # 1) First setup the SSD drives in a software RAID1 setup:
 #
 http://askubuntu.com/questions/223194/setup-of-two-additional-ssd-drives-in-raid-1
 #
 # 2) Then move the postgres pg_xlog dir
 #   sudo /etc/init.d/postgresql-9.2 stop
 #   sudo mkdir -p /ssd/pg_xlog
 #   sudo chown -R  postgres.postgres /ssd/pg_xlog
 #   sudo chmod 700 /ssd/pg_xlog
 #   sudo cp -rf /var/lib/postgresql/9.2/main/pg_xlog/* /ssd/pg_xlog
 #   sudo mv /var/lib/postgresql/9.2/main/pg_xlog
 /var/lib/postgresql/9.2/main/pg_xlog_old
 #   sudo ln -s /ssd/pg_xlog /var/lib/postgresql/9.2/main/pg_xlog
 #   sudo /etc/init.d/postgresql-9.2 start

 Can you spot something wrong?



 On 30/11/2012 at 02.43, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:

  On 30/11/2012 at 02.24, Kevin Grittner kgri...@mail.com wrote:
 
  Niels Kristian Schjødt wrote:
 
  Okay, now I'm done the updating as described above. I did the
  postgres.conf changes. I did the kernel changes, i added two
  SSD's in a software RAID1 where the pg_xlog is now located -
  unfortunately the the picture is still the same :-(
 
  You said before that you were seeing high disk wait numbers. Now it
  is zero accourding to your disk utilization graph. That sounds like
  a change to me.
 
  When the database is under heavy load, there is almost no
  improvement to see in the performance compared to before the
  changes.
 
  In client-visible response time and throughput, I assume, not
  resource usage numbers?
 
  A lot of both read and writes takes more than a 1000 times as
  long as they usually do, under lighter overall load.
 
  As an odd coincidence, you showed your max_connections setting to
  be 1000.
 
  http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
 
  -Kevin
 
  Hehe, I'm sorry if it somehow was misleading, I just wrote a lot of
 I/O it was CPU I/O, it also states that in the chart in the link.
  However, as I'm not very familiar with these deep down database and
 server things, I had no idea wether a disk bottle neck could hide in this
 I/O, so i went along with Shauns great help, that unfortunately didn't
 solve my issues.
  Back to the issue: Could it be that it is the fact that I'm using
 ubuntus built in software raid to raid my disks, and that it is not at all
 capable of handling the throughput?
 






Re: [PERFORM] Optimize update query

2012-11-30 Thread Vitalii Tymchyshyn
Oh, yes. I can't imagine a DB server without RAID+BBU :)
When there is no BBU, an SSD can be handy.
But you know, an SSD is worse at linear read/write than an HDD.

Best regards, Vitalii Tymchyshyn


2012/11/30 Mark Kirkwood mark.kirkw...@catalyst.net.nz

 Most modern SSD are much faster for fsync type operations than a spinning
 disk - similar performance to spinning disk + writeback raid controller +
 battery.

 However as you mention, they are great at random IO too, so Niels, it
 might be worth putting your postgres logs *and* data on the SSDs and
 retesting.

 Regards

 Mark




 On 30/11/12 21:37, Vitalii Tymchyshyn wrote:

 Actually, what's the point in putting logs to ssd? SSDs are good for
 random access and logs are accessed sequentially. I'd put table spaces
 on ssd and leave logs on hdd

 30 Nov 2012 04:33, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:


 Hmm I'm getting suspicious here. Maybe my new great setup with the
 SSD's is not really working as it should., and maybe new relic is
 not monitoring as It should.

 If I do a sudo iostat -k 1
 I get a lot of output like this:
  Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
 sda   0.00 0.00 0.00  0  0
 sdb   0.00 0.00 0.00  0  0
 sdc 546.00  2296.00  6808.00   2296   6808
 sdd 593.00  1040.00  7416.00   1040   7416
 md1   0.00 0.00 0.00  0  0
 md0   0.00 0.00 0.00  0  0
 md21398.00  3328.00 13064.00   3328  13064
 md3   0.00 0.00 0.00  0  0

 The storage thing is, that the sda and sdb is the SSD drives and the
 sdc and sdd is the HDD drives. The md0, md1 and md2 is the raid
 arrays on the HDD's and the md3 is the raid on the SSD's. Neither of
 the md3 or the SSD's are getting utilized - and I should expect that
 since they are serving my pg_xlog right? - so maybe I did something
 wrong in the setup. Here is the path I followed:

 # 1) First setup the SSD drives in a software RAID1 setup:
 #
  http://askubuntu.com/questions/223194/setup-of-two-additional-ssd-drives-in-raid-1
 #
 # 2) Then move the postgres pg_xlog dir
 #   sudo /etc/init.d/postgresql-9.2 stop
 #   sudo mkdir -p /ssd/pg_xlog
 #   sudo chown -R  postgres.postgres /ssd/pg_xlog
 #   sudo chmod 700 /ssd/pg_xlog
  #   sudo cp -rf /var/lib/postgresql/9.2/main/pg_xlog/* /ssd/pg_xlog
  #   sudo mv /var/lib/postgresql/9.2/main/pg_xlog
  /var/lib/postgresql/9.2/main/pg_xlog_old
  #   sudo ln -s /ssd/pg_xlog /var/lib/postgresql/9.2/main/pg_xlog
 #   sudo /etc/init.d/postgresql-9.2 start

 Can you spot something wrong?



  On 30/11/2012 at 02.43, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:


    On 30/11/2012 at 02.24, Kevin Grittner kgri...@mail.com wrote:

  
   Niels Kristian Schjødt wrote:
  
   Okay, now I'm done the updating as described above. I did the
   postgres.conf changes. I did the kernel changes, i added two
   SSD's in a software RAID1 where the pg_xlog is now located -
   unfortunately the the picture is still the same :-(
  
   You said before that you were seeing high disk wait numbers. Now
 it
   is zero accourding to your disk utilization graph. That sounds
 like
   a change to me.
  
   When the database is under heavy load, there is almost no
   improvement to see in the performance compared to before the
   changes.
  
   In client-visible response time and throughput, I assume, not
   resource usage numbers?
  
   A lot of both read and writes takes more than a 1000 times as
   long as they usually do, under lighter overall load.
  
   As an odd coincidence, you showed your max_connections setting to
   be 1000.
  
    http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
  
   -Kevin
  
   Hehe, I'm sorry if it somehow was misleading, I just wrote a lot
 of I/O it was CPU I/O, it also states that in the chart in the link.
   However, as I'm not very familiar with these deep down database
 and server things, I had no idea wether a disk bottle neck could
 hide in this I/O, so i went along with Shauns great help, that
 unfortunately didn't solve my issues.
   Back to the issue: Could it be that it is the fact

Re: [PERFORM] Optimize update query

2012-11-30 Thread Vitalii Tymchyshyn
SSDs are not faster for sequential IO, as far as I know. That's why (with a BBU or
synchronous_commit=off) I prefer to have logs on regular HDDs.

Best regards


2012/11/30 Willem Leenen willem_lee...@hotmail.com


  Actually, what's the point in putting logs to ssd? SSDs are good for
 random access and logs are accessed sequentially. I'd put table spaces on
 ssd and leave logs on hdd
  30 Nov 2012 04:33, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:
 Because SSD's are considered faster. Then you have to put the most
 phyisical IO intensive operations on SSD. For the majority of databases,
 these are the logfiles. But you should investigate where the optimum is for
 your situation.





-- 
Best regards,
 Vitalii Tymchyshyn


Re: [PERFORM] Database design - best practice

2012-11-28 Thread Vitalii Tymchyshyn
Let me be the devil's advocate here :)
First of all, even if you've read the basics about normalization, don't take
them to heart :) Think.
Know that each normalization/denormalization step has its pros and cons.
E.g. in the NoSQL world they often don't normalize much.
What's interesting with PostgreSQL is that it is quite well suited for
NoSQL-like scenarios.
First of all, each unfilled (null) data column takes only 1 bit. This, BTW,
leads to the interesting consequence that performance-wise it can be better to
have a null/true boolean than false/true, especially if you've got a lot of
false values.
So, PostgreSQL should be fine with tens, possibly hundreds, of data columns with
most columns empty. A record of 151 null columns would take header +
roundup(151/8) = 19 bytes. Not much. NoSQL stores usually put column names into
the records, and this costs more.
Any null columns at the end of the record take no space at all (so you can
think about reordering your columns to put the least used ones at the end of the record).
Adding a column with null as the default is a cheap operation that does not require
a table scan.
You can have partial indexes to speed things up, like create index on car
(car_id) where (has_automatic_transmission);

On the other side, when you normalize you need to join. Instead of select *
from car where has_automatic_transmission (which will use the index above), you
will have to select * from car where id in (select id from
car_with_automatic_transmission). The plan is much more complex here. It
will be slower.

The main plus of normalization for you is that otherwise you work with the record as a
whole, so if there is a lot of information in there that is rarely used,
you will pay for accessing it every time, both on selects and updates.

So, as a conclusion, I agree with the others that you should test. But
remember, joining two tables with millions of records is never cheap :)

Best regards, Vitalii Tymchyshyn


2012/11/28 Niels Kristian Schjødt nielskrist...@autouncle.com

 Hi,

 I'm on the hunt for some solid knowledge on a theoretical level about the
 performance of postgresql. My question is regarding best practices, and how
 architectural decisions might influence the performance. First a little
 background:

 The setup:
 I have a database which holds informations on used cars. The database has
 mainly 3 tables of interest for this case:
 A cars table, an adverts table and a sellers table. One car has many
 adverts and one seller has many adverts. One advert belongs to one car and
 one seller.
 The database is powering a website for searching used cars. When searching
 for used cars, the cars table is mainly used, and a lot of the columns
 should be directly available for searching e.g. color, milage, price,
 has_automatic_transmission etc.

 So my main concern is actually about the cars table, since this one
 currently has a lot of columns (151 - I expect thats quite a lot?), and a
 lot of data (4 mil. rows, and growing). Now you might start by thinking,
 this could sound like a regular need for some normalization, but wait a
 second and let me explain :-)
 The columns in this table is for the most very short stings, integers,
 decimals or booleans. So take for an example has_automatic_transmission
 (boolean) I can't see why it would make sense to put that into a separate
 table and join in the values. Or the milage or the price as another
 example. The cars table used for search is indexed quite a lot.

 The questions:
 Having the above setup in mind, what impact on performance, in terms of
 read performance and write performance, does it have, whether I do the
 following:
 1) In general would the read and/or the write on the database be
 faster, if I serialized some of the not searched columns in the table into
 a single text columns instead of let's say 20 booleans?
 2) Lets say I'm updating a timestamp in a single one of the 151
 columns in the cars table. The update statement is using the id to find the
 car. Would the write performance of that UPDATE be affected, if the table
 had fewer columns?
 3) When adding a new column to the table i know that it becomes
 slower the more rows is in the table, but what about the width of the
 table does that affect the performance when adding new columns?
 4) In general what performance downsides do you get when adding a
 lot of columns to one table instead of having them in separate tables?
 5) Is it significantly faster to select * from a table with 20
 columns, than selecting the same 20 in a table with 150 columns?

 Hope there is some good answers out there :-)





-- 
Best regards,
 Vitalii Tymchyshyn


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-22 Thread Vitalii Tymchyshyn
I'd also add ANALYZED/NOT ANALYZED. This should force it to behave like
'create table, analyze, select', with statistics used in the second query plan.

P.S. The defaults could be configurable.
20 Nov 2012 02:22, Gavin Flower gavinflo...@archidevsys.co.nz wrote:

 On 15/11/12 15:03, Peter Geoghegan wrote:

 On 15 November 2012 01:46, Andrew Dunstan and...@dunslane.net wrote:

 It cuts both ways. I have used CTEs a LOT precisely because this
 behaviour
 lets me get better plans. Without that I'll be back to using the offset
 0
 hack.

 Is the OFFSET 0 hack really so bad? We've been telling people to do
 that for years, so it's already something that we've effectively
 committed to.

  How about adding the keywords FENCED and NOT FENCED to the SQL
 definition of CTE's - with FENCED being the default?


 Cheers,
 Gavin






Re: [PERFORM] Postgres delete performance problem

2012-06-25 Thread Vitalii Tymchyshyn

Hello.

This may be a wrong parameter type, like using setObject(param, value) 
instead of setObject(param, value, type), especially if the value passed is a 
String object. AFAIR the index may be skipped in this case. You can check by 
changing the statement to delete from xxx where xxx_pk=?::bigint. If that 
works, check how the parameter is set in the Java code.
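
A rough way to see the effect from psql, using the cfs_file table from the quoted message below (the parameter type is the interesting part):

prepare del_bad(real)    as delete from cfs_file where cfsid = $1;   -- cfsid gets cast, the PK index is skipped
prepare del_good(bigint) as delete from cfs_file where cfsid = $1;   -- matches the bigint column directly
explain execute del_bad(42);     -- expect a Seq Scan
explain execute del_good(42);    -- expect an Index Scan using cfs_file_pkey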


25.06.12 18:42, Frits Jalvingh wrote:

Hi,

I have a Java application that tries to synchronize tables in two 
databases (remote source to local target). It does so by removing all 
constraints, then it compares table contents row by row, inserts 
missing rows and deletes extra rows in the target database. Delete 
performance is incredibly bad: it handles 100 record deletes in about 
16 to 20 seconds(!). Insert and update performance is fine.


The Java statement to handle the delete uses a prepared statement:

delete from xxx where xxx_pk=?

The delete statement is then executed using addBatch() and 
executeBatch() (the latter every 100 deletes), and committed. Not 
using executeBatch makes no difference.


An example table where deletes are slow:

pzlnew=# \d cfs_file
Table public.cfs_file
Column | Type | Modifiers
--+-+---
cfsid | bigint | not null
cfs_date_created | timestamp without time zone | not null
cfs_name | character varying(512) | not null
cfs_cfaid | bigint |
cfs_cfdid | bigint |
Indexes:
cfs_file_pkey PRIMARY KEY, btree (cfsid)

with no FK constraints at all, and a table size of 940204 rows.

While deleting, postgres takes 100% CPU all of the time.


Inserts and updates are handled in exactly the same way, and these are 
a few orders of magnitude faster than the deletes.


I am running the DB on an Ubuntu 12.04 - 64bits machine with Postgres 
9.1, the machine is a fast machine with the database on ssd, ext4, 
with 16GB of RAM and a i7-3770 CPU @ 3.40GHz.


Anyone has any idea?

Thanks in advance,

Frits






Re: [PERFORM] pg 9.1 brings host machine down

2012-06-06 Thread Vitalii Tymchyshyn

Hello.

I've seen this already.
It looks like a cross join + sort. Badly configured ORM tools like 
Hibernate, with multiple one-to-many relationships fetched with the 'join' 
strategy, may produce such a result.
Unfortunately I don't know if it's possible to protect against such a case 
on the server side.


Best regards, Vitalii Tymchyshyn

06.06.12 15:05, Konstantin Mikhailov wrote:

I'm faced with a problem running postgres 9.1.3 which seems to
nobody else see before. Tried to search and only one relevant
post fond (about millions of files in pgsql_tmp).

Sympthoms:

Some postgres process size is getting abnormally big compared
to other postgres processes. Top shows the 'normal' pg processed
is about VIRT 120m, RES ~30m and SHR ~30m. That one
is about 6500m, 3.4g, 30m corresp. Total RAM avail - 8g.
When one more such a process appears the host going into
deep swap and pg restart can help only (actually the stop
won't even stop such a process - after shutdown it still alive
and can be only killed).

base/pgsql_tmp contains millions of files. In this situation stop
and dirty restart is possible - the normal startup is impossible
either. Read somewhere that it tries to delete (a millions
files) from that directory. I can't even imagine when it finish
the deletion so i'm simple move that folder outside the base
- then start can succeed.

on ubuntu 11.10,12.04 x64. cpu intel core Q9650 3GHz.
8G RAM.

Does anybody see that behaviour or maybe have some glue how to
handle it.

PS: the my preliminary conclusion: some sql is produces
a lot of files in the temporary table spaces - very quickly.
When sql is finished postgres tries to cleanup the folder
reading all contents of the folder and removing the files
one by one. It does the removal slow (watched the folder
by `find pgsql_tmp | wc -l') but process still consumes the
RAM. Next such sql will be a killer :(







Re: [PERFORM] Trouble with plan statistics for behaviour for query.

2012-06-01 Thread Vitalii Tymchyshyn
If I am correct, JDBC uses a named portal only on the 5th time you use a 
PreparedStatement (configurable). Before that it uses an unnamed one that 
should work as if you had embedded the value. So the solution is to 
recreate the PreparedStatement each time (you will still have no problems with 
SQL injection). Note that smart pools may detect this situation and 
reuse the PreparedStatement for the same query text internally; if so, try to 
switch this off.
In case you still have problems, I'd recommend asking on the postgresql-jdbc 
mailing list.
Also, I've heard that somewhere around 9.2 the PostgreSQL server may replan such 
cases each time.


Best regards, Vitalii Tymchyshyn

01.06.12 02:34, Trevor Campbell wrote:

Thanks Craig, that certainly leads down the right path.

The following is all done in pgAdmin3:

Using an actual value we I get the plan I expect
explain analyze select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, 
CI.ID, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, 
CI.NEWSTRING
from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID = 
CI.GROUPID where CG.ISSUEID=10006 order by CG.CREATED asc, CI.ID asc


Sort (cost=106.18..106.22 rows=13 width=434) (actual 
time=0.115..0.115 rows=12 loops=1)

 Sort Key: cg.created, ci.id
 Sort Method: quicksort Memory: 29kB
 - Nested Loop (cost=0.00..105.94 rows=13 width=434) (actual 
time=0.019..0.067 rows=12 loops=1)
 - Index Scan using chggroup_issue on changegroup cg 
(cost=0.00..19.73 rows=10 width=29) (actual time=0.009..0.013 rows=10 
loops=1)

 Index Cond: (issueid = 10006::numeric)
 - Index Scan using chgitem_chggrp on changeitem ci (cost=0.00..8.58 
rows=3 width=411) (actual time=0.004..0.005 rows=1 loops=10)

 Index Cond: (groupid = cg.id)
Total runtime: 0.153 ms

Using a prepared statement with a variable , I get a poor plan 
requiring a sequential scan

prepare t2(real) as
select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, 
CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID = 
CI.GROUPID where CG.ISSUEID=$1 order by CG.CREATED asc, CI.ID asc;


explain analyze execute t2 (10006);

Sort (cost=126448.89..126481.10 rows=12886 width=434) (actual 
time=1335.615..1335.616 rows=12 loops=1)

 Sort Key: cg.created, ci.id
 Sort Method: quicksort Memory: 29kB
 - Nested Loop (cost=0.00..125569.19 rows=12886 width=434) (actual 
time=0.046..1335.556 rows=12 loops=1)
 - Seq Scan on changegroup cg (cost=0.00..44709.26 rows=10001 
width=29) (actual time=0.026..1335.460 rows=10 loops=1)

 Filter: ((issueid)::double precision = $1)
 - Index Scan using chgitem_chggrp on changeitem ci (cost=0.00..8.05 
rows=3 width=411) (actual time=0.007..0.008 rows=1 loops=10)

 Index Cond: (groupid = cg.id)
Total runtime: 1335.669 ms

Using a prepared statement with a cast of the variable to the right 
type, I get the good plan back

prepare t2(real) as
select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, 
CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID = 
CI.GROUPID where CG.ISSUEID=cast($1 as numeric) order by CG.CREATED 
asc, CI.ID asc;


explain analyze execute t2 (10006);

Sort (cost=106.19..106.22 rows=13 width=434) (actual 
time=0.155..0.156 rows=12 loops=1)

 Sort Key: cg.created, ci.id
 Sort Method: quicksort Memory: 29kB
 - Nested Loop (cost=0.00..105.95 rows=13 width=434) (actual 
time=0.048..0.111 rows=12 loops=1)
 - Index Scan using chggroup_issue on changegroup cg 
(cost=0.00..19.73 rows=10 width=29) (actual time=0.031..0.042 rows=10 
loops=1)

 Index Cond: (issueid = ($1)::numeric)
 - Index Scan using chgitem_chggrp on changeitem ci (cost=0.00..8.58 
rows=3 width=411) (actual time=0.006..0.006 rows=1 loops=10)

 Index Cond: (groupid = cg.id)
Total runtime: 0.203 ms

Now the challenge is to get java/jdbc to get this done right. We make 
a big effort to ensure we always use prepared statements and variable 
bindings to help protect from SQL injection vulnerabilities.




On 01/06/12 09:08, Craig James wrote:

I use Perl, not JDBC, but this thread may be relevant to your problem.

http://postgresql.1045698.n5.nabble.com/Slow-statement-when-using-JDBC-td3368379.html 











Re: [PERFORM] Write workload is causing severe slowdown in Production

2012-03-22 Thread Vitalii Tymchyshyn

Check for the following messages in your log:
LOG: checkpoints are occurring too frequently (ZZZ seconds apart)
HINT: Consider increasing the configuration parameter checkpoint_segments.
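
A quick way to inspect the relevant settings from SQL (these work on 8.2 as well); raising checkpoint_segments in postgresql.conf is the usual first step:

show checkpoint_segments;   -- the default of 3 is usually far too low for a write-heavy load
show checkpoint_warning;    -- checkpoints closer together than this produce the LOG line above
show checkpoint_timeout;    -- the time-based checkpoint interval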

Best regards, Vitalii Tymchyshyn

22.03.12 09:27, Gnanakumar wrote:

Hi,

We're running a web-based application powered by PostgreSQL.  Recently,
we've developed a new separate Java-based standalone (daemon process)
threaded program that performs both read and write operations heavily on 2
huge tables.  One table has got 5.4 million records and other has 1.3
million records.  Moreover, more than one read and/or write operations may
be executing concurrently.

The issue that we're facing currently in our Production server is, whenever
this newly developed Java program is started/run, then immediately the
entire web application becomes very slow in response.  At this time, I could
also see from the output of  iostat -tx that %util is even crossing more
than 80%.  So, what I could infer here based on my knowledge is, this is
creating heavy IO traffic because of write operation.  Since it was entirely
slowing down web application, we've temporarily stopped running this
standalone application.

Meantime, I also read about checkpoint spikes could be a reason for slow
down in write workload database.  I'm also reading that starting in
PostgreSQL 8.3, we can get verbose logging of the checkpoint process by
turning on log_checkpoints.

My question is, how do I determine whether checkpoint occurrences are the
root cause of this slowdown in my case?  We're running PostgreSQL v8.2.22 on
CentOS5.2 having 35 GB RAM.  log_checkpoints is not available in
PostgreSQL v8.2.22.






Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-27 Thread Vitalii Tymchyshyn

27.01.12 11:06, Marti Raudsepp wrote:

On Fri, Jan 27, 2012 at 06:31, sridhar bamandlapally
sridhar@gmail.com  wrote:

--
| Id  | Operation | Name | Rows  | Bytes | Cost (%CPU)| Time |
--
|   0 | SELECT STATEMENT  |  |  7444K|   944M| 16077   (4)| 00:03:13 |
|   1 |  TABLE ACCESS FULL| EMP  |  7444K|   944M| 16077   (4)| 00:03:13 |
--

Sorry to take this off topic, but... Seriously, over 3 minutes to read
944 MB of data? That's less than 5 MB/s, what's wrong with your
database? :)
Actually, I'd ask how parallel CPUs would help a table sequential scan. Usually a 
sequential scan does not take a large amount of CPU time, so I see no point 
in parallelism here.




Re: [PERFORM] array_except -- Find elements that are not common to both arrays

2011-09-30 Thread Vitalii Tymchyshyn
Since you are using except and not except all, you are not looking at 
arrays with duplicates.

For this case the following function was the fastest for me:

create or replace function array_except2(anyarray,anyarray) returns
anyarray as $$
select ARRAY(
(
select r.elements
from(
(select 1,unnest($1))
union all
(select 2,unnest($2))
) as r (arr, elements)
group by 1
having min(arr)=max(arr)
))
$$ language sql strict immutable;
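
For example, a quick sanity check (element order in the result is not guaranteed):

select array_except2(array[1,2,3], array[2,3,4]);
-- returns {1,4}: the elements that appear in exactly one of the two arrays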

Best regards, Vitalii Tymchyshyn



Re: [PERFORM] PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Vitalii Tymchyshyn

17.09.11 23:01, Stefan Keller wrote:

* more... ?
What I miss from my DB2 UDB days are buffer pools. In PostgreSQL terms 
this would be a part of shared buffers dedicated to a relation or a set of 
relations. When you have a big DB (not fitting in memory) you also 
usually want some small tables/indexes to stay in memory, no matter what 
other load the DB has.

Complementary features are:
1) Relation preloading at startup - ensure these relations are in memory.
2) Per-buffer-pool (or per-relation) page costs - tell it that these 
indexes/tables ARE in memory


Best regards, Vitalii Tymchyshyn.



Re: [PERFORM] PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Vitalii Tymchyshyn

Hello.

I did read, and AFAIR sometimes responded to, these long discussions. The 
main point for me is that many DBAs don't want to have even more random 
plans, with PostgreSQL knowing what's in memory now and using this 
information directly at runtime. I also think this point is valid.
What I would like to have is to force some relations to be in memory by 
giving them a fixed part of shared buffers, and to tell PostgreSQL they are 
in memory (lowering page costs) to get fixed optimal plans.


Best regards, Vitalii Tymchyshyn.

19.09.11 14:57, Cédric Villemain wrote:

2011/9/19 Vitalii Tymchyshyntiv...@gmail.com:

17.09.11 23:01, Stefan Keller wrote:

* more... ?

What I miss from my DB2 UDB days are buffer pools. In PostgreSQL terms this
would be part of shared buffers dedicated to a relation or a set of
relations. When you have a big DB (not fitting in memory) you also usually
want some small tables/indexes be in memory, no matter what other load DB
has.
Complimentary features are:
1) Relations preloading at startup - ensure this relation are in memory.

you can use pgfincore extension to achieve that, for the OS cache. It
does not look interesting to do that for shared_buffers of postgresql
(the subject has been discussed and can be discussed again, please
check mailling list archieve first)


2) Per buffer pool (or relation) page costs - tell it that this
indexes/tables ARE in memory

you can use tablespace parameters (*_cost) for that, it has been
rejected for tables in the past.
I did propose something to start to work in this direction.
See [WIP] cache estimates, cache access cost in postgresql-hackers
mailling list.

This proposal let inform the planner of the table memory usage and
take that into account.





Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Vitalii Tymchyshyn

19.09.11 18:19, Robert Klemme wrote:

On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncuremmonc...@gmail.com  wrote:


Postgres's hash index implementation used to be pretty horrible -- it
stored the pre-hashed datum in the index which, while making it easier
to do certain things,  made it horribly slow, and, for all intents and
purposes, useless.  Somewhat recently,a lot of work was put in to fix
that -- the index now packs the hash code only which made it
competitive with btree and superior for larger keys.  However, certain
technical limitations like lack of WAL logging and uniqueness hold
hash indexing back from being used like it really should be.  In cases
where I really *do* need hash indexing, I do it in userland.

create table foo
(
  a_long_field text;
);
create index on foo(hash(a_long_field));

select * from foo where hash(a_long_field) = hash(some_value) and
a_long_field = some_value;

This technique works fine -- the main disadvantage is that enforcing
uniqueness is a PITA but since the standard index doesn't support it
either it's no great loss.  I also have the option of getting
'uniqueness' and being able to skip the equality operation if I
sacrifice some performance and choose a strong digest.  Until the hash
index issues are worked out, I submit that this remains the go-to
method to do this.

Is this approach (storing the hash code in a btree) really faster than
a regular btree index on a_long_field?  And if so, for which kind of
data and load?


Actually sometimes the field is [potentially] so long that you can't use a 
regular b-tree because it won't fit in the page. Say, it is of text type. 
If you create a regular index, you will actually limit the column value 
size to a few KB. I am using md5(text) indexes in this case, coupled with 
rather ugly queries (see above). Native support would be nice.
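
A rough sketch of that approach, using the foo table from the quoted message (the recheck on the raw column guards against md5 collisions):

create index foo_md5_idx on foo (md5(a_long_field));

select * from foo
 where md5(a_long_field) = md5('some value')
   and a_long_field = 'some value';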


Best regards, Vitalii Tymchyshyn.



Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Vitalii Tymchyshyn

19.09.11 18:19, Robert Klemme wrote:


I still haven't seen a solution to locking when a hash table needs
resizing.  All hashing algorithms I can think of at the moment would
require a lock on the whole beast during the resize which makes this
type of index impractical for certain loads (heavy updating).
Sorry for the second reply, I should not have started writing until I'd 
read all of your post. Anyway:
Do you need a read lock? I'd say readers could use the old copy of the hash 
table up until the moment the new, bigger copy is ready. This will simply 
look like the update has not started yet, which AFAIK is OK for MVCC.

Yep, all the writers will wait.

Another option could be to start a background build of the larger hash - for 
some time your performance will be degraded, since you are writing to two 
indexes instead of one and the second one is still being built, but I'd say a 
low-latency solution is possible here.


One more: I don't actually see why you can't have a rolling expansion of the 
hash table. I will try to describe it; correct me if I am wrong:
1) The algorithm I am talking about takes n bits from the hash code to form the 
hash table index. So, during expansion it will double the number of buckets.
2) Say, we are going from 2^n = n1 to 2^(n+1) = n2 = n1 * 2 buckets. 
Each new pair of buckets will take data from a single source bucket, 
depending on the value of the new hash bit used. E.g. if n were 2, we had 
4 buckets and the new table will have 8 buckets. Everything from old bucket 
#1 will go into new buckets #2 and #3, depending on the hash value.
3) So, we can have a counter of the number of buckets processed. Any 
operation on a lower-numbered bucket will go to the new set. Any 
operation on a higher-numbered bucket will go to the old set. Any 
operation on the bucket currently being converted will block until the conversion is 
done.


P.S. Sorry for a lot of possibly dumb thoughts, I don't know why I've 
got such a thought stream on this topic :)


Best regards, Vitalii Tymchyshyn.



Re: [PERFORM] cannot use multicolumn index

2011-09-14 Thread Vitalii Tymchyshyn

14.09.11 18:14, MirrorX wrote:

i think in my first post i provided most of these details but -
1) what i expect is to be able to understand why the index is not used and
if possibly to use it somehow, or recreate it in a better way
2) the table has 115 GB and about 700 milion rows
3) the result should be less than 10 millions rows
4) the index is a btree

i tried to disable seq_scan and the query plan was changed and used another
index and not the one i wanted.
You have a range check on both columns; this means that it has to scan each 
subtree that satisfies one criterion to check it against the other. Here index 
column order is significant. E.g. if you have a lot of rows with xid > 100 and xid 
is the first index column, it must check all (a lot of) the index subtrees for 
xid > 100.
Multicolumn indexes work best when the first columns are checked with = 
and only the last column with a range criterion.
You may still try to change the order of the columns in your index if this will 
give better selectivity on the first column.
Another option is multiple single-column indexes - postgres may merge 
such indexes at runtime with a bitmap scan (I don't remember since which version this 
feature is available).
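
An illustration with made-up names, assuming a query like "where aid = 5 and xid > 100" on table t:

create index t_aid_xid_idx on t (aid, xid);   -- equality column first, range column last: one narrow subtree
create index t_xid_aid_idx on t (xid, aid);   -- range column first: many subtrees must be visited

-- or two single-column indexes that the planner can combine with a bitmap AND:
create index t_aid_idx on t (aid);
create index t_xid_idx on t (xid);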


Best regards, Vitalii Tymchyshyn.




Re: [PERFORM] Rather large LA

2011-09-07 Thread Vitalii Tymchyshyn

Hello.

As it turned out to be iowait, I'd recommend trying to load at least 
some hot relations into the FS cache with dd on startup. With a lot of RAM 
on FreeBSD I sometimes even use this for long queries that require a lot 
of index scans.

This converts random IO into sequential IO, which is much, much faster.
You can try it even while your DB is starting - if it works you will see 
iowait drop and user time rise.
What I do on FreeBSD (as I don't have enough RAM to load the whole DB into 
RAM) is:

1) ktrace the backend process[es]. Linux seems to have a similar tool (strace).
2) Find the files that take a lot of long reads.
3) dd those files to /dev/null.

In this way you can find the hot files. As soon as you have them (or if you 
can afford to load everything), you can put the dd calls into startup scripts. Or 
I can imagine an automatic script that does such things for some time 
after startup.
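
To go from a hot relation to the file that dd should warm up, something like this can help (9.0 and later; the relation name is an example):

select pg_relation_filepath('my_hot_index');
-- returns a path such as base/16384/16423, relative to the data directory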


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Performance die when COPYing to table with bigint PK

2011-08-05 Thread Vitalii Tymchyshyn

05.08.11 11:44, Robert Ayrapetyan wrote:

Yes, you are right. Performance become even more awful.
Can some techniques from pg_bulkload be implemented in postgres core?
Current performance is not suitable for any enterprise-wide production system.

BTW: I was thinking this morning about indexes.
How about the following feature:
Implement a new index type that will have two zones - old & new. The new 
zone is of fixed, configurable size, say 100 pages (800 K).
Any search goes into both zones. So, as soon as the index is larger than 
800K, the search must be done twice.
As soon as the new zone hits its size limit, part (maybe only one?) of 
its pages is merged into the old zone. The merge is rolling - if the last 
merge stopped at entry X, the next merge will start at the entry right after X.


As for me, this should greatly help with the large-index insert problem:
1) Inserts into the new zone must be quick because it's small and hot in the cache.
2) During the merge, writes will be grouped because items with nearby keys (for a 
B-tree) or hashes (for a hash index) will go to a small subset of the old zone's 
pages. In the future, the merge could also be done by autovacuum in the background.
Yes, we get a dual index search, but the new zone will be hot, so this won't 
make it twice as costly.


Best regards, Vitalii Tymchyshyn




Re: [PERFORM] Performance die when COPYing to table with bigint PK

2011-08-04 Thread Vitalii Tymchyshyn

04.08.11 18:59, Kevin Grittner wrote:

Robert Ayrapetyanrobert.ayrapet...@comodo.com  wrote:

Kevin Grittnerkevin.gritt...@wicourts.gov  wrote:



[regarding tests which do show the problem]
tried same with 2 columns (bigint and int) - it didn't produced
such effect probably because data volume has critical effect.


Based on what you're showing, this is almost certainly just a matter
of pushing your volume of active data above the threshold of what
your cache holds, forcing it to do disk access rather than RAM
access for a significant portion of the reads.

-Kevin

Yep. Seems so. Plus the famous "you'd better insert the data first, then create the indexes".
On my database it takes twice the time for int8 as for int4 to insert 
data.
Also, it takes ~twice as long (2 hours) to add 200K rows to 200M rows 
as it does to build an index over 200M rows (1 hour).


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Performance die when COPYing to table with bigint PK

2011-08-02 Thread Vitalii Tymchyshyn

02.08.11 11:26, Robert Ayrapetyan wrote:

Seems this assumption is not right. Just created simple index on
bigint column - situation with huge performance
degradation repeated. Dropping this index solved COPY issues on the fly.
So I'm still convinced - this bug relates to FreeBSD 64-bit + UFS +
bigint column index
(some of these may be superfluous, but I have no resources to check on
different platforms with different filesystems).
Interesting. We also have FreeBSD x64 on UFS and are using bigint 
(bigserial) keys. It seems I will need to perform more tests here 
because I do see similar problems. I can certainly do a copy of the data with 
int4 keys and test the performance.
BTW: The thing we are going to try on the next upgrade is to change the UFS 
block size from 16K to 8K. The problem I saw is that with the default 
setting, UFS needs to read an additional 8K when postgresql writes its 
page (and for an index random writes can be vital). Unfortunately, such a 
change requires a partition reformat and I can't afford it for now.


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] insert

2011-08-01 Thread Vitalii Tymchyshyn

Hello.

Please note that in a multitasking environment you may have problems with 
your code. Two connections may check whether a given value is available and, if not (and 
both got an empty select result), try to insert it. One will succeed, 
the other will fail if you have a unique constraint on the category name (and 
you'd better have one).


Please note that select for update won't help you much, since it is a 
new record you are looking for, and the select doesn't return (and lock) it. I 
am using lock table tableName in SHARE ROW EXCLUSIVE mode in this case.


But then, if you have multiple lookup dictionaries, you need to ensure a 
strict order of locking or you will be getting deadlocks. As for me, I 
created a special application-side class to retrieve such values. If 
I can't find a value in the main connection with a simple select, I open a new 
connection, take the table lock, and check whether the value is in there. If it is 
not, I add the value and commit. This may produce orphaned dictionary 
entries (if the dictionary entry is committed and the main transaction is rolled 
back), but this is usually OK for dictionaries. At the same time I don't 
introduce hard locks into the main transaction and don't have to worry about 
deadlocks.
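
A rough sketch of that side-connection lookup (table and value are made up; SHARE ROW EXCLUSIVE conflicts with itself, so only one session runs this block at a time):

begin;
lock table category in share row exclusive mode;
insert into category (name)
select 'new category'
 where not exists (select 1 from category where name = 'new category');
commit;   -- the dictionary row stays even if the main transaction later rolls back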


Best regards, Vitalii Tymchyshyn




Re: [PERFORM] Performance die when COPYing to table with bigint PK

2011-08-01 Thread Vitalii Tymchyshyn

31.07.11 16:51, Robert Ayrapetyan wrote:

Hello.

I've found strange behavior of my pg installation (tested both 8.4 and
9.0 - they behave same) on FreeBSD platform.
In short - when some table have PK on bigint field - COPY to that
table from file becomes slower and slower as table grows. When table
reaches ~5GB - COPY of 100k records may take up to 20 mins. I've
experimented with all params in configs, moved indexes to separate hdd
etc - nothing made any improvement. However, once I'm dropping 64 bit
PK - COPY of 100k records passes in seconds. Interesting thing - same
table has other indexes, including composite ones, but none of them
include bigint fields, that's why I reached decision that bug
connected with indexes on bigint fields only.
I did see this behavior, but for me it occurs for UNIQUE indexes only 
(including the PK), independent of the field type.
You can check this by dropping the PK and creating it as a regular 
non-unique index.
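
A minimal sketch of that check, with made-up names:

alter table big_table drop constraint big_table_pkey;
create index big_table_id_idx on big_table (id);   -- same column, but no uniqueness check on insert
-- re-run the COPY and compare the timings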


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Long Running Update

2011-06-24 Thread Vitalii Tymchyshyn

24.06.11 14:16, Harry Mantheakis wrote:


 EXPLAIN the statement

Here is the EXPLAIN result:

--
QUERY PLAN
--
Hash Join (cost=2589312.08..16596998.47 rows=74558048 width=63)
Hash Cond: (table_A.id = table_B.id)
- Seq Scan on table_A(cost=0.00..1941825.05 rows=95612705 width=47)
- Hash (cost=1220472.48..1220472.48 rows=74558048 width=20)
- Seq Scan on table_B(cost=0.00..1220472.48 rows=74558048 width=20)
--

The documentation says the 'cost' numbers are 'units of disk page 
fetches'.


Do you, by any chance, have any notion of how many disk page fetches 
can be processed per second in practice - at least a rough idea?


IOW how do I convert - guesstimate! - these numbers into (plausible) 
time values?

No chance. These are virtual values for the planner only.
If I read it correctly, your query should go through two phases: build a hash 
map on one table, then update the second table using the map. Note that this is 
all valid only if you don't have any constraints (including foreign keys, 
on both sides) to check on any field of the updated table. If you have any, you'd 
better drop them.
Anyway, this is two seq. scans. For a long query I use a tool like 
ktrace (FreeBSD) to get the system read/write calls the backend is doing. Then 
with the catalog tables you can map file names to relations 
(tables/indexes) - see the sketch below. Then you can see which stage you are on and how fast 
it is going.
Note that partially cached tables are awful (in FreeBSD, dunno about 
Linux) for such a query - I suppose this is because instead of a 
sequential read, you get a lot of random reads that fool the prefetch 
logic. dd if=table_file of=/dev/null bs=8m helps me a lot. You can see 
if it helps: CPU time goes up.
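
For the mapping step, the number in a data file's name is the relfilenode, so a lookup like this (the number is an example) shows which table or index the backend is hitting:

select relname, relkind from pg_class where relfilenode = 16423;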


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Oracle v. Postgres 9.0 query performance

2011-06-08 Thread Vitalii Tymchyshyn

08.06.11 18:40, Tony Capobianco wrote:

pg_dw=# set enable_nestloop =0;
SET
Time: 0.165 ms
pg_dw=# explain CREATE TABLE ecr_opens with (FILLFACTOR=100)
pg_dw-# as
pg_dw-# select o.emailcampaignid, count(memberid) opencnt
pg_dw-#   from openactivity o,ecr_sents s
pg_dw-#  where s.emailcampaignid = o.emailcampaignid
pg_dw-#  group by o.emailcampaignid;
                                QUERY PLAN
-------------------------------------------------------------------------
 HashAggregate  (cost=4391163.81..4391288.05 rows=9939 width=12)
   ->  Hash Join  (cost=14.78..4344767.23 rows=9279316 width=12)
         Hash Cond: (o.emailcampaignid = s.emailcampaignid)
         ->  Seq Scan on openactivity o  (cost=0.00..3529930.67 rows=192540967 width=12)
         ->  Hash  (cost=8.79..8.79 rows=479 width=4)
               ->  Seq Scan on ecr_sents s  (cost=0.00..8.79 rows=479 width=4)

Yikes.  Two sequential scans.


Yep. Can you see any other options? Either you take each of the 479 records
and try to find matching records in the other table using an index (first
plan), or you read both tables fully (seq scan) and join them - second plan.
The first plan is better if your large table is clustered well enough on the
emailcampaignid field (479 index reads and 479 sequential table reads).
If it's not, those 479 table reads may turn into a lot of random reads.
BTW: maybe you have different data clustering in PostgreSQL and Oracle?
Or the data in Oracle may be hot in the caches?
Also, a sequential scan is not such a bad thing. It may be cheap enough to
read millions of records if they are not too wide. Please show select
pg_size_pretty(pg_relation_size('openactivity')); Have you tried to
EXPLAIN ANALYZE the second plan?
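One quick way to check the clustering is the standard pg_stats view (a sketch;
a correlation close to 1 or -1 means the table is well clustered on the column):

  SELECT attname, correlation
    FROM pg_stats
   WHERE tablename = 'openactivity'
     AND attname = 'emailcampaignid';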


Best regards, Vitalii Tymchyshyn




On Wed, 2011-06-08 at 11:33 -0400, Tom Lane wrote:

Tony Capobiancotcapobia...@prospectiv.com  writes:

pg_dw=# explain CREATE TABLE ecr_opens with (FILLFACTOR=100)
pg_dw-# as
pg_dw-# select o.emailcampaignid, count(memberid) opencnt
pg_dw-#   from openactivity o,ecr_sents s
pg_dw-#  where s.emailcampaignid = o.emailcampaignid
pg_dw-#  group by o.emailcampaignid;
  QUERY
PLAN
-
  GroupAggregate  (cost=0.00..1788988.05 rows=9939 width=12)
-   Nested Loop  (cost=0.00..1742467.24 rows=9279316 width=12)
  -   Index Scan using ecr_sents_ecid_idx on ecr_sents s
(cost=0.00..38.59 rows=479 width=4)
  -   Index Scan using openact_emcamp_idx on openactivity o
(cost=0.00..3395.49 rows=19372 width=12)
Index Cond: (o.emailcampaignid = s.emailcampaignid)
(5 rows)
Should this query be hashing the smaller table on Postgres rather than
using nested loops?

Yeah, seems like it.  Just for testing purposes, do set enable_nestloop
= 0 and see what plan you get then.





Re: [PERFORM] 8.4/9.0 simple query performance regression

2011-06-07 Thread Vitalii Tymchyshyn

07.06.11 00:45, Josh Berkus wrote:

All,

Just got this simple case off IRC today:

8.4.4
This plan completes in 100ms:
Filter: (NOT (hashed SubPlan 1))



9.0.2
This plan does not complete in 15 minutes or more:
Filter: (NOT (SubPlan 1))

Hashed is the key. Hashed subplans usually have much better performance.
You need to increase work_mem. I suppose it is at its default, since you
should not need too much memory for a hash of 70K integer values.
BTW: why does it want to materialize the result of a seq scan without a filter?
I can see no benefit (or are the rows narrower?)
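A minimal sketch of that experiment (the value is an arbitrary assumption;
pick one that fits your RAM):

  SET work_mem = '64MB';   -- session-level; plenty for a hash of ~70K integers
  -- then re-run the query and check that the plan shows "hashed SubPlan 1" again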


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move

2011-05-25 Thread Vitalii Tymchyshyn

24.05.11 21:48, Greg Smith wrote:


Bitmap heap scan: Here, the exact list of blocks to fetch is known in 
advance, they're random, and it's quite possible for the kernel to 
schedule them more efficiently than serial access of them can do. This 
was added as the effective_io_concurrency feature (it's the only thing 
that feature impacts), which so far is only proven to work on Linux. 
Any OS implementing the POSIX API used will also get this however; 
FreeBSD was the next likely candidate that might benefit when I last 
looked around.

FreeBSD unfortunately does not have the support :(
It has AIO, but does not have the call needed to enable this setting.
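For reference, on platforms where the call is available the feature is just a
GUC, e.g.:

  SET effective_io_concurrency = 4;   -- example value; affects bitmap heap scans only, 0 disables prefetching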

Best regards, Vitalii Tymchyshyn



Re: [PERFORM] reducing random_page_cost from 4 to 2 to force index scan

2011-05-24 Thread Vitalii Tymchyshyn

Hello.

As for me, all this hot stuff really looks uncertain and dynamic enough.
Two things that I could directly use right now (and they are needed as a
pair) are:
1) Per-table/index/database buffer pools (split shared buffers into parts,
allow specifying which index/table/database goes where)

2) Per-table/index cost settings

If I had these, I could allocate specific buffer pools for the tables/indexes
that MUST be hot in memory and set low costs for those specific tables.
P.S. A third thing, great to have as a companion to these two, is a load on
startup flag to automatically populate the buffer pools with a fast sequential
read, but this can be easily emulated with a statement.
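For example, a plain full read right after startup would do (table name
hypothetical):

  SELECT count(*) FROM must_be_hot;   -- fast sequential read that pulls the whole table into cache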


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Postgres refusing to use 1 core

2011-05-12 Thread Vitalii Tymchyshyn

12.05.11 06:18, Aren Cambre wrote:


 Using one thread, the app can do about 111 rows per second, and it's
 only exercising 1.5 of 8 CPU cores while doing this. 12,000,000
rows /
 111 rows per second ~= 30 hours.

I don't know how I missed that. You ARE maxing out one cpu core, so
you're quite right that you need more threads unless you can make your
single worker more efficient.


And the problem is my app already has between 20 and 30 threads. 
Something about C#'s PLINQ may not be working as intended...


Have you checked that you are really doing the fetch and the processing in
parallel? Dunno about C#, but in Java you have to make specific
settings (e.g. setFetchSize) or the driver will fetch all the data when the
query is run. Check the time needed to fetch the first row from the query.


Best regards, Vitalii Tymchyshyn


Re: [PERFORM] Shouldn't we have a way to avoid risky plans?

2011-03-25 Thread Vitalii Tymchyshyn

24.03.11 20:41, Merlin Moncure wrote:

2011/3/24 Віталій Тимчишинtiv...@gmail.com:


This can be GUC-controllable. Like plan_safety=0..1 with low default value.
This can influence costs of plans where cost changes dramatically with small
table changes and/or statistics is uncertain. Also this can be used as
direct hint for such dangerous queries by changing GUC for session/single
query.

ISTM if you add statistics miss and 'risk margin' to the things the
planner would have to consider while generating a plan, you are
greatly increasing the number of plan paths that would have to be
considered for any non trivial query.
Why so? I simply change the cost estimation functions. This won't change
the number of paths.


Best regards, Vitalii Tymchyshyn.



Re: [PERFORM] Shouldn't we have a way to avoid risky plans?

2011-03-25 Thread Vitalii Tymchyshyn

25.03.11 16:12, Tom Lane wrote:

Vitalii Tymchyshyntiv...@gmail.com  writes:


Why so? I simply change the cost estimation functions. This won't change
the number of paths.

If you have multiple figures of merit, that means you have to keep more
paths, with consequent slowdown when it comes to choosing which path to
use at higher join levels.

As an example, we used to keep only the paths with best total cost.
When we started to optimize LIMIT, we had to keep around paths with best
startup cost too, in case that made for the best combined result at a
higher join level.  If you're going to consider risk while choosing
paths, that means you'll start keeping paths you would have discarded
before, while not necessarily getting rid of any other paths.  The only
way to avoid that would be to have a completely brain-dead notion of
risk that wasn't affected by how the path is used at a higher join
level, and I'm pretty sure that that wouldn't solve anybody's problem.

Any significant expansion of the planner's fundamental cost model *will*
make it slower.  By a lot.  Rather than going into this with fantasies
of "it won't cost anything", you should be worrying about how to keep
the speed penalty to factor-of-two rather than factor-of-ten.
But I am not talking about a model change, it's more like a formula change.
Introducing LIMIT added one variable through which the outer plan could
influence inner plan selection.
I am talking simply about the cost calculation for a given node. Now the cost
is based on the statistical expected value; the proposal is (something like)
to take the maximum cost over an n% probability range near the expected value.
This, of course, will make calculations slower, but won't add any degrees
of freedom to them.


Best regards, Vitalii Tymchyshyn





Re: [PERFORM] Reason of Slowness of query

2011-03-23 Thread Vitalii Tymchyshyn

23.03.11 08:28, Adarsh Sharma wrote:

I perform a join query on it as :

explain analyze select distinct(p.crawled_page_id) from page_content
p, clause2 c where p.crawled_page_id != c.source_id;

Your query is wrong. This query will return every crawled_page_id if
clause2 has more than one source_id. This is because the DB will be able to
find a clause with a source_id different from crawled_page_id. You need to
use NOT EXISTS or NOT IN.


Best regards, Vitalii Tymchyshyn.


Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Vitalii Tymchyshyn

23.03.11 11:17, Adarsh Sharma wrote:


I think it is very much faster, but I don't understand the query :

explain select distinct(b) from t1,t2 where t1.b < t2.d union all
select distinct(b) from t1,t2 where t1.b > t2.d;

I don't understand it either. What are you trying to get? Is it
select distinct(b) from t1 where b > (select min(d) from t2) or b <
(select max(d) from t2)

?

Can you explain in words, not SQL, what you expect to retrieve?

Best regards, Vitalii Tymchyshyn


Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Vitalii Tymchyshyn

23.03.11 12:10, Adarsh Sharma wrote:
I just want to retrieve those ids from page_content which do not have
any entry in the clause2 table.



Then
select distinct(p.crawled_page_id) from page_content p
 where NOT EXISTS (select 1 from  clause2 c where c.source_id = 
p.crawled_page_id);

is the correct query.

Best regards, Vitalii Tymchyshyn.


Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Vitalii Tymchyshyn

23.03.11 12:19, Adarsh Sharma wrote:

Vitalii Tymchyshyn wrote:

23.03.11 12:10, Adarsh Sharma wrote:
I just want to retrieve that id 's from page_content which do not 
have any entry in clause2 table.



Then
select distinct(p.crawled_page_id) from page_content p
 where NOT EXISTS (select 1 from  clause2 c where c.source_id = 
p.crawled_page_id);

is correct query.



I can't understand how select 1 from clause2 c where c.source_id =
p.crawled_page_id works too, but I get my output.


What is the significance of 1 here?
No significance. You can put anything there, e.g. *. It is simply an arbitrary
constant. EXISTS checks whether there were any rows; it does not matter which
columns are there or what is in those columns.


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Vitalii Tymchyshyn

23.03.11 13:21, Adarsh Sharma wrote:

Thank you all for your nice support.

Let me conclude the results; the results below were obtained after finding
the needed queries :


First Option :

pdc_uima=# explain analyze select distinct(p.crawled_page_id)
pdc_uima-# from page_content p left join clause2 c on (p.crawled_page_id =
pdc_uima(# c.source_id) where (c.source_id is null);

                                 QUERY PLAN
-------------------------------------------------------------------------------
 HashAggregate  (cost=100278.16..104104.75 rows=382659 width=8) (actual time=87927.000..87930.084 rows=72 loops=1)
   ->  Nested Loop Anti Join  (cost=0.00..99320.46 rows=383079 width=8) (actual time=0.191..87926.546 rows=74 loops=1)
         ->  Seq Scan on page_content p  (cost=0.00..87132.17 rows=428817 width=8) (actual time=0.027..528.978 rows=428467 loops=1)
         ->  Index Scan using idx_clause2_source_id on clause2 c  (cost=0.00..18.18 rows=781 width=4) (actual time=0.202..0.202 rows=1 loops=428467)
               Index Cond: (p.crawled_page_id = c.source_id)
 Total runtime: 87933.882 ms :-(
(6 rows)

Second Option :

pdc_uima=# explain analyze select distinct(p.crawled_page_id) from page_content p
pdc_uima-#  where NOT EXISTS (select 1 from  clause2 c where c.source_id = p.crawled_page_id);

                                 QUERY PLAN
-------------------------------------------------------------------------------
 HashAggregate  (cost=100278.16..104104.75 rows=382659 width=8) (actual time=7047.259..7050.261 rows=72 loops=1)
   ->  Nested Loop Anti Join  (cost=0.00..99320.46 rows=383079 width=8) (actual time=0.039..7046.826 rows=74 loops=1)
         ->  Seq Scan on page_content p  (cost=0.00..87132.17 rows=428817 width=8) (actual time=0.008..388.976 rows=428467 loops=1)
         ->  Index Scan using idx_clause2_source_id on clause2 c  (cost=0.00..18.18 rows=781 width=4) (actual time=0.013..0.013 rows=1 loops=428467)
               Index Cond: (c.source_id = p.crawled_page_id)
 Total runtime: 7054.074 ms :-)
(6 rows)



Actually the plans are equal, so I suppose it depends on which one was run
first :). The slow query operates with data mostly on disk, while the fast one
works with data in memory.


Best regards, Vitalii Tymchyshyn


Re: [PERFORM] Reason of Slowness of query

2011-03-23 Thread Vitalii Tymchyshyn

23.03.11 09:30, Adarsh Sharma wrote:

Thanks Chetan, here is the output of your updated query :

explain select distinct(p.crawled_page_id) from page_content p where
NOT EXISTS (select 1 from clause2 c where c.source_id = p.crawled_page_id);

                                 QUERY PLAN
-------------------------------------------------------------------------------
 HashAggregate  (cost=1516749.47..1520576.06 rows=382659 width=8)
   ->  Hash Anti Join  (cost=1294152.41..1515791.80 rows=383071 width=8)
         Hash Cond: (p.crawled_page_id = c.source_id)
         ->  Seq Scan on page_content p  (cost=0.00..87132.17 rows=428817 width=8)
         ->  Hash  (cost=771182.96..771182.96 rows=31876196 width=4)
               ->  Seq Scan on clause2 c  (cost=0.00..771182.96 rows=31876196 width=4)
(6 rows)

And my explain analyze output is :

                                 QUERY PLAN
-------------------------------------------------------------------------------
 HashAggregate  (cost=1516749.47..1520576.06 rows=382659 width=8) (actual time=5.181..56669.270 rows=72 loops=1)
   ->  Hash Anti Join  (cost=1294152.41..1515791.80 rows=383071 width=8) (actual time=45740.789..56665.816 rows=74 loops=1)
         Hash Cond: (p.crawled_page_id = c.source_id)
         ->  Seq Scan on page_content p  (cost=0.00..87132.17 rows=428817 width=8) (actual time=0.012..715.915 rows=428467 loops=1)
         ->  Hash  (cost=771182.96..771182.96 rows=31876196 width=4) (actual time=45310.524..45310.524 rows=31853083 loops=1)
               ->  Seq Scan on clause2 c  (cost=0.00..771182.96 rows=31876196 width=4) (actual time=0.055..23408.884 rows=31853083 loops=1)
 Total runtime: 56687.660 ms
(7 rows)

But is there any option to tune it further? And one more thing: the
output rows vary from 6 to 7.

You need an index on source_id to prevent the seq scan, like this:
CREATE INDEX idx_clause2_source_id
  ON clause2
  (source_id);

Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Disabling nested loops - worst case performance

2011-03-18 Thread Vitalii Tymchyshyn

18.03.11 09:15, Anssi Kääriäinen wrote:

Hello list,

I am working on a Entity-Attribute-Value (EAV) database using 
PostgreSQL 8.4.7. The basic problem is that when joining multiple 
times different entities the planner thinks that there is vastly less 
rows to join than there is in reality and decides to use multiple 
nested loops for the join chain. This results in queries where when 
nested loops are enabled, query time is somewhere around 35 seconds, 
but with nested loops disabled, the performance is somewhere around 
100ms. I don't think there is much hope for getting better statistics, 
as EAV is just not statistics friendly. The values of an attribute 
depend on the type of the attribute, and different entities have 
different attributes defined. The planner has no idea of these 
correlations.


Hello.

If your queries work on a single attribute, you can try adding partial
indexes for different attributes. Note that in this case parameterized
statements may prevent index usage, so also check with the attribute id inlined.
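For example (hypothetical EAV table and attribute id):

  CREATE INDEX entity_value_attr_42_idx
      ON entity_value (value)
   WHERE attribute_id = 42;   -- one partial index per frequently queried attribute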


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Why we don't want hints Was: Slow count(*) again...

2011-02-11 Thread Vitalii Tymchyshyn

11.02.11 11:29, Tobias Brox wrote:

2011/2/11 Віталій Тимчишинtiv...@gmail.com:

If the list is hard-coded, you can create partial index  on
account_transaction(account_id, created desc) where trans_type_id in ( ...
long, hard-coded list ...)

My idea as well, though it looks ugly and it would be a maintenance
head-ache (upgrading the index as new transaction types are added
would mean costly write locks on the table,

Create the new one concurrently.
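That is, build the replacement without blocking writers, then drop the old one
(the names and the type list below are hypothetical):

  CREATE INDEX CONCURRENTLY account_transaction_hot_types_new_idx
      ON account_transaction (account_id, created DESC)
   WHERE trans_type_id IN (1, 2, 3, 7);   -- the extended hard-coded list
  DROP INDEX account_transaction_hot_types_old_idx;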

  and we can't rely on
manual processes to get it right ... we might need to set up scripts
to either upgrade the index or alert us if the index needs upgrading).

Yep. Another option could be to add query rewrite as

select  * from (
select * from account_transaction where trans_type_id =type1 and 
account_id=? order by created desc limit 25 union all
select * from account_transaction where trans_type_id =type2 and 
account_id=? order by created desc limit 25 union all

...
union all
select * from account_transaction where trans_type_id =typeN and 
account_id=? order by created desc limit 25

) a
order by created desc limit 25

This will allow the three-column index to be used the way it can be used for
such a query. Yet if N is large, the query will look ugly. And I am not sure
the optimizer is smart enough not to fetch 25*N rows.



Best regards, Vitalii Tymchyshyn




Re: [PERFORM] Bad query plan when the wrong data type is used

2011-02-09 Thread Vitalii Tymchyshyn

09.02.11 01:14, Dave Crooke wrote:
You will get the same behaviour from any database product where the 
query as written requires type coercion - the coercion has to go in 
the direction of the wider type. I have seen the exact same scenario 
with Oracle, and I view it as a problem with the way the query is 
written, not with the database server.


Whoever coded the application which is making this query presumably
knows that the visa.id field is an integer type in
the schema they designed, so why are they passing a float? Convert the
4.0 to 4 on the application side instead, it's one function call or cast.
Actually the problem may be in the layers, and the problem may not even be
noticed until it's late enough. As far as I remember from this list,
there are problems with a column being integer and the parameter prepared as
bigint or numeric. Same for numeric vs double vs float.

As for me it would be great for the optimizer to consider the next rewrites:
1) val1::narrow = val2::wide as (val1::narrow = val2::narrow and
val2::narrow = val2::wide)
2) val1::narrow < val2::wide as (val1::narrow <= val2::narrow and
val1::wide < val2::wide)
3) val1::narrow > val2::wide as (val1::narrow + 1 > val2::narrow and
val1::wide > val2::wide)

Of course it should use the additional check only if this allows an index to
be used. Surely, this is not an easy thing to implement, but as for me,
similar questions are raised quite often on this list.
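For completeness, the shape of the original problem looks like this (a sketch;
visa.id is the integer column discussed above):

  EXPLAIN SELECT * FROM visa WHERE id = 4.0;  -- literal is numeric, id is cast to numeric, the index on id cannot be used
  EXPLAIN SELECT * FROM visa WHERE id = 4;    -- literal matches the column type, an index scan is possible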


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] getting the most of out multi-core systems for repeated complex SELECT statements

2011-02-07 Thread Vitalii Tymchyshyn

Hi, all

My small thoughts about parallelizing a single query.
AFAIK in the cases where it is needed, there is usually one single
operation that takes a lot of CPU, e.g. hashing or sorting. And these are
usually tasks that have well-known algorithms to parallelize.
The main problem, as for me, is thread safety. First of all, the operations
that are going to be parallelized must be thread safe. Then the functions
and procedures they call must be thread safe too. So, a marker for a
procedure must be introduced, and all standard ones should be
checked/fixed for parallel processing with the marker set.
Then, one should not forget optimizer checks for when to introduce
parallelism. How should it be accounted for in the query plan? Should it
influence optimizer decisions (should it count CPU time or wall time when
optimizing the query plan)?
Or can it simply be used by an operation when it can see it will benefit
from it?


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] How to best use 32 15k.7 300GB drives?

2011-02-04 Thread Vitalii Tymchyshyn

03.02.11 20:42, Robert Haas wrote:

2011/1/30 Віталій Тимчишинtiv...@gmail.com:

I was thinking if a table file could be deleted if it has no single live
row. And if this could be done by vacuum. In this case vacuum on table that
was fully updated recently could be almost as good as cluster - any scan
would skip such non-existing files really fast. Also almost no disk space
would be wasted.

VACUUM actually already does something along these lines.  If there
are 1 or any larger number of entirely-free pages at the end of a
table, VACUUM will truncate them away.  In the degenerate case where
ALL pages are entirely-free, this results in zeroing out the file.

The problem with this is that it rarely does much.  Consider a table
with 1,000,000 pages, 50% of which contain live rows.  On average, how
many pages will this algorithm truncate away?  Answer: if the pages
containing live rows are randomly distributed, approximately one.
Yes, but take into account operations on tables that are (for different
reasons) clustered, like removing archived data (yes, I know this is best
done with partitioning, but one must still get to the point where he will
decide to use partitioning :) ).

Your idea of having a set of heaps rather than a single heap is an
interesting one, but it's pretty much catering to the very specific
case of a full-table update.  I think the code changes needed would be
far too invasive to seriously contemplate doing it just for that one
case - although it is an important case that I would like to see us
improve.
Why do you expect such invasive code changes? I know little about
the postgresql code layering, but what I propose (with changing delete to
truncate) is:

1) Leave tuple addressing as it is now
2) Allow truncated files, treating the non-existing part as if it contained
unused tuples

3) Make vacuum truncate the file if it has unused tuples at the end.

The only (relatively) tricky thing I can see is synchronizing truncation
with a parallel ongoing scan.


Best regards, Vitalii Tymchyshyn





Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-04 Thread Vitalii Tymchyshyn

04.02.11 16:33, Kenneth Marshall wrote:


In addition, the streaming ANALYZE can provide better statistics at
any time during the load and it will be complete immediately. As far
as passing the entire table through the ANALYZE process, a simple
counter can be used to only send the required samples based on the
statistics target. Where this would seem to help the most is in
temporary tables which currently do not work with autovacuum but it
would streamline their use for more complicated queries that need
an analyze to perform well.

Actually, for me the main con of streaming ANALYZE is that it adds a
significant CPU burden to an already not-too-fast load process, especially
if it's automatically done for any load operation performed (and I can't
see how it could be enabled only above some threshold).
And you can't start it after some threshold of data has passed by, since you
may lose significant information (like minimal values).


Best regards, Vitalii Tymchyshyn



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-03 Thread Vitalii Tymchyshyn

02.02.11 20:32, Robert Haas wrote:


Yeah.  Any kind of bulk load into an empty table can be a problem,
even if it's not temporary.  When you load a bunch of data and then
immediately plan a query against it, autoanalyze hasn't had a chance
to do its thing yet, so sometimes you get a lousy plan.


Maybe introducing something like an 'AutoAnalyze' threshold will help? I
mean that any insert/update/delete statement that changes more than x%
of a table (and no fewer than y records) must do an analyze right after it
has finished.

Defaults like x=50, y=1 should be quite good as for me.

Best regards, Vitalii Tymchyshyn



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-03 Thread Vitalii Tymchyshyn

03.02.11 17:31, Robert Haas wrote:



Maybe introducing something like an 'AutoAnalyze' threshold will help? I mean
that any insert/update/delete statement that changes more than x% of a table
(and no fewer than y records) must do an analyze right after it has finished.
Defaults like x=50, y=1 should be quite good as for me.

That would actually be a pessimization for many real world cases.  Consider:

COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
SELECT

If all the copies are about the same size and large, this will make it:

COPY
ANALYZE
COPY
ANALYZE
COPY
COPY
ANALYZE
COPY
COPY
COPY
COPY
ANALYZE
COPY
COPY
COPY
COPY
COPY
SELECT

instead of

COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
COPY
ANALYZE (manual, if one is clever enough)
SELECT

So yes, this will add 3 more analyzes, but
1) Analyze is pretty cheap compared to large data loading. I'd say this
would add a few percent of burden. And NOT doing analyze manually before the
select can raise the select costs by orders of magnitude.
2) How often in the real world is a single table loaded in many COPY
statements? (I don't say it's not often, I really don't know). At least
for restore it is not the case, is it?
3) The default thresholds are a thing to discuss. You can make x=90 or x=200
(the latter will make it run only for massive load/insert operations). You
can even make it disabled by default for people to test. Or enable it by
default for temp tables only (and have two sets of thresholds)
4) As with most other settings, this threshold can be changed on up to a
per-query basis.


P.S. I would also like to have an index analyze as part of any create index
process.


Best regards, Vitalii Tymchyshyn




Re: [PERFORM] queries with lots of UNIONed relations

2011-01-14 Thread Vitalii Tymchyshyn

14.01.11 00:26, Tom Lane wrote:

Robert Haasrobertmh...@gmail.com  writes:

On Thu, Jan 13, 2011 at 3:12 PM, Jon Nelsonjnelson+pg...@jamponi.net  wrote:

I still think that having UNION do de-duplication of each contributory
relation is a beneficial thing to consider -- especially if postgresql
thinks the uniqueness is not very high.

This might be worth a TODO.

I don't believe there is any case where hashing each individual relation
is a win compared to hashing them all together.  If the optimizer were
smart enough to be considering the situation as a whole, it would always
do the latter.


How about cases where the individual relations are already sorted? That would
mean they can be deduplicated quickly and in a streaming manner. Even a
partial sort order may help, because you only need to deduplicate the
groups with equal sorted fields, which takes much less memory and is much
more streaming-friendly. And if all the individual deduplications are
streaming and sorted the same way, you can simply do a merge on top.


Best regards, Vitalii Tymchyshyn.




Re: [PERFORM] Performance under contention

2010-11-24 Thread Vitalii Tymchyshyn

24.11.10 02:11, Craig Ringer wrote:

On 11/22/2010 11:38 PM, Ivan Voras wrote:

On 11/22/10 16:26, Kevin Grittner wrote:

Ivan Vorasivo...@freebsd.org wrote:

On 11/22/10 02:47, Kevin Grittner wrote:

Ivan Voras wrote:


After 16 clients (which is still good since there are only 12
real cores in the system), the performance drops sharply


Yet another data point to confirm the importance of connection
pooling. :-)


I agree, connection pooling will get rid of the symptom. But not
the underlying problem. I'm not saying that having 1000s of
connections to the database is a particularly good design, only
that there shouldn't be a sharp decline in performance when it
does happen. Ideally, the performance should remain the same as it
was at its peek.


Well, I suggested that we add an admission control[1] mechanism,


It looks like a hack (and one which is already implemented by connection
pool software); the underlying problem should be addressed.


My (poor) understanding is that addressing the underlying problem 
would require a massive restructure of postgresql to separate 
connection and session state from executor and backend. Idle 
connections wouldn't require a backend to sit around unused but 
participating in all-backends synchronization and signalling. Active 
connections over a configured maximum concurrency limit would queue 
for access to a backend rather than fighting it out for resources at 
the OS level.


The trouble is that this would be an *enormous* rewrite of the 
codebase, and would still only solve part of the problem. See the 
prior discussion on in-server connection pooling and admission control.

Hello.

IMHO the main problem is not a backend sitting and doing nothing, but
multiple backends trying to do their work. So, as for me, the simplest
option that would make most people happy would be to have a limit
(waitable semaphore) on backends actively executing a query. Such a
limit could even be detected automatically based on the number of CPUs
(simple) and spindles (not sure if simple, but some default could be
used). An idle (or waiting for a lock) backend consumes little resources.
If one wants to reduce resource usage for such backends, he can
introduce external pooling, but such a simple limit would make me happy
(e.g. having max_active_connections=1000, max_active_queries=20).
The main Q here is how much resources a backend that is waiting for a
lock can take. Is locking done at the query start? Or may it go into a wait
after having consumed much of work_mem? In the second case the limit
won't be a work_mem limit, but it will still prevent much contention.


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Vitalii Tymchyshyn

I'd say there are two Qs here:

1) Modify costs based on information about how much of the table is in
cache. It would be great if this could be done, but I'd prefer to have it
as admin knobs (because of plan stability). Maybe both admin and automatic
ways can be followed, with some parallel (disableable) process modifying the
knobs on the admin's behalf. In this case different strategies for
automatically modifying the knobs can be applied.


2) Modify costs for retrieving part of a table. Then you need to define
"part". The current ways are partitioning and partial indexes. Something
similar to a partial index may be created, that has only a where clause and
no data, but has statistics and knobs (and maybe a personal bufferspace,
if they are introduced). I don't like gathering data about the last X
percent or the like, because it works only with clustering and it's hard for
the optimizer to decide whether it will be enough to scan only those percents
for a given query.


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Vitalii Tymchyshyn

12.11.10 12:56, Cédric Villemain wrote:

I supposed it was an answer to my mail but not sure... please keep
CC'ed people, it is easier to follow threads (at least for me)
   

OK

2010/11/12 Vitalii Tymchyshyntiv...@gmail.com:
   

I'd say there are two Qs here:

1) Modify costs based on information on how much of the table is in cache.
It would be great  if this can be done, but I'd prefer to have it as admin
knobs (because of plan stability). May be both admin and automatic ways can
be followed with some parallel (disableable) process modify knobs on admin
behalf. In this case different strategies to automatically modify knobs can
be applied.
 

OS cache is usually stable enough to keep your plans stable too, I think.

Not if it is on the edge. There are always edge cases where data fluctuates
near some threshold.
   

2) Modify costs for part of table retrieval. Then you need to define part.
Current ways are partitioning and partial indexes. Some similar to partial
index thing may be created, that has only where clause and no data. But
has statistics and knobs (and may be personal bufferspace if they are
introduced). I don't like to gather data about last X percents or like,
because it works only in clustering and it's hard for optimizer to decide if
it will be enough to scan only this percents for given query.
 

Modifying random_page_cost and sequential_page_cost thanks to
statistics about cached blocks can be improved if we know the
distribution.

It does not mean: we know we have the last 15% in cache, and we are going
to request those 15%.
   


You mean *_cost for the whole table, don't you? That is case (1) for me.
Case (2) is when different cost values are selected based on what
portion of the table is requested in the query. E.g. when we have data for
the whole day in one table, the data for the last hour is cached and all the
other data is not. The optimizer may then use different *_cost values for a
query that requires all the data and for a query that requires only the last
hour's data. But, as I've said, that is a much more complex task than (1).


Best regards, Vitalii Tymchyshyn




Re: [PERFORM] MVCC performance issue

2010-11-12 Thread Vitalii Tymchyshyn

12.11.10 15:47, Kyriacos Kyriacou wrote:

PROBLEM DECRIPTION
--
As an example, consider updating the live balance
of a customer for each phone call where the entire customer record has
to be duplicated again and again upon each call just for modifying a
numeric value!
   
Have you considered splitting the customer record into two tables, one with
mostly read-only data and one with the data that is updated often? Such a 1-1
relationship can make a huge difference to performance in your case. You
can even try to simulate the old schema by using an updatable view.
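A rough sketch of such a split (all names hypothetical): the hot balance
column moves into a narrow table that takes the frequent updates, and a view
can present the old shape to existing code (making it writable would need
rules or triggers, not shown):

  CREATE TABLE customer_static (
      customer_id bigint PRIMARY KEY,
      name        text,
      address     text
  );

  CREATE TABLE customer_balance (
      customer_id bigint PRIMARY KEY REFERENCES customer_static,
      balance     numeric(12,2) NOT NULL
  );

  -- read-only compatibility view with the old column layout
  CREATE VIEW customer AS
  SELECT s.customer_id, s.name, s.address, b.balance
    FROM customer_static s
    JOIN customer_balance b USING (customer_id);

  -- the per-call update now rewrites only the narrow row
  UPDATE customer_balance SET balance = balance - 0.05 WHERE customer_id = 42;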


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-04 Thread Vitalii Tymchyshyn

04.11.10 16:31, Nick Matheson wrote:

Heikki-


Try COPY, ie. COPY bulk_performance.counts TO STDOUT BINARY.

Thanks for the suggestion. A preliminary test shows an improvement 
closer to our expected 35 MB/s.


Are you familiar with any Java libraries for decoding the COPY format? 
The spec is clear and we could clearly write our own, but figured I 
would ask. ;)
The JDBC driver has some COPY support, but I don't remember the details. You'd
better ask on the JDBC list.


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Select count(*), the sequel

2010-10-18 Thread Vitalii Tymchyshyn

16.10.10 19:51, Mladen Gogala wrote:
There was some doubt as to the speed of doing select count(*) in
PostgreSQL and Oracle.
To that end, I copied most of the Oracle table I used before
to Postgres. Although the copy
wasn't complete, the resulting table is already significantly larger
than the table it was copied from. The result still shows that Oracle
is significantly faster:


Hello.

Did you vacuum the PostgreSQL DB before the count(*)? I ask this because
(unless the table was created and loaded in the same transaction) on the
first scan PostgreSQL has to write hint bits to the whole table. A second
scan may be way faster.
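A sketch of that check (table name hypothetical):

  VACUUM VERBOSE big_copy;         -- first pass writes the hint bits, so later scans are read-only
  SELECT count(*) FROM big_copy;   -- compare this timing with the very first count(*)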


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Vitalii Tymchyshyn

12.10.10 14:44, Craig Ringer wrote:



in the case where you are doing a count(*) where query and the where is
on an indexed column, could the search just look at the index + the
visibility mapping rather than doing an sequential search through the
table?


Nope, because the visibility map, which is IIRC only one bit per page, 
doesn't record how many tuples there are on the page, or enough 
information about them to determine how many of them are visible to 
the current transaction*.
I'd say it can tell you that you need not recheck a given tuple, can't it?
You still have to count all the index tuples and recheck the ones that are
uncertain. Does it work this way? This can help a lot for wide tuples
in the table, but with a narrow index and mostly read-only data.


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Vitalii Tymchyshyn

12.10.10 21:58, Tom Lane wrote:


I'm less than convinced that that approach will result in a significant
win.  It's certainly not going to do anything to convert COUNT(*) into
an O(1) operation, which frankly is what the complainants are expecting.
There's basically no hope of solving the PR problem without somehow
turning COUNT(*) into a materialized-view reference.  We've discussed
that in the past, and know how to do it in principle, but the complexity
and distributed overhead are daunting.

   

I've thought about aggregate indexes, something like
create index index_name on table_name(count(*) group by column1, column2);
OR
create index index_name on table_name(count(*));
for a table-wide count.

To make it usable one would need:
1) Allow a third aggregate function, SMERGE, that can merge one aggregate
state into another
2) The index should be a regular index (e.g. btree) on column1, column2
that for each pair has the list of pages its data may belong to (in
past/currently running transactions), and an aggregate state for each page
that was frozen previously.
When the index is used, it can use the precalculated values for pages with
all tuples vacuumed (I suspect this is information from the visibility map)
and should do the regular calculation for all non-frozen pages, with
visibility checks and everything else that's needed.
When vacuum processes a page, it should (in a sync or async way)
calculate the aggregate values for the page.


IMHO such indexes would make materialized views/triggers/high-level
caches unneeded in most cases.


Best regards, Vitalii Tymchyshyn




Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Vitalii Tymchyshyn

11.10.10 20:46, Craig James wrote:


First of all, it's not true. There are plenty of applications that 
need an exact answer. Second, even if it is only 1%, that means it's 
1% of the queries, not 1% of people. Sooner or later a large fraction 
of developers will run into this. It's probably been the most-asked 
question I've seen on this forum in the four years I've been here. 
It's a real problem, and it needs a real solution.


I know it's a hard problem to solve, but can we stop hinting that 
those of us who have this problem are somehow being dense?



BTW: There is a lot of talk about MVCC, but is the next solution possible:
1) Create a page information map that for each page in the table will
tell you how many rows are within and whether any write (either successful or
not) was done to this page. This can even be two maps, to make the second
one really small (a bit per page) - so that it could be in memory most of the time.
2) When you need to do a count(*) or an index check - first check whether
there were any writes to the page. If not - you can use count information
from the page info/index data without going to the page itself
3) Let vacuum clear the bit after freezing all the tuples in the page (am
I using the terminology correctly?).


In this case all read-only (archive) data will have this bit off and
index/count(*) will be really fast.

Am I missing something?

Best regards, Vitalii Tymchyshyn.



Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Vitalii Tymchyshyn

12.10.10 11:14, Craig Ringer wrote:

On 10/12/2010 03:56 PM, Vitalii Tymchyshyn wrote:


BTW: There is a lot of talk about MVCC, but is next solution possible:
1) Create a page information map that for each page in the table will
tell you how may rows are within and if any write (either successful or
not) were done to this page. This even can be two maps to make second
one really small (a bit per page) - so that it could be most time
in-memory.
2) When you need to to count(*) or index check - first check if there
were no writes to the page. If not - you can use count information from
page info/index data without going to the page itself
3) Let vacuum clear the bit after frozing all the tuples in the page (am
I using terminology correctly?).


Part of this already exists. It's called the visibility map, and is 
present in 8.4 and above. It's not currently used for queries, but can 
potentially be used to aid some kinds of query.


http://www.postgresql.org/docs/8.4/static/storage-vm.html


In this case all read-only (archive) data will be this bit off and
index/count(*) will be really fast.


A count with any joins or filter criteria would still have to scan all 
pages with visible tuples in them. 
If one doesn't use partitioning. With proper partitioning, the filter can
simply select partitions.


Also, filtering can be mapped onto the index lookup. And if one could join
the index and the visibility map, much like two indexes can be bitmap-joined
now, count could be really fast for all but non-frozen tuples.
So the visibility map helps speed up scanning of bloated tables, but 
doesn't provide a magical fast count except in the utterly trivial 
select count(*) from tablename; case, and can probably only be used 
for accurate results when there are no read/write transactions 
currently open. 
Why so? You simply have to recount the pages that are marked dirty in the
usual way. But the count problem usually occurs when there is a lot of
archive data (you need to count over 100K records) that is not modified.


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-28 Thread Vitalii Tymchyshyn

28.07.10 04:56, Tom Lane wrote:


I'm not asserting it's true, just suggesting it's entirely possible.
Other than the fork() cost itself and whatever authentication activity
there might be, practically all the startup cost of a new backend can be
seen as cache-populating activities.  You'd have to redo all of that,
*plus* pay the costs of getting rid of the previous cache entries.
Maybe the latter costs less than a fork(), or maybe not.  fork() is
pretty cheap on modern Unixen.

   
Actually, as for me, the problem is that one can't raise the number of
database connections high without overloading CPU/memory/disk, so
external pooling is needed. If postgresql had something like a
max_active_queries setting that limited the number of connections that are
not in the IDLE [in transaction] state, one could raise max connections high
(and I don't think an idle process itself has much overhead), limit
max_active_queries to get maximum performance, and avoid external
pooling. Of course this won't help if the numbers are really high, but it
could cover the most common cases.


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Big difference in time returned by EXPLAIN ANALYZE SELECT ... AND SELECT ...

2010-07-28 Thread Vitalii Tymchyshyn

27.07.10 02:03, Lew wrote:

Piotr Gasidło wrote:

EXPLAIN ANALYZE SELECT ...
Total runtime: 4.782 ms
Time: 25,970 ms


Strangely, the runtime is shown with a period for the separator, though.

One value is calculated on the server by the EXPLAIN ANALYZE command, the
other is calculated by psql itself.


Best regards, Vitalii Tymchyshyn




Re: [PERFORM] Big difference in time returned by EXPLAIN ANALYZE SELECT ... AND SELECT ...

2010-07-26 Thread Vitalii Tymchyshyn

26.07.10 12:15, Craig Ringer wrote:

On 26/07/10 16:35, Piotr Gasidło wrote:
   

Hello,

I've found strange problem in my database (8.4.4, but also 9.0beta3,
default postgresql.conf, shared_buffers raised to 256MB).

EXPLAIN ANALYZE SELECT ...
Total runtime: 4.782 ms
Time: 25,970 ms

SELECT ...
...
(21 rows)

Time: 23,042 ms

Test done in psql connected by socket to server (same host, using
\timing to get runtime).

Does big difference in Total runtime and Time is normal?
 

Given that EXPLAIN ANALYZE doesn't transfer large rowsets to the client,
it can't really be time taken to transfer the data, which is the usual
difference between 'explain analyze' timings and psql client-side timings.

Given that, I'm wondering if the difference in this case is planning
time. I can't really imagine the query planner taking 20 seconds (!!) to
run, though, no matter how horrifyingly complicated the query and table
structure were, unless there was something going wrong.
   

Actually it's 20 ms, so I suspect your point about planning time is correct.
Piotr: you can try preparing your statement and then timing EXECUTE
to check whether this is planning time.
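A sketch of that check (the query here is a hypothetical stand-in for the
real one):

  PREPARE q (int) AS SELECT * FROM some_table WHERE id = $1;  -- planning happens here
  EXPLAIN ANALYZE EXECUTE q(1);                               -- execution only; compare with the plain SELECT timing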


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Big field, limiting and ordering

2010-07-19 Thread Vitalii Tymchyshyn

19.07.10 18:09, Ivan Voras wrote:

Hello,

I don't think this is generally solvable but maybe it is so here goes.
The original situation was this:

SELECT something, big_field, complex_function(big_field), rank FROM t1
UNION ALL SELECT something, big_field, complex_function(big_field), rank
from t2 ORDER BY rank LIMIT small_number;

This query first fetches all big_field datums and does all
complex_function() calculations on them, then orders then by rank, even
though I actually need only small_number of records. There are two
problems here: first, selecting for all big_field values requires a lot
of memory, which is unacceptable, and then, running complex_function()
on all of them takes too long.

I did get rid of unnecessary complex_function() calculations by nesting
queries like:

SELECT something, big_field, complex_function(big_field), rank FROM
(SELECT original_query_without_complex_function_but_with_big_field ORDER
BY rank LIMIT small_number);

but this still leaves gathering all the big_field datum from the
original query. I cannot pull big_field out from this subquery because
it comes from UNION of tables.

Any suggestions?
   

You can do the following:

SELECT something, big_field, complex_function(big_field), rank FROM
(SELECT * FROM
(
(SELECT something, big_field, rank FROM t1 ORDER BY rank LIMIT small_number)
UNION ALL
(SELECT something, big_field, rank FROM t2 ORDER BY rank LIMIT small_number)
) a
ORDER BY rank LIMIT small_number) b;

So, you take small_number records from each table, then select the
small_number best records from the resulting set, and only then do the
calculation.

Best regards, Vitalii Tymchyshyn


