[PERFORM] WAL still kept in pg_xlog even long after heavy workload is done
Hello,

We're encountering some problems with WAL growth in production with PostgreSQL 9.6.3 and 9.6.2. From what I know, a WAL file can either be recycled (and reused) or deleted. We'd like to have better control over the amount of WAL that is kept around; there were a few occasions where we had to resize partitions because pg_xlog grew as much as it did. According to the docs [1] there are some GUC parameters (postgresql.conf) about this. The parameters I've been able to identify are the following:

* wal_keep_segments
* max_wal_size
* min_wal_size

Our WAL grows a lot around the time of our product upgrades (that is, when we upgrade to a new version of our database; not a Pg upgrade, just a newer version of our db schema, plpgsql code etc). As part of this upgrade, we add new columns or run some large UPDATEs on tables as big as 300M rows (in one case we also have one with 1.5B rows).

I am seeing the following in the docs [3]:

    min_wal_size (integer)
    As long as WAL disk usage stays below this setting, old WAL files are
    always recycled for future use at a checkpoint, rather than removed.
    This can be used to ensure that enough WAL space is reserved to handle
    spikes in WAL usage, for example when running large batch jobs. The
    default is 80 MB. This parameter can only be set in the postgresql.conf
    file or on the server command line.

This sounds very familiar, because that's essentially what we're doing: there are some large jobs that cause a lot of workload and changes and generate a lot of WAL. So far, the way I interpret this is that min_wal_size is the amount of WAL recycled (that is, kept around to be reused) and max_wal_size is the total amount of WAL allowed to be kept on disk.

I would also like to interpret the current values of min_wal_size and max_wal_size. If I run the following query:

    psql -c "select name, setting from pg_settings where name like '%wal_size';"

I get the following:

    max_wal_size|2097152
    min_wal_size|1048576

Do these two values look ok?
Both these values were generated by pgtune [4], but it seems pgtune assumed they are expressed in KB. Looking at the PostgreSQL source, these two parameters appear to be expressed in MB; at least that's what I understand when I see GUC_UNIT_MB in the code [6]. So maybe the pgtune fork we're using has a bug, in the sense that it produces values for those two parameters in KB when they should be in MB, and therefore much higher than they should be.

Another question is: how can I use any of the checkpoint settings to control the WAL that is kept around?

* checkpoint_timeout
* checkpoint_completion_target
* checkpoint_flush_after
* checkpoint_warning

=

I actually tried something with these settings in a test environment. I used the following settings:

    checkpoint_timeout = 40s
    min_wal_size = 600MB
    max_wal_size = 900MB

Then I created a db named x1 and ran this on it four or five times:

    pgbench -i -s 70 x1

The pg_xlog directory grew to 2.2G, and after a few minutes it decreased to 2.0G. After about 40 minutes it decreased to 1.4G, and it's not going any lower. I was expecting pg_xlog's size to be 600MB after the first WAL removal had run. Should I expect that the size will eventually drop to 600MB, or will it just sit there at 1.4G?

=

Other thoughts: I have looked a bit at the Pg internals too. I'm seeing four functions there that are responsible for removing WAL: XLogArchiveIsReady, RemoveXlogFile, RemoveOldXlogFiles and XLOGfileslop. All of these belong to src/backend/access/transam/xlog.c. The only place in the code that seems to take a decision about how much WAL to recycle and how much to remove is the function XLOGfileslop [2]. It seems that XLOGfileslop is an estimate of the number of WAL segments to keep around (recycled WAL). Both max_wal_size and min_wal_size are used inside XLOGfileslop. As far as the checkpoint_* GUC settings go, they seem to be involved as well.
So far, the only thing I know about checkpoints is that many WAL segments are created between checkpoints, and that the amount of WAL between checkpoints can vary. I don't have a good understanding of the interplay between checkpoints and WAL. I'd be grateful for any thoughts on how to improve this and better control the amount of WAL kept in pg_xlog.

Thank you,
Stefan

[1] https://www.postgresql.org/docs/9.6/static/wal-configuration.html
[2] https://github.com/postgres/postgres/blob/0c5803b450e0cc29b3527df3f352e6f18a038cc6/src/backend/access/transam/xlog.c#L2258
[3] https://www.postgresql.org/docs/9.6/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-CHECKPOINTS
[4] https://github.com/kmatt/pgtune
[5] https://github.com/kmatt/pgtune/blob/master/pgtune#L560
[6] https://github.com/postgres/postgres/blob/f49842d1ee31b976c681322f76025d7732e860f3/src/backend/utils/misc/guc.c#L2268

Stefan Petrea
System Engineer
stefan.pet...@tangoe.com

--
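One way to double-check the suspected pgtune unit bug described above is to read pg_settings together with its unit column, which reports the unit PostgreSQL itself uses for each parameter. This is a sketch, assuming a 9.6 server where these two GUCs are stored in MB (as the GUC_UNIT_MB reference suggests):

```sql
-- Show the WAL sizing parameters together with the unit the server reports.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('min_wal_size', 'max_wal_size', 'wal_keep_segments');

-- If min_wal_size/max_wal_size come back as 1048576 and 2097152 with unit 'MB',
-- they actually mean 1 TB and 2 TB respectively, which would confirm the
-- KB-vs-MB mixup. pg_size_pretty makes that obvious:
SELECT name, pg_size_pretty(setting::bigint * 1024 * 1024) AS effective_size
FROM pg_settings
WHERE name IN ('min_wal_size', 'max_wal_size');
```

If the MB interpretation is right, resetting them to the intended values (for example `ALTER SYSTEM SET max_wal_size = '2GB';` followed by `SELECT pg_reload_conf();`) should let checkpoints cap pg_xlog growth near max_wal_size.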
Re: [PERFORM] anti-join with small table via text/varchar cannot estimate rows correctly
On 02.03.2017 02:06, Tom Lane wrote:
> Stefan Andreatta <s.andrea...@synedra.com> writes:
>> The same anti-join using the text fields, however, estimates just 1
>> resulting row, while there are still of course 9,999 of them:
>>
>> =# explain analyze
>>      select tmp_san_1.id
>>      from tmp_san_1
>>      left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text
>>      where tmp_san_2.id is null;
>
> That is not an anti-join. To make it one, you have to constrain the RHS
> join column to be IS NULL, not some random other column. Note the join
> type isn't getting shown as Anti:
>
>   Hash Left Join  (cost=1.02..192.53 rows=1 width=4) (actual time=0.020..3.091 rows= loops=1)
>
> As written, the query could return some rows that weren't actually
> antijoin rows, ie tmp_san_1.text *did* have a match in tmp_san_2, but
> that row chanced to have a null value of id. Possibly the planner could
> be smarter about estimating for this case, but it doesn't look much like
> a typical use-case to me.
>
>             regards, tom lane

Thanks a lot! Right, my problem had nothing to do with the type of the join field, but with the selection of the proper field for the NULL condition. So, even a join on the id field is badly estimated if checked on the text field:

=# EXPLAIN ANALYZE
     SELECT tmp_san_1.id
     FROM tmp_san_1
     LEFT JOIN tmp_san_2 ON tmp_san_1.id = tmp_san_2.id
     WHERE (tmp_san_2.text IS NULL);

                                  QUERY PLAN
---------------------------------------------------------------------------
 Hash Left Join  (cost=1.02..192.53 rows=1 width=4) (actual time=0.019..2.939 rows= loops=1)
   Hash Cond: (tmp_san_1.id = tmp_san_2.id)
   Filter: (tmp_san_2.text IS NULL)
   Rows Removed by Filter: 1
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=1 width=4) (actual time=0.007..1.003 rows=1 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=6) (actual time=0.004..0.004 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=6) (actual time=0.001..0.002 rows=1 loops=1)
 Planning time: 0.062 ms
 Execution time: 3.381 ms
(10 rows)

...
but if the join and the check refer to the same field, everything is fine:

=# EXPLAIN ANALYZE
     SELECT tmp_san_1.id
     FROM tmp_san_1
     LEFT JOIN tmp_san_2 ON tmp_san_1.id = tmp_san_2.id
     WHERE (tmp_san_2.id IS NULL);

                                  QUERY PLAN
---------------------------------------------------------------------------
 Hash Anti Join  (cost=1.02..281.26 rows= width=4) (actual time=0.018..2.672 rows= loops=1)
   Hash Cond: (tmp_san_1.id = tmp_san_2.id)
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=1 width=4) (actual time=0.007..0.962 rows=1 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
 Planning time: 0.051 ms
 Execution time: 3.164 ms
(8 rows)

It gets more interesting again if the text field really could be NULL and I wanted to include those rows. If I just include "OR tmp_san_2.text IS NULL", estimates are off again:

=# EXPLAIN ANALYZE
     SELECT tmp_san_1.id
     FROM tmp_san_1
     LEFT JOIN tmp_san_2 ON tmp_san_1.id = tmp_san_2.id
     WHERE (tmp_san_2.id IS NULL OR tmp_san_2.text IS NULL);

                                  QUERY PLAN
---------------------------------------------------------------------------
 Hash Left Join  (cost=1.02..192.53 rows=1 width=4) (actual time=0.019..2.984 rows= loops=1)
   Hash Cond: (tmp_san_1.id = tmp_san_2.id)
   Filter: ((tmp_san_2.id IS NULL) OR (tmp_san_2.text IS NULL))
   Rows Removed by Filter: 1
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=1 width=4) (actual time=0.008..1.024 rows=1 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=6) (actual time=0.004..0.004 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=6) (actual time=0.001..0.002 rows=1 loops=1)
 Planning time: 0.088 ms
 Execution time: 3.508 ms
(10 rows)

Instead, it seems, I have to move this condition (inverted) into the join clause for the planner to make correct estimates again:

=# EXPLAIN ANALYZE
     SELECT tmp_san_1.id
     FROM tmp_san_1
     LEFT JOIN tmp_san_2 ON tmp_san_1.id = tmp_san_2.id
         AND tmp_san_2.text IS NOT NULL
     WHERE (tmp_san_2.id IS NULL);
QUERY PLAN
[PERFORM] anti-join with small table via text/varchar cannot estimate rows correctly
Hello,

I have encountered a strange problem when doing an anti-join with a very small table via a varchar or text field, as opposed to an integer field. The Postgres version is 9.5.3.

I did some experiments to extract the problem in a simple form. First, generate two tables with a series of numbers, once as integers and once as text. The first table has 10,000 rows, the second table just one:

=# select generate_series(1, 1) as id, generate_series(1,1)::text as text into table tmp_san_1;
SELECT 1
=# select generate_series(1, 1) as id, generate_series(1,1)::text as text into table tmp_san_2;
SELECT 1

=# analyze tmp_san_1;
ANALYZE
=# analyze tmp_san_2;
ANALYZE

=# \d tmp_san_*
   Table "public.tmp_san_1"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 text   | text    |

   Table "public.tmp_san_2"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 text   | text    |

Now I do an anti-join between the two tables via the id field (integer). The number of resulting rows is estimated correctly as 9,999:

=# explain analyze
     select tmp_san_1.id
     from tmp_san_1
     left join tmp_san_2 on tmp_san_1.id = tmp_san_2.id
     where tmp_san_2.id is null;

                                  QUERY PLAN
---------------------------------------------------------------------------
 Hash Anti Join  (cost=1.02..281.26 rows= width=4) (actual time=0.019..2.743 rows= loops=1)
   Hash Cond: (tmp_san_1.id = tmp_san_2.id)
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=1 width=4) (actual time=0.007..1.023 rows=1 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=4) (actual time=0.004..0.004 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
 Planning time: 0.138 ms
 Execution time: 3.218 ms
(8 rows)

The same anti-join using the text fields, however, estimates just 1 resulting row, while there are still of course 9,999 of them:

=# explain analyze
     select tmp_san_1.id
     from tmp_san_1
     left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text
     where tmp_san_2.id is null;

                                  QUERY PLAN
---------------------------------------------------------------------------
 Hash Left Join  (cost=1.02..192.53 rows=1 width=4) (actual time=0.020..3.091 rows= loops=1)
   Hash Cond: (tmp_san_1.text = tmp_san_2.text)
   Filter: (tmp_san_2.id IS NULL)
   Rows Removed by Filter: 1
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=1 width=8) (actual time=0.008..0.983 rows=1 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=6) (actual time=0.004..0.004 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=6) (actual time=0.002..0.002 rows=1 loops=1)
 Planning time: 0.173 ms
 Execution time: 3.546 ms
(10 rows)

I cannot explain that behavior, and much less think of a fix or workaround. Unfortunately my real-world example has to use varchar for the join.

Thanks for any help,
Stefan
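One rewrite that sidesteps this whole class of problem is to express the anti-join explicitly with NOT EXISTS, which the planner recognizes as an anti-join regardless of which column carries the NULL check. This is a sketch against the tmp_san_* tables above; the claim that NOT EXISTS plans as an anti-join is standard planner behavior, not something tested in this thread:

```sql
-- Expressed as NOT EXISTS, this is unambiguously an anti-join on the
-- text column, so the estimate should come out near 9,999 rather than 1.
SELECT t1.id
FROM tmp_san_1 t1
WHERE NOT EXISTS (
    SELECT 1
    FROM tmp_san_2 t2
    WHERE t2.text = t1.text
);
```

It also avoids the subtle correctness trap Tom points out: with LEFT JOIN ... WHERE other_column IS NULL, a matching row that happens to have a NULL in the checked column would wrongly survive the filter.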
[PERFORM] Re: [PERFORM] Re: Query > 1000× slowdown after adding datetime comparison
2015-08-31 21:46 GMT+02:00 twoflower wrote:
> I created a new boolean column and filled it for every row in DOCUMENT with
> *(doc.date_last_updated >= date(now() - '171:00:00'::interval))*, reanalyzed ...

... and have you put an index on that new boolean column (say "updated")?

    CREATE INDEX index_name ON some_table (boolean_field);

or tried a conditional index like

    CREATE INDEX index_name ON some_table (some_field) WHERE boolean_field;

-S.

2015-08-31 21:46 GMT+02:00 twoflower:
> David G Johnston wrote
>> What happens if you pre-compute the date condition and hard code it?
>
> I created a new boolean column and filled it for every row in DOCUMENT with
> *(doc.date_last_updated >= date(now() - '171:00:00'::interval))*, reanalyzed
> the table and modified the query to just compare this column to TRUE. I
> expected this to be very fast, considering that a (to me, anyway) similar
> query also containing a constant value comparison finishes immediately.
> However, the query is running now for 4 minutes already. That's really
> interesting.
>
> --
> View this message in context:
> http://postgresql.nabble.com/Query-1-000-000-slowdown-after-adding-datetime-comparison-tp5864045p5864088.html
> Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Re: [PERFORM] Re: Query > 1000× slowdown after adding datetime comparison
So, if I'm understanding you correctly, we're talking solely about the following clause in the query you gave initially:

    WHERE doc.date_last_updated >= date(now() - '171:00:00'::interval)

which initially was

    WHERE documenttype = 4

and now is being replaced by a temporary (I'd say derived) column:

    WHERE updated

? In any case, I have to go, but run your queries through http://explain.depesz.com/ and give a weblink to the explain plans.

-S.

2015-08-31 22:30 GMT+02:00 twoflower:
> I did not. I wanted to compare this query to the one I tried before, having
> *documenttype = 4* as the sole condition. That one was very fast and the
> *documenttype* was not indexed either.
>
> But this query, using the new temporary column, still runs, after 48
> minutes...
>
> --
> View this message in context:
> http://postgresql.nabble.com/Query-1-000-000-slowdown-after-adding-datetime-comparison-tp5864045p5864101.html
> Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
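For the plan-sharing step suggested above, the usual recipe is to capture the full plan with buffer counts and paste the output to explain.depesz.com. A minimal sketch; `document` and the `updated` column are the hypothetical names from this thread:

```sql
-- EXPLAIN (ANALYZE, BUFFERS) runs the query and reports actual row counts,
-- timings and buffer hits/reads per plan node -- exactly what the site
-- needs to highlight the expensive steps.
EXPLAIN (ANALYZE, BUFFERS)
SELECT doc.id
FROM document doc
WHERE doc.updated;   -- the precomputed boolean column
```

Since the query here runs for tens of minutes, plain `EXPLAIN` (without ANALYZE) is a safer first step to see whether the plan even uses an index before paying for a full execution.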
Re: [PERFORM] Strange performance problem with query
Hi Kevin,

Thanks for the advice. I opted for setting random_page_cost a bit lower, as that made the most sense in the context of the current setup, where there is quite a high cache hit ratio. Is 97% high enough?

    =# SELECT 'cache hit rate' AS name,
              sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) AS ratio
       FROM pg_statio_user_tables;
          name      |         ratio
    ----------------+------------------------
     cache hit rate | 0.97344836172381212996

When I set random_page_cost down from 4 to 2, the query plan changes to the faster one.

Kind Regards
Stefan
Cell : 072-380-1479
Desk : 087-577-7241

On 2014/09/15 03:25 PM, Kevin Grittner wrote:
> Van Der Berg, Stefan svanderb...@fnb.co.za wrote:
>> I get a similar plan selected on the original query if I set
>> enable_seqscan to off. I much prefer the second result.
>> My questions are:
>> 1. Why is this happening?
>
> Your cost factors don't accurately model actual costs.
>
>> 2. How can I encourage the behavior of the second query without
>> changing the original query?
>
> You didn't give enough information to really give solid advice, but
> when people see what you are seeing, some common tuning needed is:
>
> Set shared_buffers to about 25% of system RAM or 8GB, whichever is lower.
> Set effective_cache_size to 50% to 75% of system RAM.
> Set work_mem to about 25% of system RAM divided by max_connections.
> If you have a high cache hit ratio (which you apparently do) reduce
> random_page_cost, possibly to something near or equal to seq_page_cost.
> Increase cpu_tuple_cost, perhaps to 0.03.
>
> You might want to play with the above, and if you still have a problem,
> read this page and post with more detail:
> http://wiki.postgresql.org/wiki/SlowQueryQuestions
>
>> Is there some column level setting I can set?
>
> The statistics looked pretty accurate, so that shouldn't be necessary.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

To read FirstRand Bank's Disclaimer for this email click on the following address or copy into your Internet browser: https://www.fnb.co.za/disclaimer.html If you are unable to access the Disclaimer, send a blank e-mail to firstrandbankdisclai...@fnb.co.za and we will send you a copy of the Disclaimer.
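The settings Kevin lists can be tried per-session before being committed to postgresql.conf, which keeps experiments reversible. A sketch; the concrete values are assumptions to adapt, not recommendations from this thread:

```sql
-- Session-local experiment: lower random_page_cost toward seq_page_cost
-- for a well-cached workload, and raise cpu_tuple_cost as suggested.
SET random_page_cost = 2;     -- default is 4
SET cpu_tuple_cost = 0.03;    -- default is 0.01
-- ...then re-run EXPLAIN ANALYZE on the slow query and compare plans.

-- If the new plan holds up, persist the setting cluster-wide (9.4+):
ALTER SYSTEM SET random_page_cost = 2;
SELECT pg_reload_conf();
```

The advantage over enable_seqscan=off is that cost parameters reshape the planner's model everywhere instead of forbidding one node type outright.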
[PERFORM] Strange performance problem with query
Hi All,

Please see the output from the following query analysis:

=# explain analyze select count(1)
   from jbpmprocess.jbpm_taskinstance ti
   join jbpmprocess.jbpm_task task on (ti.task_ = task.id_)
   join jbpmprocess.jbpm_processinstance pi on ti.procinst_ = pi.id_
   where ti.isopen_ = true;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Aggregate  (cost=47372.04..47372.05 rows=1 width=0) (actual time=647.070..647.071 rows=1 loops=1)
   ->  Hash Join  (cost=44806.99..47336.72 rows=14127 width=0) (actual time=605.077..645.410 rows=20359 loops=1)
         Hash Cond: (ti.task_ = task.id_)
         ->  Hash Join  (cost=44779.80..47115.28 rows=14127 width=8) (actual time=604.874..640.541 rows=20359 loops=1)
               Hash Cond: (ti.procinst_ = pi.id_)
               ->  Index Scan using idx_task_instance_isopen on jbpm_taskinstance ti  (cost=0.00..1995.84 rows=22672 width=16) (actual time=0.011..16.606 rows=20359 loops=1)
                     Index Cond: (isopen_ = true)
                     Filter: isopen_
               ->  Hash  (cost=28274.91..28274.91 rows=1320391 width=8) (actual time=604.601..604.601 rows=1320391 loops=1)
                     Buckets: 262144  Batches: 1  Memory Usage: 51578kB
                     ->  Seq Scan on jbpm_processinstance pi  (cost=0.00..28274.91 rows=1320391 width=8) (actual time=0.004..192.166 rows=1320391 loops=1)
         ->  Hash  (cost=18.75..18.75 rows=675 width=8) (actual time=0.196..0.196 rows=675 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 27kB
               ->  Seq Scan on jbpm_task task  (cost=0.00..18.75 rows=675 width=8) (actual time=0.003..0.106 rows=675 loops=1)
 Total runtime: 652.266 ms
(15 rows)

I'm not sure why the planner insists on doing the sequential scan on jbpm_processinstance even though the 22672 rows from jbpm_taskinstance it has to match against are only about 1% of the number of rows in jbpm_processinstance. So far I think it is because the values in procinst_ of jbpm_taskinstance are not entirely unique.
The very strange thing though is the way the query plan changes if I repeat the where clause:

explain analyze select count(1)
from jbpmprocess.jbpm_taskinstance ti
join jbpmprocess.jbpm_task task on (ti.task_ = task.id_)
join jbpmprocess.jbpm_processinstance pi on ti.procinst_ = pi.id_
where ti.isopen_ = true and ti.isopen_ = true;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Aggregate  (cost=2074.61..2074.62 rows=1 width=0) (actual time=80.126..80.126 rows=1 loops=1)
   ->  Hash Join  (cost=27.19..2074.24 rows=151 width=0) (actual time=0.217..77.959 rows=20359 loops=1)
         Hash Cond: (ti.task_ = task.id_)
         ->  Nested Loop  (cost=0.00..2044.97 rows=151 width=8) (actual time=0.016..71.429 rows=20359 loops=1)
               ->  Index Scan using idx_task_instance_isopen on jbpm_taskinstance ti  (cost=0.00..29.72 rows=243 width=16) (actual time=0.012..16.928 rows=20359 loops=1)
                     Index Cond: ((isopen_ = true) AND (isopen_ = true))
                     Filter: (isopen_ AND isopen_)
               ->  Index Scan using jbpm_processinstance_pkey on jbpm_processinstance pi  (cost=0.00..8.28 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=20359)
                     Index Cond: (id_ = ti.procinst_)
         ->  Hash  (cost=18.75..18.75 rows=675 width=8) (actual time=0.196..0.196 rows=675 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 27kB
               ->  Seq Scan on jbpm_task task  (cost=0.00..18.75 rows=675 width=8) (actual time=0.002..0.107 rows=675 loops=1)
 Total runtime: 80.170 ms

I get a similar plan selected on the original query if I set enable_seqscan to off. I much prefer the second result. My questions are:

1. Why is this happening?
2. How can I encourage the behavior of the second query without changing the original query? Is there some column level setting I can set?

(BTW the tables are analyzed, and I currently have no special settings/attributes set for any of the tables.)
--
Kind Regards
Stefan
Cell : 072-380-1479
Desk : 087-577-7241
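To compare the two plans without editing the query or the server config, the planner can be steered for a single transaction. This is a sketch; enable_seqscan is a blunt diagnostic tool rather than a production setting, which is why SET LOCAL inside a rolled-back transaction is used here:

```sql
BEGIN;
SET LOCAL enable_seqscan = off;   -- affects only this transaction
EXPLAIN ANALYZE
SELECT count(1)
FROM jbpmprocess.jbpm_taskinstance ti
JOIN jbpmprocess.jbpm_task task ON ti.task_ = task.id_
JOIN jbpmprocess.jbpm_processinstance pi ON ti.procinst_ = pi.id_
WHERE ti.isopen_ = true;
ROLLBACK;                         -- the setting reverts automatically
```

If the nested-loop plan wins under this setting, the longer-term fix is usually adjusting cost parameters (random_page_cost, effective_cache_size) rather than leaving enable_seqscan off.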
[PERFORM] Seqscan on big table, when an Index-Usage should be possible
Hello,

I'm currently testing some queries on data which I imported from another database system into Postgres 9.4. After the import I created the indexes and ran an analyze and vacuum. I also played a little bit with seq_page_cost and random_page_cost. But currently I have no clue which parameter I have to adjust to get a query time like the example with 'enable_seqscan=off'.

Stefan

pd=# set enable_seqscan=off;
pd=# explain analyze
     select t.name
     from product p
     left join measurements m on p.productid=m.productid
     inner join measurementstype t on m.measurementstypeid=t.measurementstypeid
     where p.timestamp between '2013-02-01 15:00:00' and '2013-02-05 21:30:00'
     group by t.name;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 HashAggregate  (cost=200380892.01..200380936.43 rows=4442 width=16) (actual time=34428.335..34428.693 rows=656 loops=1)
   Group Key: t.name
   ->  Hash Join  (cost=8995.44..200361772.19 rows=7647926 width=16) (actual time=103.670..30153.958 rows=5404751 loops=1)
         Hash Cond: (m.measurementstypeid = t.measurementstypeid)
         ->  Nested Loop  (cost=8279.61..200188978.03 rows=7647926 width=4) (actual time=75.939..22488.725 rows=5404751 loops=1)
               ->  Bitmap Heap Scan on product p  (cost=8279.03..662659.76 rows=526094 width=8) (actual time=75.903..326.850 rows=368494 loops=1)
                     Recheck Cond: ((timestamp >= '2013-02-01 15:00:00'::timestamp without time zone) AND (timestamp <= '2013-02-05 21:30:00'::timestamp without time zone))
                     Heap Blocks: exact=3192
                     ->  Bitmap Index Scan on product_timestamp  (cost=0.00..8147.51 rows=526094 width=0) (actual time=75.050..75.050 rows=368494 loops=1)
                           Index Cond: ((timestamp >= '2013-02-01 15:00:00'::timestamp without time zone) AND (timestamp <= '2013-02-05 21:30:00'::timestamp without time zone))
               ->  Index Scan using measurements_productid on measurements m  (cost=0.58..347.12 rows=3214 width=12) (actual time=0.018..0.045 rows=15 loops=368494)
                     Index Cond: (productid = p.productid)
         ->  Hash  (cost=508.91..508.91 rows=16554 width=20) (actual time=27.704..27.704 rows=16554 loops=1)
               Buckets: 2048  Batches: 1  Memory Usage: 686kB
               ->  Index Scan using measurementstype_pkey on measurementstype t  (cost=0.29..508.91 rows=16554 width=20) (actual time=0.017..15.719 rows=16554 loops=1)
 Planning time: 2.176 ms
 Execution time: 34429.080 ms
(17 Zeilen)

Zeit: 34432,187 ms

pd=# set enable_seqscan=on;
SET
Zeit: 0,193 ms
pd=# explain analyze
     select t.name
     from product p
     left join measurements m on p.productid=m.productid
     inner join measurementstype t on m.measurementstypeid=t.measurementstypeid
     where p.timestamp between '2013-02-01 15:00:00' and '2013-02-05 21:30:00'
     group by t.name;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 HashAggregate  (cost=108645282.49..108645326.91 rows=4442 width=16) (actual time=5145182.269..5145182.656 rows=656 loops=1)
   Group Key: t.name
   ->  Hash Join  (cost=671835.40..108626162.68 rows=7647926 width=16) (actual time=2087822.232..5141351.539 rows=5404751 loops=1)
         Hash Cond: (m.measurementstypeid = t.measurementstypeid)
         ->  Hash Join  (cost=671291.94..108453540.88 rows=7647926 width=4) (actual time=2087800.816..5134312.822 rows=5404751 loops=1)
               Hash Cond: (m.productid = p.productid)
               ->  Seq Scan on measurements m  (cost=0.00..49325940.08 rows=2742148608 width=12) (actual time=0.007..2704591.045 rows=2742146806 loops=1)
               ->  Hash  (cost=662659.76..662659.76 rows=526094 width=8) (actual time=552.480..552.480 rows=368494 loops=1)
                     Buckets: 16384  Batches: 4  Memory Usage: 2528kB
                     ->  Bitmap Heap Scan on product p  (cost=8279.03..662659.76 rows=526094 width=8) (actual time=73.353..302.482 rows=368494 loops=1)
                           Recheck Cond: ((timestamp >= '2013-02-01 15:00:00'::timestamp without time zone) AND (timestamp <= '2013-02-05 21:30:00'::timestamp without time zone))
                           Heap Blocks: exact=3192
                           ->  Bitmap Index Scan on product_timestamp  (cost=0.00..8147.51 rows=526094 width=0) (actual time=72.490..72.490 rows=368494 loops=1)
                                 Index Cond: ((timestamp >= '2013-02-01 15:00:00'::timestamp without time zone) AND (timestamp <= '2013-02-05 21:30:00'::timestamp without time zone))
         ->  Hash  (cost=336.54..336.54 rows=16554
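Since the fast plan is the one that probes the measurements index ~368k times, the knob with broader effect than enable_seqscan is lowering random_page_cost (and telling the planner how much cache is available) so the index probes stop being overpriced. A per-session sketch; the values are assumptions to tune, not tested recommendations:

```sql
-- Assumption: most of the working set is cached, so random reads are cheap.
SET random_page_cost = 1.5;        -- default 4 models mostly-uncached storage
SET effective_cache_size = '8GB';  -- assumption: set to RAM available for caching
-- Then re-run EXPLAIN ANALYZE on the join and check whether the
-- measurements_productid index scan is chosen without enable_seqscan=off.
```

If the plan flips with these settings, they can be made permanent in postgresql.conf instead of disabling sequential scans globally.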
Re: [PERFORM] Getting query plan alternatives from query planner?
Hi Craig and Shawn,

I fully agree with your argumentation. Who is the elephant in the room that is so reluctant to introduce explicit hints?

-S.

2014-04-14 17:35 GMT+02:00 Craig James cja...@emolecules.com:

Shaun Thomas stho...@optionshouse.com wrote:

  these issues tend to get solved through optimization fences. Reorganize a query into a CTE, or use the (gross) OFFSET 0 trick. How are these nothing other than unofficial hints?

Yeah, the cognitive dissonance levels get pretty high around this issue. Some of the same people who argue strenuously against adding hints about what plan should be chosen also argue against having clearly equivalent queries optimize to the same plan, because they find the fact that they don't useful for coercing a decent plan sometimes. That amounts to a hint, but an obscure and undocumented one. (The OP may be wondering what this OFFSET 0 trick is, and how he can use it.)

+1. I've said this or something like it at least a half-dozen times. Postgres DOES have hints; they're just obscure, undocumented and hard to use. If a developer chooses to use them, they become embedded in the app and forgotten. They're hard to find because there's nothing explicit in the SQL to look for. You have to know to look for things like OFFSET or SET. Five years down the road, when the developer is long gone, who's going to know why "... OFFSET 0" was put in the code unless the developer made careful comments?

With explicit, documented hints, one could search for hints of a particular type should the optimizer improve to the point where they are no longer needed. It is harder to do that with subtle differences in syntax choice. Figuring out which CTEs or LIMITs were chosen because they caused optimization barriers rather than for their semantic merit takes some effort.

Exactly. I'll make a bet here.
I'll bet that the majority of large Postgres installations have at least one, probably several, SQL statements that have been hinted in some way, either with CTEs or LIMITs, or by using SET to disable a particular query type, and that these hints are critical to the system's performance. The question is not whether to have hints. The question is how to expose hints to users. Craig
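For readers wondering what the tricks under discussion actually look like: both rely on optimization fences. A generic sketch with made-up table names; note that in the era of this thread CTEs were always fences, while from PostgreSQL 12 onward the `MATERIALIZED` keyword is needed to force that behavior:

```sql
-- OFFSET 0 fence: a no-op LIMIT clause that prevents the subquery from
-- being flattened into the outer query, pinning the subquery's plan.
SELECT *
FROM (SELECT * FROM orders WHERE status = 'open' OFFSET 0) o
JOIN customers c ON c.id = o.customer_id;

-- CTE fence: the CTE is planned and evaluated on its own.
-- (Implicit before v12; v12+ requires AS MATERIALIZED to keep the fence.)
WITH open_orders AS (
    SELECT * FROM orders WHERE status = 'open'
)
SELECT *
FROM open_orders o
JOIN customers c ON c.id = o.customer_id;
```

Which illustrates the thread's point exactly: nothing in either query says "hint" — the fence is invisible unless you already know the idiom.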
Re: [PERFORM] slow join not using index properly
to get past, so I think it might be invalid:

    ERROR:  syntax error at or near "UNION"
    LINE 8:     UNION (
                ^

So I landed on the version that I posted above, which seems to select the same set in all of the cases that I tried. Anyway, thanks again for taking a stab at helping, I do appreciate it. If you have any other ideas that might be of help I'd certainly be happy to hear them.

Take care,
/Stefan

On Thu, Mar 20, 2014 at 11:02 PM, Ilya Kosmodemiansky ilya.kosmodemian...@postgresql-consulting.com wrote:
> Hi Stefan!
>
> Probably you need to rewrite your query like this (check it first):
>
>     with RECURSIVE qq(cont_key, anc_key) as (
>         select min(a1.context_key), ancestor_key
>         from virtual_ancestors a1
>         union
>         select (SELECT a1.context_key, ancestor_key
>                 FROM virtual_ancestors a1
>                 where context_key > cont_key
>                 order by context_key limit 1)
>         from qq
>         where cont_key is not null
>     )
>     select a1.cont_key
>     from qq a1, collection_data, virtual_ancestors a2
>     WHERE a1.anc_key = collection_data.context_key
>       AND collection_data.collection_context_key = a2.context_key
>       AND a2.ancestor_key = ?
>
> best regards, Ilya

On Fri, Mar 21, 2014 at 12:56 AM, Stefan Amshey srams...@gmail.com wrote:
> We have a slow performing query that we are trying to improve, and it
> appears to be performing a sequential scan at a point where it should be
> utilizing an index. Can anyone tell me why postgres is opting to do it
> this way?
>
> The original query is as follows:
>
>     SELECT DISTINCT a1.context_key
>     FROM virtual_ancestors a1, collection_data, virtual_ancestors a2
>     WHERE a1.ancestor_key = collection_data.context_key
>       AND collection_data.collection_context_key = a2.context_key
>       AND a2.ancestor_key = ?
The key relationships should all be using indexed columns, but the query plan that postgres comes up with ends up performing a sequential scan on the collection_data table (in this case about 602k rows) where we would have expected it to utilize the index:

 HashAggregate  (cost=60905.73..60935.73 rows=3000 width=4) (actual time=3366.165..3367.354 rows=3492 loops=1)
   Buffers: shared hit=16291 read=1222
   ->  Nested Loop  (cost=17546.26..60898.23 rows=3000 width=4) (actual time=438.332..3357.918 rows=13037 loops=1)
         Buffers: shared hit=16291 read=1222
         ->  Hash Join  (cost=17546.26..25100.94 rows=98 width=4) (actual time=408.554..415.767 rows=2092 loops=1)
               Hash Cond: (a2.context_key = collection_data.collection_context_key)
               Buffers: shared hit=4850 read=3
               ->  Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a2  (cost=0.00..233.32 rows=270 width=4) (actual time=8.532..10.703 rows=1960 loops=1)
                     Index Cond: (ancestor_key = 1072173)
                     Heap Fetches: 896
                     Buffers: shared hit=859 read=3
               ->  Hash  (cost=10015.56..10015.56 rows=602456 width=8) (actual time=399.708..399.708 rows=602570 loops=1)
                     Buckets: 65536  Batches: 1  Memory Usage: 23538kB
                     Buffers: shared hit=3991
                     ## sequential scan occurs here ##
                     ->  Seq Scan on collection_data  (cost=0.00..10015.56 rows=602456 width=8) (actual time=0.013..163.509 rows=602570 loops=1)
                           Buffers: shared hit=3991
         ->  Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a1  (cost=0.00..360.70 rows=458 width=8) (actual time=1.339..1.403 rows=6 loops=2092)
               Index Cond: (ancestor_key = collection_data.context_key)
               Heap Fetches: 7067
               Buffers: shared hit=11441 read=1219
 Total runtime: 3373.058 ms

The table definitions are as follows:

       Table "public.virtual_ancestors"
    Column    |   Type   | Modifiers
--------------+----------+-----------
 ancestor_key | integer  | not null
 context_key  | integer  | not null
 degree       | smallint | not null
Indexes:
    virtual_ancestors_pkey PRIMARY KEY, btree (ancestor_key, context_key)
    virtual_context_key_idx btree (context_key)
Foreign-key constraints:
    virtual_ancestors_ancestor_key_fkey FOREIGN KEY (ancestor_key) REFERENCES contexts(context_key)
    virtual_ancestors_context_key_fkey FOREIGN KEY (context_key) REFERENCES contexts(context_key)

            Table "public.collection_data"
         Column         |         Type         | Modifiers
------------------------+----------------------+-----------
 collection_context_key | integer              | not null
 context_key            | integer              | not null
 type                   | character varying(1) | not null
 source                 | character varying(1) | not null
Indexes:
    collection_data_context_key_idx btree (context_key
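One detail visible in the plan above: the hash is built by probing collection_data on collection_context_key, a column the listed indexes do not cover (only context_key is indexed). A hedged sketch of an index that would give the planner the option to probe instead of scanning all 602k rows; untested against this schema, so treat it as a hypothesis to verify with EXPLAIN:

```sql
-- Assumption: this join always arrives via collection_context_key,
-- so index that column. Including context_key as a second column
-- lets the join's output be served by an index-only scan.
CREATE INDEX collection_data_collection_context_key_idx
    ON collection_data (collection_context_key, context_key);

ANALYZE collection_data;  -- refresh stats so the planner sees the new index
```

Whether the planner actually prefers the index probe over the hash still depends on how many rows the outer side produces, so re-checking the plan after creating it is essential.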
Re: [PERFORM] Getting query plan alternatives from query planner?
Hi Tom,

You wrote:
> Path alternatives are rejected whenever possible before moving up to the
> next join level, so that what we have rejected is actually just a plan
> fragment in most cases.

Thanks for the quick answer. This sounds like a fair implementation decision. The background for asking this is, of course, that one wants (1) to understand and (2) to influence the optimizer in cases where one thinks that the planner is wrong :-).

So, the bottom line is:
1. PostgreSQL offers no means to understand the planner except EXPLAIN-ing the chosen plan?
2. And there's no road map to introduce planner hinting (like in EnterpriseDB or Ora)?

Regards,
Stefan

2014-03-20 18:08 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:
> Stefan Keller sfkel...@gmail.com writes:
>> I'd like to know from the query planner which query plan alternatives
>> have been generated and rejected. Is this possible?
>
> No, not really. People have occasionally hacked the planner to print
> rejected paths before they're discarded, but there's no convenient way
> to do anything except send the data to the postmaster log, which isn't
> all that convenient. A bigger problem is that people who are asking for
> this typically imagine that the planner generates complete plans before
> rejecting them; which it does not. Path alternatives are rejected
> whenever possible before moving up to the next join level, so that what
> we have rejected is actually just a plan fragment in most cases.
>
>             regards, tom lane
[PERFORM] Getting query plan alternatives from query planner?
Hi, I'd like to know from the query planner which query plan alternatives have been generated and rejected. Is this possible? --Stefan
Re: [PERFORM] FTS performance issue - planner problem identified (but only partially resolved)
Hi Kevin

Well, you're right :-) But my use cases are unspecific by design, since I'm using FTS as a general-purpose function. So I still propose to enhance the planner too, as Tom Lane and your colleague suggest based on repeated similar complaints [1].

Yours, Stefan

[1] http://www.postgresql.org/message-id/ca+tgmozgqbeu2kn305hwds+axw7yp0yn9vzwbsbwa8unst+...@mail.gmail.com

2013/7/29 Kevin Grittner kgri...@ymail.com: Stefan Keller sfkel...@gmail.com wrote: Finally, setting random_page_cost to 1 helps also - but I don't like this setting either. Well, you should learn to like whichever settings best model your actual costs given your level of caching and your workload. ;-) FWIW, I have found page costs less volatile and easier to tune with cpu_tuple_cost increased. I just always start by bumping that to 0.03. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PERFORM] How to properly index hstore tags column to faster search for keys
Hi Yuri and Radu-Stefan

I wouldn't give up too fast on PostgreSQL! When looking at your query plan, I wonder if one could reformulate the query to compute the ST_DWithin first (assuming you have an index on the node geometries!) before it filters the tags. To investigate that, you could formulate a CTE query [1] which computes the ST_DWithin first.

Yours, Stefan

[1] http://www.postgresql.org/docs/9.2/static/queries-with.html

2013/7/8 Yuri Levinsky yu...@celltick.com: Dear Radu-Stefan, It seems to me that you are trying hard to solve a problem by SQL that probably can't be solved. Take a look please at Apache HBase. You can access HBase from PostgreSQL as well, by utilizing Java or Python for example. Sincerely yours, Yuri Levinsky, DBA Celltick Technologies Ltd., 32 Maskit St., Herzliya 46733, Israel Mobile: +972 54 6107703, Office: +972 9 9710239; Fax: +972 9 9710222

From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Radu-Stefan Zugravu Sent: Monday, July 08, 2013 12:20 PM To: Richard Huxton Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to properly index hstore tags column to faster search for keys

Any improvement is welcomed. The overall performance of the application is not very good. It takes about 200 seconds to compute a path for not-so-far start and end points. I want to improve this query as much as I can. How exactly should I post the explain without the index? Do I have to drop all created indexes for the tags column? It takes some time to create them back.

On Mon, Jul 8, 2013 at 11:53 AM, Richard Huxton d...@archonet.com wrote: On 08/07/13 09:31, Radu-Stefan Zugravu wrote: Hi, Thank you for your answer. My EXPLAIN ANALYZE output can be found here: http://explain.depesz.com/s/Wbo.
Thanks. Also, there is a discussion on this subject on dba.stackexchange.com: http://dba.stackexchange.com/questions/45820/how-to-properly-index-hstore-tags-column-to-faster-search-for-keys

Thanks - also useful to know. I can't see anything wrong with your query. Reading it from the bottom upwards: 1. Index used for historic search - builds a bitmap of blocks. 2. Index used for geometry search - builds a bitmap of blocks. 3. See where the bitmaps overlap (BitmapAnd). 4. Grab those disk blocks and find the rows (Bitmap Heap Scan). The whole thing takes under 20ms - what sort of time were you hoping for? The bulk of it (15ms) is taken up locating the historic rows. There are 36351 of those, but presumably most of them are far away on the map. Could you post the explain without the index? I'm curious as to how slow it is just testing the tags after doing the geometry search. -- Richard Huxton Archonet Ltd -- Radu-Stefan Zugravu 0755 950 145 0760 903 464 raduzugrav...@gmail.com radu.zugr...@yahoo.com
[PERFORM] Re: FTS performance issue - planner problem identified (but only partially resolved)
Hi

Sorry, referring to the GiST index in my mail before was not a good idea. The bottom line still is that for the query (as recommended by the docs) the planner doesn't choose the index, which makes it slow - unless the original query...

select id,title,left(content,100) from fulltextsearch where plainto_tsquery('pg_catalog.english','good') @@ to_tsvector('pg_catalog.english',content);

...is reformulated as...

select id,title,left(content,100) from fulltextsearch, plainto_tsquery('pg_catalog.english','good') query where query @@ to_tsvector('pg_catalog.english',content);

...using default values for enable_seqscan and random_page_cost.

Yours, S.

2013/7/19 Stefan Keller sfkel...@gmail.com: Hi At 2013/2/8 I wrote: I have problems with the performance of FTS in a query like this: SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ plainto_tsquery('english', 'good'); It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped). The planner obviously always chooses a table scan. Now, I've identified (but only partially resolved) the issue. Here are my comments: That's the query in question (see commented log below): select id,title,left(content,100) from fulltextsearch where plainto_tsquery('pg_catalog.english','good') @@ to_tsvector('pg_catalog.english',content); After having created the GIN index, the FTS query unexpectedly is fast because the planner chooses a Bitmap Index Scan. After the index statistics have been updated, the same query becomes slow - except when using the trick with the function call in the FROM clause (I think GiST doesn't change anything): select id,title,left(content,100) from fulltextsearch, plainto_tsquery('pg_catalog.english','good') query where query @@ to_tsvector('pg_catalog.english',content); => This hint should be mentioned in the docs! Then, setting enable_seqscan to off makes the original query fast again. But that's a setting I want to avoid in a multi-user database.
Finally, setting random_page_cost to 1 helps also - but I don't like this setting either. => To me, the planner should be updated to recognize the immutable plainto_tsquery() function in the WHERE clause and choose a Bitmap Index Scan in the first place. What do you think?

Yours, Stefan

Let's look at table fulltextsearch:

movies=# \d fulltextsearch
                        Table "public.fulltextsearch"
 Column  |  Type   |                          Modifiers
---------+---------+--------------------------------------------------------------
 id      | integer | not null default nextval('fulltextsearch_id_seq'::regclass)
 docid   | integer | default 0
 title   | text    |
 content | text    | not null

movies=# CREATE INDEX fulltextsearch_gincontent ON fulltextsearch USING gin(to_tsvector('pg_catalog.english',content));

movies=# SELECT * FROM pg_class c WHERE relname LIKE 'fullt%';
  oid   |           name            | kind |   tuples    | pages | allvisible | toastrelid | hasindex
--------+---------------------------+------+-------------+-------+------------+------------+----------
 476289 | fulltextsearch            | r    |       27886 |   555 |          0 |     476293 | t
 503080 | fulltextsearch_gincontent | i    | 8.97135e+06 | 11133 |          0 |          0 | f
 476296 | fulltextsearch_id_seq     | S    |           1 |     1 |          0 |          0 | f
 503075 | fulltextsearch_pkey       | i    |       27886 |    79 |          0 |          0 | f
(4 rows)

=> fulltextsearch_gincontent has an arbitrarily large number of tuples (the statistics are wrong and not yet updated).

movies=# explain (analyze,costs,timing,buffers) select id,title,left(content,100) from fulltextsearch where plainto_tsquery('pg_catalog.english','good') @@ to_tsvector('pg_catalog.english',content);

=> Unexpectedly, the query is fast!
See query plan http://explain.depesz.com/s/ewn

Let's update the statistics:

movies=# VACUUM ANALYZE VERBOSE fulltextsearch;
movies=# SELECT * FROM pg_class c WHERE relname LIKE 'fullt%';
  oid   |           name            | kind | tuples | pages | allvisible | toastrelid | hasindex
--------+---------------------------+------+--------+-------+------------+------------+----------
 476289 | fulltextsearch            | r    |  27886 |   555 |        555 |     476293 | t
 503080 | fulltextsearch_gincontent | i    |  27886 | 11133 |          0 |          0 | f
 476296 | fulltextsearch_id_seq     | S    |      1 |     1 |          0 |          0 | f
 503075 | fulltextsearch_pkey       | i    |  27886 |    79 |          0 |          0 | f
(4 rows)

=> Now, after having updated the statistics (see especially the tuples of fulltextsearch_gincontent), the original query is slow! See query plan http://explain.depesz.com/s/MQ60

Now, let's reformulate the original query and move the function call to plainto_tsquery to the FROM clause:

movies=# explain (analyze,costs,timing,buffers) select id,title,left(content,100) from fulltextsearch, plainto_tsquery('pg_catalog.english','good
Re: [PERFORM] FTS performance issue - planner problem identified (but only partially resolved)
Hi Marc

Thanks a lot for your hint! You mean doing a SET track_counts (true); for the whole session? That would be OK if it were possible just for the GIN index. It's obviously an issue of the planner's cost estimation. The data I'm speaking about (movies) has a text attribute with a length of more than 8K, so it obviously has to do with detoasting. But the thoughts about @@ operators together with this GIN index also seem to be valid. I hope this issue is being tracked in preparation for 9.3.

Regards, Stefan

2013/7/19 Marc Mamin m.ma...@intershop.de: SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ plainto_tsquery('english', 'good'); It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped). The planner obviously always chooses a table scan. Hello, A probable reason for the time difference is the cost of decompressing toasted content. At least in 8.3, the planner was not good at estimating it. I'm getting better overall performance since I've stopped collecting statistics on tsvectors. An alternative would have been to disallow compression on them. I'm aware this is a drastic way and would not recommend it without testing. The benefit may depend on the type of data you are indexing. In our use case these are error logs with many Java stack traces, hence with many lexemes that are poorly discriminative. See: http://www.postgresql.org/message-id/27953.1329434...@sss.pgh.pa.us as a comment on http://www.postgresql.org/message-id/c4dac901169b624f933534a26ed7df310861b...@jenmail01.ad.intershop.net regards, Marc Mamin
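What Marc describes - no longer collecting statistics on the tsvector column - can be expressed per column; a minimal sketch, assuming the tsvector column is the content_tsv_gin column from the query above:

```sql
-- Setting the statistics target to 0 stops ANALYZE from collecting
-- statistics for this column (and from detoasting its values to do so).
ALTER TABLE FullTextSearch ALTER COLUMN content_tsv_gin SET STATISTICS 0;
ANALYZE FullTextSearch;
```

As Marc says, this is drastic and worth testing first; it trades planner information for avoiding the detoasting cost during ANALYZE.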
[PERFORM] FTS performance issue - planner problem identified (but only partially resolved)
Hi

At 2013/2/8 I wrote: I have problems with the performance of FTS in a query like this: SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ plainto_tsquery('english', 'good'); It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped). The planner obviously always chooses a table scan.

Now, I've identified (but only partially resolved) the issue. Here are my comments: That's the query in question (see commented log below): select id,title,left(content,100) from fulltextsearch where plainto_tsquery('pg_catalog.english','good') @@ to_tsvector('pg_catalog.english',content); After having created the GIN index, the FTS query unexpectedly is fast because the planner chooses a Bitmap Index Scan. After the index statistics have been updated, the same query becomes slow - except when using the trick with the function call in the FROM clause (I think GiST doesn't change anything): select id,title,left(content,100) from fulltextsearch, plainto_tsquery('pg_catalog.english','good') query where query @@ to_tsvector('pg_catalog.english',content); => This hint should be mentioned in the docs! Then, setting enable_seqscan to off makes the original query fast again. But that's a setting I want to avoid in a multi-user database. Finally, setting random_page_cost to 1 helps also - but I don't like this setting either. => To me, the planner should be updated to recognize the immutable plainto_tsquery() function in the WHERE clause and choose a Bitmap Index Scan in the first place. What do you think?
Yours, Stefan

Let's look at table fulltextsearch:

movies=# \d fulltextsearch
                        Table "public.fulltextsearch"
 Column  |  Type   |                          Modifiers
---------+---------+--------------------------------------------------------------
 id      | integer | not null default nextval('fulltextsearch_id_seq'::regclass)
 docid   | integer | default 0
 title   | text    |
 content | text    | not null

movies=# CREATE INDEX fulltextsearch_gincontent ON fulltextsearch USING gin(to_tsvector('pg_catalog.english',content));

movies=# SELECT * FROM pg_class c WHERE relname LIKE 'fullt%';
  oid   |           name            | kind |   tuples    | pages | allvisible | toastrelid | hasindex
--------+---------------------------+------+-------------+-------+------------+------------+----------
 476289 | fulltextsearch            | r    |       27886 |   555 |          0 |     476293 | t
 503080 | fulltextsearch_gincontent | i    | 8.97135e+06 | 11133 |          0 |          0 | f
 476296 | fulltextsearch_id_seq     | S    |           1 |     1 |          0 |          0 | f
 503075 | fulltextsearch_pkey       | i    |       27886 |    79 |          0 |          0 | f
(4 rows)

=> fulltextsearch_gincontent has an arbitrarily large number of tuples (the statistics are wrong and not yet updated).

movies=# explain (analyze,costs,timing,buffers) select id,title,left(content,100) from fulltextsearch where plainto_tsquery('pg_catalog.english','good') @@ to_tsvector('pg_catalog.english',content);

=> Unexpectedly, the query is fast! See query plan http://explain.depesz.com/s/ewn

Let's update the statistics:

movies=# VACUUM ANALYZE VERBOSE fulltextsearch;
movies=# SELECT * FROM pg_class c WHERE relname LIKE 'fullt%';
  oid   |           name            | kind | tuples | pages | allvisible | toastrelid | hasindex
--------+---------------------------+------+--------+-------+------------+------------+----------
 476289 | fulltextsearch            | r    |  27886 |   555 |        555 |     476293 | t
 503080 | fulltextsearch_gincontent | i    |  27886 | 11133 |          0 |          0 | f
 476296 | fulltextsearch_id_seq     | S    |      1 |     1 |          0 |          0 | f
 503075 | fulltextsearch_pkey       | i    |  27886 |    79 |          0 |          0 | f
(4 rows)

=> Now, after having updated the statistics (see especially the tuples of fulltextsearch_gincontent), the original query is slow!
See query plan http://explain.depesz.com/s/MQ60

Now, let's reformulate the original query and move the function call to plainto_tsquery to the FROM clause:

movies=# explain (analyze,costs,timing,buffers) select id,title,left(content,100) from fulltextsearch, plainto_tsquery('pg_catalog.english','good') query where query @@ to_tsvector('pg_catalog.english',content);

=> This special query is fast again! See query plan http://explain.depesz.com/s/FVT

Setting enable_seqscan to off makes the query fast again: see query plan http://explain.depesz.com/s/eOr

Finally, setting random_page_cost to 1 helps also (default is 4):

movies=# set enable_seqscan to default;
movies=# set random_page_cost to 1.0;

=> Query is fast. See query plan http://explain.depesz.com/s/M5Ke
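If the enable_seqscan trick has to be used at all, it can at least be confined to a single transaction so that other sessions in a multi-user database are unaffected; a small sketch using standard PostgreSQL behaviour (not from the thread itself):

```sql
BEGIN;
SET LOCAL enable_seqscan = off;   -- reverts automatically at COMMIT/ROLLBACK
SELECT id, title, left(content, 100)
FROM fulltextsearch
WHERE plainto_tsquery('pg_catalog.english', 'good')
      @@ to_tsvector('pg_catalog.english', content);
COMMIT;
```

SET LOCAL scopes the setting to the current transaction, which avoids the concern about changing planner behaviour for everyone.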
Re: [PERFORM] How to properly index hstore tags column to faster search for keys
Hi, Thank you for your answer. My EXPLAIN ANALYZE output can be found here: http://explain.depesz.com/s/Wbo. Also, there is a discussion on this subject on dba.stackexchange.com: http://dba.stackexchange.com/questions/45820/how-to-properly-index-hstore-tags-column-to-faster-search-for-keys

On Mon, Jul 8, 2013 at 10:44 AM, Richard Huxton d...@archonet.com wrote: On 07/07/13 08:28, Radu-Stefan Zugravu wrote: Each node has a geometry column called geom and an hstore column called tags. I need to extract nodes along a line that have certain keys in the tags column. To do that I use the following query: SELECT id, tags FROM nodes WHERE ST_DWithin(nodes.geom, ST_MakeLine('{$geom1}', '{$geom2}'), 0.001) AND tags ? '{$type}'; CREATE INDEX nodes_tags_idx ON nodes USING GIN(tags); After creating the index I searched again for nodes using the same first query, but there is no change in performance. How can I properly use GIN and GIST to index the tags column so I can search faster for nodes that have a certain key in the tags column? Your index definition looks OK. Try showing the output of EXPLAIN ANALYSE for your query - that way we'll see if the index is being used. You can always paste explain output to http://explain.depesz.com/ if it's too long for the email. -- Richard Huxton Archonet Ltd -- Radu-Stefan Zugravu 0755 950 145 0760 903 464 raduzugrav...@gmail.com radu.zugr...@yahoo.com
Re: [PERFORM] How to properly index hstore tags column to faster search for keys
I do call the query for each neighbour node to find which one is better for building my path. I think I will try the first way you mentioned. I also found some references using BTREE indexes: CREATE INDEX nodes_tags_btree_historic_idx on nodes USING BTREE ((tags ? 'historic')); CREATE INDEX nodes_tags_btree_tourist_idx on nodes USING BTREE ((tags ? 'tourist')); Do you think this could make a difference?

On Mon, Jul 8, 2013 at 1:27 PM, Richard Huxton d...@archonet.com wrote: On 08/07/13 10:20, Radu-Stefan Zugravu wrote: Any improvement is welcomed. The overall performance of the application is not very good. It takes about 200 seconds to compute a path for not-so-far start and end points. So you have to call this query 1000 times with different start and end points? I want to improve this query as much as I can. There are only two ways I can see to get this much below 20ms. This will only work if you want a very restricted range of tags. Drop the tag index and create multiple geometry indexes instead: CREATE INDEX node_geo_tourist_idx index details WHERE tags ? 'tourist'; CREATE INDEX node_geo_history_idx index details WHERE tags ? 'history'; etc. This will only work if you have a literal WHERE clause that checks the tag. It should be fast though. The second way would be to delete all the nodes that aren't tagged tourist or history. That assumes you are never interested in them, of course. How exactly should I post the explain without the index? Do I have to drop all created indexes for the tags column? It takes some time to create them back. Not important - I was just curious. -- Richard Huxton Archonet Ltd -- Radu-Stefan Zugravu 0755 950 145 0760 903 464 raduzugrav...@gmail.com radu.zugr...@yahoo.com
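Richard leaves the index details open; filled in with a concrete (hypothetical) GiST definition on the geometry column, his partial-index idea could look like this:

```sql
-- One geometry index per tag of interest; only rows carrying that tag
-- are stored in the index, so the geometry search never touches others.
CREATE INDEX node_geo_tourist_idx  ON nodes USING GIST (geom)
    WHERE tags ? 'tourist';
CREATE INDEX node_geo_historic_idx ON nodes USING GIST (geom)
    WHERE tags ? 'historic';

-- The planner can only use a partial index when the query repeats the
-- tag test literally, e.g.:
SELECT id, tags
FROM nodes
WHERE ST_DWithin(nodes.geom, ST_MakeLine('{$geom1}', '{$geom2}'), 0.001)
  AND tags ? 'tourist';
```

The trade-off is exactly what Richard states: one index per tag value, and no help for queries whose tag is only known at run time as a parameter.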
[PERFORM] How to properly index hstore tags column to faster search for keys
I imported a large area of OpenStreetMap's planet.osm file into a postgresql database. The database contains a table called nodes. Each node has a geometry column called geom and an hstore column called tags. I need to extract nodes along a line that have certain keys in the tags column. To do that I use the following query: SELECT id, tags FROM nodes WHERE ST_DWithin(nodes.geom, ST_MakeLine('{$geom1}', '{$geom2}'), 0.001) AND tags ? '{$type}'; $geom1 and $geom2 are geometries for the start and end points of my line. The $type variable contains the key I want to search for. Now, it can have one of the following values: 'historic' or 'tourist'. The query given above works but it is too slow. I guess searching for a key in the tags column takes too much time. I read about GIN and GIST indexes and I created a GIN index using the following statement: CREATE INDEX nodes_tags_idx ON nodes USING GIN(tags); After creating the index I searched again for nodes using the same first query, but there is no change in performance. How can I properly use GIN and GIST to index the tags column so I can search faster for nodes that have a certain key in the tags column? Thank you, Radu-Stefan
[PERFORM] Slow CTE Query
Hi, I'm experiencing a very slow CTE query (see below). When I split the three aggregations into three separate views, it's decently fast. So I think it's due to the planner. Any ideas for reformulating the query?

These are the tables and views involved:
* Table promotion with start/end date and a region, and table promo2mission (each one to a dozen tuples).
* View all_errors (more than 20'000 tuples, based on table errors without the tuples from table fix).
* Table error_type (7 tuples).

Here's the EXPLAIN ANALYZE log: http://explain.depesz.com/s/tbF

Yours, Stefan

CTE Query: WITH aggregation1 AS (SELECT p.id AS promo_id, p.startdate, p.enddate, p.geom AS promogeom, pm.error_type, pm.mission_extra_coins AS extra_coins FROM (promotion p join promo2mission pm ON (( p.id = pm.promo_id ))) WHERE ( ( p.startdate <= Now() ) AND ( p.enddate >= Now() ) )), aggregation2 AS (SELECT e.error_id AS missionid, e.schemaid, t.TYPE, e.osm_id, e.osm_type, t.description AS title, t.view_type, t.answer_placeholder, t.bug_question AS description, t.fix_koin_count, t.vote_koin_count, e.latitude, e.longitude, e.geom AS missiongeom, e.txt1, e.txt2, e.txt3, e.txt4, e.txt5 FROM all_errors e, error_type t WHERE ( ( e.error_type_id = t.error_type_id ) AND ( NOT ( EXISTS (SELECT 1 FROM fix f WHERE ( ( ( ( f.error_id = e.error_id ) AND ( f.osm_id = e.osm_id ) ) AND ( ( f.schemaid ) :: text = ( e.schemaid ) :: text ) ) AND ( ( f.complete AND f.valid ) OR ( NOT f.complete ) ) )) ) ) )), aggregation3 AS (SELECT ag2.missionid AS missionidtemp, ag1.promo_id, ag1.extra_coins FROM (aggregation2 ag2 join aggregation1 ag1 ON (( ( ag2.TYPE ) :: text = ( ag1.error_type ) :: text ))) WHERE public._st_contains(ag1.promogeom, ag2.missiongeom)) SELECT ag2.missionid AS id, ag2.schemaid, ag2.TYPE, ag2.osm_id, ag2.osm_type, ag2.title, ag2.description, ag2.latitude, ag2.longitude, ag2.view_type, ag2.answer_placeholder, ag2.fix_koin_count, ag2.missiongeom, ag2.txt1, ag2.txt2, ag2.txt3, ag2.txt4, ag2.txt5, ag3.promo_id,
ag3.extra_coins FROM (aggregation2 ag2 left join aggregation3 ag3 ON (( ag2.missionid = ag3.missionidtemp )));
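The "three separate views" variant Stefan reports as fast would look roughly like this (view names are hypothetical; only aggregation1 is spelled out, using the column list from the CTE above):

```sql
CREATE VIEW aggregation1_v AS
SELECT p.id AS promo_id, p.startdate, p.enddate, p.geom AS promogeom,
       pm.error_type, pm.mission_extra_coins AS extra_coins
FROM promotion p
JOIN promo2mission pm ON p.id = pm.promo_id
WHERE p.startdate <= now() AND p.enddate >= now();

-- aggregation2_v and aggregation3_v are created the same way from the
-- corresponding CTE bodies; the final SELECT then joins the views
-- instead of inlining the CTEs.
```

In the PostgreSQL versions of this thread, CTEs act as optimization fences (each WITH term is materialized), whereas view definitions are inlined into the outer query and planned as a whole, which is one plausible reason for the difference Stefan sees.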
Re: [PERFORM] [BUGS] BUG #8130: Hashjoin still gives issues
Dear Tom,

On Wed, 1 May 2013, Tom Lane wrote: What can we do to provide a bit more of information? https://wiki.postgresql.org/wiki/Slow_Query_Questions There is no particularly good reason to think this is a bug; please take it up on pgsql-performance if you have more questions.

I beg to disagree: the select * query and the select b.id query are both run hot, yet they result in fundamentally different query plans (and performance). Combined with the recent bugfix regarding hash estimation, this gives me a good indication that there might be a bug. I am not deep into the query optimiser of PostgreSQL, but given that merely selecting different columns can change an entire query plan (and * is in fact out of the box 30 times faster than b.id), it does. When hashjoin is disabled, the entire query is - depending on the system checked - 2 to 30x faster.

The original query: select * from ambit_privateevent_calendars as a, ambit_privateevent as b, ambit_calendarsubscription as c, ambit_calendar as d where c.calendar_id = d.id and a.privateevent_id = b.id and c.user_id = 1270 and c.calendar_id = a.calendar_id and c.STATUS IN (1, 8, 2, 15, 18, 4, 12, 20) and not b.main_recurrence = true; select b.id from ambit_privateevent_calendars as a, ambit_privateevent as b, ambit_calendarsubscription as c, ambit_calendar as d where c.calendar_id = d.id and a.privateevent_id = b.id and c.user_id = 1270 and c.calendar_id = a.calendar_id and c.STATUS IN (1, 8, 2, 15, 18, 4, 12, 20) and not b.main_recurrence = true; (the two queries are identical except for the select list; the star query is *fastest*)

We compare: http://explain.depesz.com/s/jRx http://explain.depesz.com/s/eKE By setting set enable_hashjoin = off; throughput in our entire application increased 30-fold, which was a bit unexpected but highly appreciated.
The result of the last query after switching to the mergejoin: http://explain.depesz.com/s/AWB It is also visible that once hashjoin is off, the b.id query is faster than the * query (as would be expected). Our test machine is over-budgeted, with 4x the memory of the entire database (~4GB), and uses the stock PostgreSQL settings. Stefan
Re: [PERFORM] FTS performance issue probably due to wrong planner estimate of detoasting
Hi Jesper and Pavel

Thanks for your hints. I'm rather reluctant to do tuning with unwanted side effects; we'll see. I have to set up my system and db again before I can try out your tricks.

Yours, Stefan

2013/2/8 Jesper Krogh jes...@krogh.cc: On 08/02/13 01:52, Stefan Keller wrote: Hi, I have problems with the performance of FTS in a query like this: SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ plainto_tsquery('english', 'good'); It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped). The planner obviously always chooses a table scan: http://explain.depesz.com/s/EEE I have to check again if I'm doing something wrong, but I'm pretty sure it has to do with de-toasting and (wrong?) cost estimations. If you haven't done it: bump up the statistics target on the column and re-analyze, and see what that gives. I have also been playing with the cost numbers in order to get it to favour an index scan more often. That is, lowering random_page_cost to be close to seq_page_cost; depending on your system, the amount of memory, etc., this can have negative side effects on non-GIN queries. -- Jesper
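Jesper's two suggestions, written out as a hedged sketch (the target value of 1000 is an arbitrary example, and the tsvector column name is taken from the query in the thread):

```sql
-- Raise the per-column statistics target and refresh the statistics.
ALTER TABLE FullTextSearch ALTER COLUMN content_tsv_gin SET STATISTICS 1000;
ANALYZE FullTextSearch;

-- Nudge the planner towards index scans by narrowing the gap between
-- random and sequential page costs (watch for side effects on other queries).
SET random_page_cost = 2.0;
```

As Jesper warns, the cost change is system-dependent and can hurt non-GIN queries, so it is best tried per session before being made permanent.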
[PERFORM] FTS performance issue probably due to wrong planner estimate of detoasting
Hi, I have problems with the performance of FTS in a query like this: SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ plainto_tsquery('english', 'good'); It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped). The planner obviously always chooses a table scan: http://explain.depesz.com/s/EEE I have to check again if I'm doing something wrong, but I'm pretty sure it has to do with de-toasting and (wrong?) cost estimations. I've seen some comments here saying that estimating detoasting costs (especially with the @@ operator and a GIN index) is an open issue (for years?). And I found a nice blog post [1] which uses 9.2/9.1 and proposes disabling sequential table scans (SET enable_seqscan = off;). But this is no option for me, since other queries still need seqscans. Can anyone tell me if this is on some agenda here (e.g. as an open item for 9.2)? Yours, Stefan [1] http://palominodb.com/blog/2012/03/06/considerations-about-text-searchs-big-fields-and-planner-costs
Re: [PERFORM] serious under-estimation of n_distinct for clustered distributions
A status update on this problem: 1.) Workarounds (setting n_distinct manually) are tested and - as far as workarounds go - OK. 2.) Source of the problem and possible solution: The source of these troubles is the sampling method employed in src/backend/commands/analyze.c. Judging from Tom Lane's comment for the original implementation in 2004 this has never been thought to be perfect. Does anybody see a chance to improve that part? Should this discussion be taken elsewhere? Is there any input from my side that could help? btw: I do find this problem to be very frequent in our databases. And considering the commonplace conditions leading to it, I would expect many systems to be affected. But searching the forums and the web I hardly found any references to it - which amazes me to no end. Best Regards, Stefan On 12/30/2012 07:02 PM, Stefan Andreatta wrote: On 12/29/2012 10:57 PM, Peter Geoghegan wrote: On 29 December 2012 20:57, Stefan Andreatta s.andrea...@synedra.com wrote: Now, the 2005 discussion goes into great detail on the advantages and disadvantages of this algorithm, particularly when using small sample sizes, and several alternatives are discussed. I do not know whether anything has been changed after that, but I know that the very distinct problem, which I will focus on here, still persists. It's a really hard problem to solve satisfactorily. It's a problem that has been studied in much detail. Yes, the algorithm used is still the same. See the comments within src/backend/commands/analyze.c (IBM Research Report RJ 10025 is referenced there). Thanks a lot for this information! I looked through the code a bit. The Haas Stokes Formula is fine. The problem really lies with the two phase random selection procedure: Starting from line 1039, there is a comment: * As of May 2004 we use a new two-stage method: Stage one selects up * to targrows random blocks (or all blocks, if there aren't so many). 
* Stage two scans these blocks and uses the Vitter algorithm to create * a random sample of targrows rows (or less, if there are less in the * sample of blocks). The two stages are executed simultaneously: each * block is processed as soon as stage one returns its number and while * the rows are read stage two controls which ones are to be inserted * into the sample. * * Although every row has an equal chance of ending up in the final * sample, this sampling method is not perfect: not every possible * sample has an equal chance of being selected. For large relations * the number of different blocks represented by the sample tends to be * too small. We can live with that for now. Improvements are welcome. Now the problem with clustered data is, that the probability of sampling a value twice is much higher when the same page is repeatedly sampled. As stage one takes a random sample of pages, and stage two samples rows from these pages, the probability of visiting the same page twice (or more often) is much higher than if random rows were selected from the whole table. Hence we get a lot more multiple values for clustered data and we end up with the severe under-estimation we can see in those cases. Probabilities do my brain in, as usual, but I tested the procedure for my test data with a simple python script. There is absolutely nothing wrong with the implementation. It seems to be a purely statistical problem. Not everything may be hopeless though ;-) The problem could theoretically be avoided if random rows were selected from the whole table. Again, that may not be feasible - the two phase approach was probably not implemented for nothing. Another possible solution would be to avoid much of the resampling (not all) in phase two. For that - in theory - every page visited would have to get a lower weight, so that revisiting this page is not any more likely as rows were selected from the whole column. That does not sound easy or elegant to implement. 
But perhaps there is some clever algorithm - unfortunately I do not know. The general advice here is: 1) Increase default_statistics_target for the column. 2) If that doesn't help, consider using the following DDL: alter table foo alter column bar set ( n_distinct = 5.0); Yes, I will probably have to live with that for now - I will come back to these workarounds with one or two questions. Thanks again Regards, Stefan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
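The sampling effect described above can be checked with a short simulation, in the spirit of the "simple python script" mentioned. This is an illustrative sketch, not the actual analyze.c algorithm; the table layout, page geometry and sample sizes are made-up assumptions:

```python
import random

# Sketch (assumptions, not analyze.c itself): compare f1 -- the number of
# values seen exactly once in the sample -- under pure row sampling vs.
# two-stage block sampling, for a clustered column.
ROWS_PER_PAGE = 50           # assumed rows per block
N_PAGES = 10_000             # table of 500,000 rows
DUP = 5                      # each value appears 5 times, stored adjacently
SAMPLE = 3_000               # target sample size

random.seed(42)
table = [i // DUP for i in range(N_PAGES * ROWS_PER_PAGE)]  # clustered values

def f1(sample):
    # count values that occur exactly once in the sample
    counts = {}
    for v in sample:
        counts[v] = counts.get(v, 0) + 1
    return sum(1 for c in counts.values() if c == 1)

# Pure row sampling: rows drawn uniformly from the whole table.
row_sample = random.sample(table, SAMPLE)

# Two-stage: pick random pages first, then sample rows only from those pages.
pages = random.sample(range(N_PAGES), SAMPLE // ROWS_PER_PAGE * 3)
pooled = [v for p in pages
            for v in table[p * ROWS_PER_PAGE:(p + 1) * ROWS_PER_PAGE]]
block_sample = random.sample(pooled, SAMPLE)

print("f1, row sampling:  ", f1(row_sample))
print("f1, block sampling:", f1(block_sample))
```

Because a chosen page contributes several adjacent (equal) values, f1 collapses under block sampling, which is exactly the term that drives the Haas/Stokes estimate down to the sample's distinct count.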
Re: [PERFORM] Simple join doesn't use index
On 01/03/2013 11:54 PM, Alex Vinnik wrote: Don't understand why PG doesn't use views_visit_id_index in that query but rather scans whole table. One explanation I have found that when resulting dataset constitutes ~15% of total number of rows in the table then seq scan is used. In this case resulting dataset is just 1.5% of total number of rows. So it must be something different. Any reason why it happens and how to fix it? But does the query planner know the same? If you added the EXPLAIN ANALYZE output of the query and something like: SELECT tablename AS table_name, attname AS column_name, null_frac, avg_width, n_distinct, correlation FROM pg_stats WHERE tablename in ('views', 'visits'); .. one could possibly tell a bit more. Postgres 9.2 Ubuntu 12.04.1 LTS shared_buffers = 4GB the rest of the settings are default ones There are more than just this one memory related value, that need to be changed for optimal performance. E.g. effective_cache_size can have a direct effect on use of nested loops. See: http://www.postgresql.org/docs/9.2/static/runtime-config-query.html Regards, Stefan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] serious under-estimation of n_distinct for clustered distributions
On 12/29/2012 10:57 PM, Peter Geoghegan wrote: On 29 December 2012 20:57, Stefan Andreatta s.andrea...@synedra.com wrote: ... The general advice here is: 1) Increase default_statistics_target for the column. I tried that, but to get good estimates under these circumstances, I need to set the statistics_target so high that the whole table gets analyzed. As this problem matters most for all of our large tables, I would have to set default_statistics_target to something like 10 - that's a bit scary for production systems with tables of appr. 100GB, I find. 2) If that doesn't help, consider using the following DDL: alter table foo alter column bar set ( n_distinct = 5.0); Yes, that's probably best - even if it means quite some maintenance work. I do it like that: ALTER TABLE test_1 ALTER COLUMN clustered_random_2000k SET (n_distinct = -0.05); btw: Postgres will never set relative n_distinct values for anything larger than -0.1. If I determine (or know) it to be a constant but lower fraction, could it be a problem to explicitly set this value to between -0.1 and 0? To activate that setting, however, an ANALYZE has to be run. 
That was not clear to me from the documentation: ANALYZE verbose test_1; To check column options and statistics values: SELECT pg_class.relname AS table_name, pg_attribute.attname AS column_name, pg_attribute.attoptions FROM pg_attribute JOIN pg_class ON pg_attribute.attrelid = pg_class.oid WHERE pg_attribute.attnum > 0 AND pg_class.relname = 'test_1' AND pg_attribute.attname = 'clustered_random_2000k'; SELECT tablename AS table_name, attname AS column_name, null_frac, avg_width, n_distinct, correlation FROM pg_stats WHERE tablename = 'test_1' and attname = 'clustered_random_2000k'; And finally, we can undo the whole thing, if necessary: ALTER TABLE test_1 ALTER COLUMN clustered_random_2000k RESET (n_distinct); ANALYZE VERBOSE test_1; Regards, Stefan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] serious under-estimation of n_distinct for clustered distributions
On 12/29/2012 10:57 PM, Peter Geoghegan wrote: On 29 December 2012 20:57, Stefan Andreatta s.andrea...@synedra.com wrote: Now, the 2005 discussion goes into great detail on the advantages and disadvantages of this algorithm, particularly when using small sample sizes, and several alternatives are discussed. I do not know whether anything has been changed after that, but I know that the very distinct problem, which I will focus on here, still persists. It's a really hard problem to solve satisfactorily. It's a problem that has been studied in much detail. Yes, the algorithm used is still the same. See the comments within src/backend/commands/analyze.c (IBM Research Report RJ 10025 is referenced there). Thanks a lot for this information! I looked through the code a bit. The Haas Stokes Formula is fine. The problem really lies with the two phase random selection procedure: Starting from line 1039, there is a comment: * As of May 2004 we use a new two-stage method: Stage one selects up * to targrows random blocks (or all blocks, if there aren't so many). * Stage two scans these blocks and uses the Vitter algorithm to create * a random sample of targrows rows (or less, if there are less in the * sample of blocks). The two stages are executed simultaneously: each * block is processed as soon as stage one returns its number and while * the rows are read stage two controls which ones are to be inserted * into the sample. * * Although every row has an equal chance of ending up in the final * sample, this sampling method is not perfect: not every possible * sample has an equal chance of being selected. For large relations * the number of different blocks represented by the sample tends to be * too small. We can live with that for now. Improvements are welcome. Now the problem with clustered data is, that the probability of sampling a value twice is much higher when the same page is repeatedly sampled. 
As stage one takes a random sample of pages, and stage two samples rows from these pages, the probability of visiting the same page twice (or more often) is much higher than if random rows were selected from the whole table. Hence we get a lot more multiple values for clustered data and we end up with the severe under-estimation we can see in those cases. Probabilities do my brain in, as usual, but I tested the procedure for my test data with a simple python script. There is absolutely nothing wrong with the implementation. It seems to be a purely statistical problem. Not everything may be hopeless though ;-) The problem could theoretically be avoided if random rows were selected from the whole table. Again, that may not be feasible - the two phase approach was probably not implemented for nothing. Another possible solution would be to avoid much of the resampling (not all) in phase two. For that - in theory - every page visited would have to get a lower weight, so that revisiting this page is not any more likely than if rows were selected from the whole column. That does not sound easy or elegant to implement. But perhaps there is some clever algorithm - unfortunately I do not know. The general advice here is: 1) Increase default_statistics_target for the column. 2) If that doesn't help, consider using the following DDL: alter table foo alter column bar set ( n_distinct = 5.0); Yes, I will probably have to live with that for now - I will come back to these workarounds with one or two questions. Thanks again Regards, Stefan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] serious under-estimation of n_distinct for clustered distributions
I have encountered serious under-estimations of distinct values when values are not evenly distributed but clustered within a column. I think this problem might be relevant to many real-world use cases and I wonder if there is a good workaround or possibly a programmatic solution that could be implemented. Thanks for your help! Stefan *The Long Story:* When Postgres collects statistics, it estimates the number of distinct values for every column (see pg_stats.n_distinct). This is one important source for the planner to determine the selectivity and hence can have great influence on the resulting query plan. My Problem: When I collected statistics on some columns that have rather high selectivity but not anything like unique values, I consistently got n_distinct values that are far too low (easily by some orders of magnitude). Worse still, the estimates did not really improve until I analyzed the whole table. I tested this for Postgres 9.1 and 9.2. An artificial test-case is described at the end of this mail. Some Analysis of the Problem: Unfortunately it is not trivial to estimate the total number of different values based on a sample. As far as I found out, Postgres uses an algorithm that is based on the number of values that are found only once in the sample used for ANALYZE. I found references to Good-Turing frequency estimation (http://encodestatistics.org/publications/statistics_and_postgres.pdf) and to a paper from Haas Stokes, Computer Science, 1996 (http://researcher.ibm.com/researcher/files/us-phaas/jasa3rj.pdf). The latter source is from Josh Berkus in a 2005 discussion on the Postgres Performance List (see e.g. http://grokbase.com/t/postgresql/pgsql-performance/054kztf8pf/bad-n-distinct-estimation-hacks-suggested for a look on the whole discussion there). The formula given there for the total number of distinct values is: n*d / (n - f1 + f1*n/N) where f1 is the number of values that occurred only once in the sample. 
n is the number of rows sampled, d the number of distincts found and N the total number of rows in the table. Now, the 2005 discussion goes into great detail on the advantages and disadvantages of this algorithm, particularly when using small sample sizes, and several alternatives are discussed. I do not know whether anything has been changed after that, but I know that the very distinct problem, which I will focus on here, still persists. When the number of values that are found only once in the sample (f1) becomes zero, the whole term equals d, that is, n_distinct is estimated to be just the number of distincts found in the sample. This is basically fine as it should only happen when the sample has really covered more or less all distinct values. However, we have a sampling problem here: for maximum efficiency Postgres samples not random rows but random pages. If the distribution of the values is not random but clustered (that is, the same values tend to be close together) we run into problems. The probability that any value from a clustered distribution is sampled only once, when any page covers multiple adjacent rows, is very low. So, under these circumstances, the estimate for n_distinct will always be close to the number of distincts found in the sample. Even if every value would in fact only appear a few times in the table. Relevance: I think this is not just an unfortunate border case, but a very common situation. Imagine two tables that are filled continually over time where the second table references the first - some objects and multiple properties for each for example. Now the foreign key column of the second table will have many distinct values but a highly clustered distribution. It is probably not helpful, if the planner significantly underestimates the high selectivity of the foreign key column. 
Workarounds: There are workarounds: manually setting table column statistics or using an extremely high statistics target, so that the whole table gets analyzed. However, these workarounds do not seem elegant and may be impractical. Questions: A) Did I find the correct reason for my problem? Specifically, does Postgres really estimate n_distinct as described above? B) Are there any elegant workarounds? C) What could be a programmatic solution to this problem? I think, it might be possible to use the number of values that are found in only one page (vs. found only once at all) for f1. Or the number of distincts could be calculated using some completely different approach? Test Case: For an artificial test-case let's create a table and fill it with 10 million rows (appr. 1,300 MB required). There is an ID column featuring unique values and 4 groups of 3 columns each that have selectivities of: - 5 (x_2000k = 2,000,000 distinct values) - 25 (x_400k = 400,000 distinct values) - 125 (x_80k = 80,000 distinct values). The 4 groups of columns show different distributions
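The Haas/Stokes formula quoted above is easy to sanity-check numerically. A small sketch with invented figures (the sample and table sizes below are illustrative assumptions, not the test case's actual numbers):

```python
# Sketch of the quoted estimator: n*d / (n - f1 + f1*n/N), where
# n = rows sampled, d = distincts seen, f1 = values seen exactly once,
# N = total rows in the table. All numbers below are made up.
def n_distinct_estimate(n, d, f1, N):
    return n * d / (n - f1 + f1 * n / N)

n, N = 30_000, 10_000_000   # 30k-row sample from a 10M-row table
d = 25_000                  # distincts seen in the sample

# Random distribution: most sampled values are seen only once,
# so f1 is large and the estimate extrapolates well beyond d.
print(n_distinct_estimate(n, d, f1=20_000, N=N))   # ~ 74,553

# Clustered distribution: block sampling drives f1 toward 0, and the
# whole term collapses to d, the distincts actually seen in the sample.
print(n_distinct_estimate(n, d, f1=0, N=N))        # 25000.0
```

This reproduces the failure mode described: with f1 = 0 the estimator cannot do better than the sample's own distinct count, however many distincts the table really holds.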
Re: [PERFORM] Index over all partitions (aka global index)?
Yes a physical index would be one solution - but it's not the only one. The indexes could be treated in parallel in their physical places where they are. That's why I called it still logical. I don't think that I would lose all benefits of partitioning since an index could adapt itself when partitions are attached or removed. That's probably how Oracle resolves it which knows global indexes probably since version 8(!) [1] Yours, S. [1] http://www.oracle-base.com/articles/8i/partitioned-tables-and-indexes.php 2012/10/14 Jeff Janes jeff.ja...@gmail.com: On Sat, Oct 13, 2012 at 5:43 PM, Stefan Keller sfkel...@gmail.com wrote: Say, there is a table with 250 mio. rows split into 250 tables with 1 mio. rows each. And say the index behavior is O(log n). Then a search for a key takes O(log(250*n)) or 8.4 time units. What PG (9.1) currently probably does is an iterative call to all 250 partitioned tables, which will take O(250*log(n)) - or 1500 time units in this case. This is about 180 times slower. What do you think about introducing a global index over all partitions (like Ora :-)? This would be a (logically) single index which can even be parallelized given the partitioned tables are optimally distributed like in different tablespaces. What do you think about this? What you already have is a logically single index. What you want is a physically single index. But wouldn't that remove most of the benefits of partitioning? You could no longer add or remove partitions instantaneously, for example. Cheers, Jeff -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Index over all partitions (aka global index)?
Hi, Given I have a large table implemented with partitions and need fast access to a (primary) key value in a scenario where every minute updates (inserts/updates/deletes) are coming in. Now since PG does not allow any index (nor constraint) on master table, I have a performance issue (and a possible parallelization opportunity). Say, there is a table with 250 mio. rows split into 250 tables with 1 mio. rows each. And say the index behavior is O(log n). Then a search for a key takes O(log(250*n)) or 8.4 time units. What PG (9.1) currently probably does is an iterative call to all 250 partitioned tables, which will take O(250*log(n)) - or 1500 time units in this case. This is about 180 times slower. What do you think about introducing a global index over all partitions (like Ora :-)? This would be a (logically) single index which can even be parallelized given the partitioned tables are optimally distributed like in different tablespaces. What do you think about this? -S. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
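The back-of-the-envelope numbers above can be reproduced directly (assuming base-10 logarithms, which is what makes log(250,000,000) come out at 8.4 "time units"):

```python
import math

# One logical lookup in a hypothetical global index over 250M rows,
# vs. probing 250 per-partition indexes of 1M rows each.
total_rows = 250 * 1_000_000

global_index  = math.log10(total_rows)        # ~ 8.4 "time units"
per_partition = 250 * math.log10(1_000_000)   # 250 * 6 = 1500

print(round(global_index, 1))                 # 8.4
print(per_partition)                          # 1500.0
print(round(per_partition / global_index))    # ~ 179, i.e. "about 180 times slower"
```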
Re: [PERFORM] Inserts in 'big' table slowing down the database
Sorry for the delay. I had to sort out the problem (among other things). It's mainly about swapping. The table nodes contains about 2^31 entries and occupies about 80GB on disk space plus index. If one would store the geom values in a big array (where id is the array index) it would only make up about 16GB, which means that the ids are dense (with few deletes). Then updates come in every hour as bulk insert statements with entries having ids in sorted manner. Now PG becomes slower and slower! CLUSTER could help - but obviously this operation needs a table lock. And if this operation takes longer than an hour, it delays the next update. Any ideas? Partitioning? Yours, S. 2012/9/3 Ivan Voras ivo...@freebsd.org: On 03/09/2012 13:03, Stefan Keller wrote: Hi, I'm having performance issues with a simple table containing 'Nodes' (points) from OpenStreetMap: CREATE TABLE nodes ( id bigint PRIMARY KEY, user_name text NOT NULL, tstamp timestamp without time zone NOT NULL, geom GEOMETRY(POINT, 4326) ); CREATE INDEX idx_nodes_geom ON nodes USING gist (geom); The number of rows grows steadily and soon reaches one billion (1'000'000'000), therefore the bigint id. Now, hourly inserts (update and deletes) are slowing down the database (PostgreSQL 9.1) constantly. Before I'm looking at non-durable settings [1] I'd like to know what choices I have to tune it while keeping the database productive: cluster index? partition table? use tablespaces? reduce physical block size? You need to describe in detail what does slowing down mean in your case. Do the disk drives somehow do more operations per transaction? Does the database use more CPU cycles? Is there swapping? What is the expected (previous) performance? At a guess, it is very unlikely that using non-durable settings will help you here. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Inserts in 'big' table slowing down the database
Hi, I'm having performance issues with a simple table containing 'Nodes' (points) from OpenStreetMap: CREATE TABLE nodes ( id bigint PRIMARY KEY, user_name text NOT NULL, tstamp timestamp without time zone NOT NULL, geom GEOMETRY(POINT, 4326) ); CREATE INDEX idx_nodes_geom ON nodes USING gist (geom); The number of rows grows steadily and soon reaches one billion (1'000'000'000), therefore the bigint id. Now, hourly inserts (update and deletes) are slowing down the database (PostgreSQL 9.1) constantly. Before I'm looking at non-durable settings [1] I'd like to know what choices I have to tune it while keeping the database productive: cluster index? partition table? use tablespaces? reduce physical block size? Stefan [1] http://www.postgresql.org/docs/9.1/static/non-durability.html -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m
Hi 2012/8/8 Jeff Janes jeff.ja...@gmail.com: On Tue, Aug 7, 2012 at 5:07 PM, Stefan Keller sfkel...@gmail.com wrote: Hi Craig Clever proposal! I slightly tried to adapt it to the hstore involved. Now I'm having a weird problem that PG says that relation 'p' does not exist. Why does PG recognize table b in the subquery but not table p? Any ideas? I don't think it does recognize b, either. It just fell over on p before it had a chance to fall over on b. No, the b gets recognized. See my original query. That's a strange behaviour of the SQL parser which I can't understand. I think you have to use WITH if you want to reference the same subquery in multiple FROMs. I'll try that with CTE too. Another approach would be to add explicit conditions for there being at least 1 school and 1 pharmacy within distance. There can't be >1 unless there is >=1, but the join possibilities for >=1 (i.e. where exists rather than where (select count(*)...)>1 ) are much more attractive than the ones for >1. Cheers, Jeff You mean, first doing a select on existence and then applying the count condition later? Stefan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m
Hi I have an interesting query to be optimized related to this one [1]. The query definition is: Select all buildings that have more than 1 pharmacies and more than 1 schools within a radius of 1000m. The problem is that I think that this query is inherently O(n^2). In fact the solution I propose below takes forever... My questions: 1. Any comments about the nature of this problem? 2. ... on how to speed it up ? 3. In the original query [1] there's a count which contains a subquery. According to my tests PostgreSQL does not allow this despite the documentation which says count(expression). Remarks: I know that count(*) could be faster on PostgreSQL but count(osm_id) does not change the query plan and this does not seem to be the bottleneck here anyway. Yours, S. [1] http://gis.stackexchange.com/questions/11445/selecting-pois-around-specific-buildings-using-postgis Here's my query: -- Select all buildings that have >1 pharmacies and >1 schools within 1000m: SELECT osm_id AS building_id FROM (SELECT osm_id, way FROM osm_polygon WHERE tags @> hstore('building','yes') ) AS b WHERE (SELECT count(*) > 1 FROM osm_poi AS p WHERE p.tags @> hstore('amenity','pharmacy') AND ST_DWithin(b.way,p.way,1000) ) AND (SELECT count(*) > 1 FROM osm_poi AS p WHERE p.tags @> hstore('amenity','school') AND ST_DWithin(b.way,p.way,1000) ) -- Total query runtime: 4308488 ms. 66345 rows retrieved. 
Here's the query plan (from EXPLAIN): Index Scan using osm_polygon_tags_idx on osm_polygon (cost=0.00..406812.81 rows=188 width=901) Index Cond: (tags @> 'building=yes'::hstore) Filter: ((SubPlan 1) AND (SubPlan 2)) SubPlan 1 -> Aggregate (cost=269.19..269.20 rows=1 width=0) -> Bitmap Heap Scan on osm_poi p (cost=7.76..269.19 rows=1 width=0) Recheck Cond: (way && st_expand(osm_polygon.way, 1000::double precision)) Filter: ((tags @> 'amenity=pharmacy'::hstore) AND (osm_polygon.way && st_expand(way, 1000::double precision)) AND _st_dwithin(osm_polygon.way, way, 1000::double precision)) -> Bitmap Index Scan on osm_poi_way_idx (cost=0.00..7.76 rows=62 width=0) Index Cond: (way && st_expand(osm_polygon.way, 1000::double precision)) SubPlan 2 -> Aggregate (cost=269.19..269.20 rows=1 width=0) -> Bitmap Heap Scan on osm_poi p (cost=7.76..269.19 rows=1 width=0) Recheck Cond: (way && st_expand(osm_polygon.way, 1000::double precision)) Filter: ((tags @> 'amenity=school'::hstore) AND (osm_polygon.way && st_expand(way, 1000::double precision)) AND _st_dwithin(osm_polygon.way, way, 1000::double precision)) -> Bitmap Index Scan on osm_poi_way_idx (cost=0.00..7.76 rows=62 width=0) Index Cond: (way && st_expand(osm_polygon.way, 1000::double precision)) *** -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m
Your proposal lacks the requirement that it's the same building from where pharmacies and schools are reachable. But I'll think about it. Yours, S. 2012/8/7 Tomas Vondra t...@fuzzy.cz: On 7 Srpen 2012, 14:01, Stefan Keller wrote: Hi I have an interesting query to be optimized related to this one [1]. The query definition is: Select all buildings that have more than 1 pharmacies and more than 1 schools within a radius of 1000m. The problem is that I think that this query is inherently O(n^2). In fact the solution I propose below takes forever... What about plain INTERSECT? Something like SELECT osm_id FROM osm_poi AS p, osm_polygon b WHERE p.tags @> hstore('amenity','pharmacy') AND ST_DWithin(b.way,p.way,1000) INTERSECT SELECT osm_id FROM osm_poi AS p, osm_polygon b WHERE p.tags @> hstore('amenity','school') AND ST_DWithin(b.way,p.way,1000) Or something like that. But maybe it's a complete nonsense ... Tomas -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m
Hi Craig Clever proposal! I slightly tried to adapt it to the hstore involved. Now I'm having a weird problem that PG says that relation 'p' does not exist. Why does PG recognize table b in the subquery but not table p? Any ideas? -- Stefan SELECT b.way AS building_geometry FROM (SELECT way FROM osm_polygon WHERE tags @> hstore('building','yes') ) AS b, (SELECT way, tags->'amenity' as value FROM osm_poi WHERE tags ? 'amenity' ) AS p WHERE (SELECT count(*) > 1 FROM p WHERE p.value = 'pharmacy' AND ST_DWithin(b.way,p.way,1000) ) AND (SELECT count(*) > 1 FROM p WHERE p.value = 'school' AND ST_DWithin(b.way,p.way,1000) ) ERROR: relation p does not exist LINE 14: FROM p 2012/8/7 Craig James cja...@emolecules.com: On Tue, Aug 7, 2012 at 5:01 AM, Stefan Keller sfkel...@gmail.com wrote: Hi I have an interesting query to be optimized related to this one [1]. The query definition is: Select all buildings that have more than 1 pharmacies and more than 1 schools within a radius of 1000m. The problem is that I think that this query is inherently O(n^2). In fact the solution I propose below takes forever... Maybe you could get rid of the O(n^2) aspect like this: Select all buildings that have more than 1 pharmacies and more than 1 schools within a radius of 1000m from (Select all buildings that have more than four (pharmacy or school) within a radius of 1000m) The inner select should be fast -- you could make it fast by creating a new property like building of interest that was pharmacy or school and build an index on the building of interest property. The inner query would reduce your sample set to a much smaller set of buildings, and presumably the outer query could handle that pretty quickly. Craig James My questions: 1. Any comments about the nature of this problem? 2. ... on how to speed it up ? 3. In the original query [1] there's a count which contains a subquery. According to my tests PostgreSQL does not allow this despite the documentation which says count(expression). 
Remarks: I know that count(*) could be faster on PostgreSQL but count(osm_id) does not change the query plan and this does not seem to be the bottleneck here anyway. Yours, S. [1] http://gis.stackexchange.com/questions/11445/selecting-pois-around-specific-buildings-using-postgis Here's my query: -- Select all buildings that have >1 pharmacies and >1 schools within 1000m: SELECT osm_id AS building_id FROM (SELECT osm_id, way FROM osm_polygon WHERE tags @> hstore('building','yes') ) AS b WHERE (SELECT count(*) > 1 FROM osm_poi AS p WHERE p.tags @> hstore('amenity','pharmacy') AND ST_DWithin(b.way,p.way,1000) ) AND (SELECT count(*) > 1 FROM osm_poi AS p WHERE p.tags @> hstore('amenity','school') AND ST_DWithin(b.way,p.way,1000) ) -- Total query runtime: 4308488 ms. 66345 rows retrieved. Here's the query plan (from EXPLAIN): Index Scan using osm_polygon_tags_idx on osm_polygon (cost=0.00..406812.81 rows=188 width=901) Index Cond: (tags @> 'building=yes'::hstore) Filter: ((SubPlan 1) AND (SubPlan 2)) SubPlan 1 -> Aggregate (cost=269.19..269.20 rows=1 width=0) -> Bitmap Heap Scan on osm_poi p (cost=7.76..269.19 rows=1 width=0) Recheck Cond: (way && st_expand(osm_polygon.way, 1000::double precision)) Filter: ((tags @> 'amenity=pharmacy'::hstore) AND (osm_polygon.way && st_expand(way, 1000::double precision)) AND _st_dwithin(osm_polygon.way, way, 1000::double precision)) -> Bitmap Index Scan on osm_poi_way_idx (cost=0.00..7.76 rows=62 width=0) Index Cond: (way && st_expand(osm_polygon.way, 1000::double precision)) SubPlan 2 -> Aggregate (cost=269.19..269.20 rows=1 width=0) -> Bitmap Heap Scan on osm_poi p (cost=7.76..269.19 rows=1 width=0) Recheck Cond: (way && st_expand(osm_polygon.way, 1000::double precision)) Filter: ((tags @> 'amenity=school'::hstore) AND (osm_polygon.way && st_expand(way, 1000::double precision)) AND _st_dwithin(osm_polygon.way, way, 1000::double precision)) -> Bitmap Index Scan on osm_poi_way_idx (cost=0.00..7.76 rows=62 width=0) Index Cond: (way && st_expand(osm_polygon.way, 
1000::double precision)) *** -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
2012/3/1 Jeff Janes jeff.ja...@gmail.com: On Tue, Feb 28, 2012 at 3:46 PM, Stefan Keller sfkel...@gmail.com wrote: 2012/2/28 Claudio Freire klaussfre...@gmail.com: In the OP, you say There is enough main memory to hold all table contents.. I'm assuming, there you refer to your current system, with 4GB memory. Sorry for the confusion: I'm doing these tests on this machine with one table (osm_point) and one country. This table has a size of 2.6GB and 10 million tuples. The other machine has to deal with at least 5 tables in total and will be hold more than one country plus routing etc.. What is your shared_buffers set to? 2.6GB is uncomfortably close to 4GB, considering the computer has other things it needs to use memory for as well. These are the current modified settings in postgresql.conf: shared_buffers = 128MB work_mem = 3MB maintenance_work_mem = 30MB effective_cache_size = 352MB wal_buffers = 8MB default_statistics_target = 50 constraint_exclusion = on checkpoint_completion_target = 0.9 checkpoint_segments = 16 max_connections = 80 -Stefan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
2012/2/29 Jeff Janes jeff.ja...@gmail.com: It's quite possible the vacuum full is thrashing your disk cache due to maintenance_work_mem. You can overcome this issue with the tar trick, which is more easily performed as: tar cf /dev/null $PG_DATA/base But on many implementations, that will not work. tar detects the output is going to the bit bucket, and so doesn't bother to actually read the data. Right. But what about the commands cp $PG_DATA/base /dev/null or cat $PG_DATA/base > /dev/null ? They seem to do something. -Stefan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
2012/2/29 Stefan Keller sfkel...@gmail.com: 2012/2/29 Jeff Janes jeff.ja...@gmail.com: It's quite possible the vacuum full is thrashing your disk cache due to maintenance_work_mem. You can overcome this issue with the tar trick, which is more easily performed as: tar cf /dev/null $PG_DATA/base But on many implementations, that will not work. tar detects the output is going to the bit bucket, and so doesn't bother to actually read the data. Right. But what about the commands cp $PG_DATA/base /dev/null or cat $PG_DATA/base > /dev/null ? They seem to do something. ...or let's try /dev/zero instead of /dev/null: tar cf /dev/zero $PG_DATA/base -Stefan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
Hi Wales 2012/2/27 Wales Wang wormw...@yahoo.com wrote: There are many approaches for PostgreSQL in-memory. The quick and easy way is making the slave pgsql run on a persistent RAM filesystem, where the slave is part of a master/slave replication cluster. The fstab entry and the script making the RAM file system persistent are below: Setup: First, create a mountpoint for the disk: mkdir /mnt/ramdisk Secondly, add this line to /etc/fstab to mount the drive at boot-time. tmpfs /mnt/ramdisk tmpfs defaults,size=65536M 0 0 #! /bin/sh # /etc/init.d/ramdisk.sh # case $1 in start) echo Copying files to ramdisk rsync -av /data/ramdisk-backup/ /mnt/ramdisk/ echo [`date +%Y-%m-%d %H:%M`] Ramdisk Synched from HD >> /var/log/ramdisk_sync.log ;; sync) echo Synching files from ramdisk to Harddisk echo [`date +%Y-%m-%d %H:%M`] Ramdisk Synched to HD >> /var/log/ramdisk_sync.log rsync -av --delete --recursive --force /mnt/ramdisk/ /data/ramdisk-backup/ ;; stop) echo Synching logfiles from ramdisk to Harddisk echo [`date +%Y-%m-%d %H:%M`] Ramdisk Synched to HD >> /var/log/ramdisk_sync.log rsync -av --delete --recursive --force /mnt/ramdisk/ /data/ramdisk-backup/ ;; *) echo Usage: /etc/init.d/ramdisk {start|stop|sync} exit 1 ;; esac exit 0 You can run it at startup and shutdown, and via crontab hourly. Wales Wang Thank you for the tip. Making the slave pgsql run on a persistent RAM filesystem is surely at least a possibility which I'll try out. But what I'm finally after is a solution where records don't get pushed back to disk a.s.a.p. but rather get held in memory as long as possible, assuming that there is enough memory. I suspect that currently there is quite some overhead because of that (besides disk-oriented structures). -Stefan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
2012/2/28 Claudio Freire klaussfre...@gmail.com: On Tue, Feb 28, 2012 at 5:48 PM, Stefan Keller sfkel...@gmail.com wrote: P.S. And yes, the database is aka 'read-only' and truncated and re-populated from scratch every night. fsync is off so I don't care about ACID. After the indexes on name, hstore and geometry are generated I do a VACUUM FULL FREEZE. The current installation is a virtual machine with 4GB memory and the filesystem is read/write. The future machine will be a pizza box with 72GB memory. I don't get this. Something's wrong. In the OP, you say There is enough main memory to hold all table contents.. I'm assuming there you refer to your current system, with 4GB memory. Sorry for the confusion: I'm doing these tests on this machine with one table (osm_point) and one country. This table has a size of 2.6GB and 10 million tuples. The other machine has to deal with at least 5 tables in total and will hold more than one country plus routing etc. -Stefan
Re: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
Hi Jeff and Wales, 2012/2/26 Jeff Janes jeff.ja...@gmail.com wrote: The problem is that the initial queries are too slow - and there is no second chance. I do have to trash the buffer every night. There is enough main memory to hold all table contents. Just that table, or the entire database? The entire database, consisting of only about 5 tables which are similar but with different geometry types, plus a relations table (as OpenStreetMap calls it). 1. How can I warm up or re-populate shared buffers of Postgres? Instead, warm the OS cache. Then data will get transferred into the postgres shared_buffers pool from the OS cache very quickly. tar -c $PGDATA/base/ | wc -c Ok. So with OS cache you mean the files which to me are THE database itself? A cache to me is a second storage with controlled redundancy for performance reasons. 2. Are there any hints on how to tell Postgres to read in all table contents into memory? I don't think so, at least not in core. I've wondered if it would make sense to suppress ring-buffer strategy when there are buffers on the free-list. That way a sequential scan would populate shared_buffers after a restart. But it wouldn't help you get the indexes into cache. So, are there any developments going on with PostgreSQL as Stephen suggested in the former thread? 2012/2/26 Wales Wang wormw...@yahoo.com: You can try PostgreSQL 9.x master/slave replication, then try running the slave on a persistent RAM filesystem (tmpfs). So, access all your data from the slave PostgreSQL that runs on tmpfs.. Nice idea. I do have a single upscaled server and up to now I hesitated to allocate say 48 Gigabytes (out of 72) to such a RAM filesystem (tmpfs). Still, wouldn't it be more flexible if I could dynamically instruct PostgreSQL to behave like an in-memory database? Yours, Stefan
Re: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
2012/2/26 Andy Colson a...@squeakycode.net wrote: On 02/25/2012 06:16 PM, Stefan Keller wrote: 1. How can I warm up or re-populate shared buffers of Postgres? 2. Are there any hints on how to tell Postgres to read in all table contents into memory? Yours, Stefan How about after you load the data, vacuum freeze it, then do something like: SELECT count(*) FROM osm_point WHERE tags @> 'tourism=junk' -Andy That good idea is what I proposed elsewhere on one of the PG lists and got told that this doesn't help. I can accept the view that users shouldn't directly interfere with the optimizer. But I think it's still worth discussing a configuration option (per table) or so which tells PG that this table's contents should fit into memory, so that it tries to load the table into memory and keeps it there. This option probably only makes sense in combination with unlogged tables. Yours, Stefan
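Complementary to warming the cache, one can inspect what actually sits in shared_buffers. A sketch using the contrib module pg_buffercache (it must be installed separately; this is not from the original thread):

```sql
CREATE EXTENSION pg_buffercache;

-- Top 10 relations by number of 8kB pages currently held in shared_buffers.
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```

Running this before and after a warm-up query such as the count(*) above shows whether the table's pages really landed in the buffer pool.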
Re: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
Hi, 2012/2/26 Cédric Villemain ced...@2ndquadrant.fr wrote: 1. How can I warm up or re-populate shared buffers of Postgres? There was a patch proposed for postgresql whose purpose was to Which patch are you referring to? snapshot/restore postgresql buffers, but it is still not sure how far that really helps to have that part loaded. What's not sure, and why? 2. Are there any hints on how to tell Postgres to read in all table contents into memory? I wrote pgfincore for the OS part: you can use it to preload table/index in OS cache, and do snapshot/restore if you want fine-grained control of what part of the object you want to warm. https://github.com/klando/pgfincore Yes, now I remember. I'll have a look at that. I'd still like to see something where PG really preloads tuples and treats them always in-memory (given they fit into RAM). Since I have a read-only database there's no WAL and no locking needed. But as soon as we allow writes I realize that the in-memory feature needs to be coupled with other enhancements like replication (which somehow would avoid WAL). Yours, Stefan
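For reference, a usage sketch of pgfincore along the lines discussed (the extension must be installed; function and column names follow the pgfincore documentation, and osm_point is the table from this thread):

```sql
-- Ask the kernel to preload the table's files into the OS page cache:
SELECT * FROM pgfadvise_willneed('osm_point');

-- Inspect how much of the table is currently resident in the OS cache:
SELECT relpath, pages_mem, rel_os_pages
FROM pgfincore('osm_point');
```

Note this warms the OS cache, not shared_buffers; PostgreSQL still copies pages into its own buffer pool on first access, but from memory rather than from disk.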
[PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
Hi, 2011/10/24 Stephen Frost sfr...@snowman.net wrote: Now, we've also been discussing ways to have PG automatically re-populate shared buffers and possibly OS cache based on what was in memory at the time of the last shut-down, but I'm not sure that would help your case either since you're rebuilding everything every night and that's what's trashing your buffers (because everything ends up getting moved around). You might actually want to consider if that's doing more harm than good for you. If you weren't doing that, then the cache wouldn't be getting destroyed every night.. I'd like to come back on the issue of aka of in-memory key-value database. To remember, it contains table definition and queries as indicated in the appendix [0]. There exist 4 other tables of similar structure. There are indexes on each column. The tables contain around 10 million tuples. The database is read-only; it's completely updated every day. I don't expect more than 5 concurrent users at any time. A typical query looks like [1] and varies in an unforeseeable way (that's why hstore is used). EXPLAIN tells me that the indexes are used [2]. The problem is that the initial queries are too slow - and there is no second chance. I do have to trash the buffer every night. There is enough main memory to hold all table contents. 1. How can I warm up or re-populate shared buffers of Postgres? 2. Are there any hints on how to tell Postgres to read in all table contents into memory? Yours, Stefan APPENDIX [0] CREATE TABLE osm_point ( osm_id integer, name text, tags hstore, geom geometry(Point,4326) ); [1] SELECT osm_id, name FROM osm_point WHERE tags @> 'tourism=viewpoint' AND ST_Contains( GeomFromText('BOX(8.42 47.072, 9.088 47.431)'::box2d, 4326), geom) [2] EXPLAIN ANALYZE returns: Bitmap Heap Scan on osm_point (cost=402.15..40465.85 rows=430 width=218) (actual time=121.888..137. 
Recheck Cond: (tags @> 'tourism=viewpoint'::hstore) Filter: (('01030...'::geometry && geom) AND _st_contains('01030'::geometry, geom)) -> Bitmap Index Scan on osm_point_tags_idx (cost=0.00..402.04 rows=11557 width=0) (actual time=1 6710 loops=1) Index Cond: (tags @> 'tourism=viewpoint'::hstore) Total runtime: 137.881 ms (6 rows)
Re: [PERFORM] hstore query: Any better idea than adding more memory?
Hi Stephen Thanks for your answer and hints. 2011/10/24 Stephen Frost sfr...@snowman.net wrote: * Stefan Keller (sfkel...@gmail.com) wrote: Adding more memory (say to a total of 32 GB) would only postpone the problem. Erm, seems like you're jumping to conclusions here... Sorry. I actually only wanted to report here what's special in my postgresql.conf. First time the query lasts about 10 times longer (~ 1010 ms) - but I'd like to get better results already in the first query. Do you mean first time after a database restart? No: I simply meant doing the query when one can assume that the query result is not yet in postgres' cache. You can check that here online: http://labs.geometa.info/postgisterminal = 1. When I add the actual time from EXPLAIN above, I get 11 + 10 + 10ms which is three times greater than the 11ms reported. Why? Because they include the times from the nodes under them. = 2. Why does the planner choose to sort first instead of sorting the (smaller) result set at the end? You're reading the explain 'backwards' regarding time.. It *does* do the sort last. Nodes which are indented feed the nodes above them, so the bitmap index scan and recheck feed into the sort, hence the sort is actually done after. Can't really work any other way anyway, PG has to get the data before it can sort it.. Oh, thanks. I should have realized that. But then what is the arrow (->) supposed to stand for? Sort (cost=30819.51... -> Bitmap Heap Scan on osm_point (cost=313.21... -> Bitmap Index Scan on osm_point_tags_idx I would suggest that the inverse arrow would be more intuitive: Sort (cost=30819.51... <- Bitmap Heap Scan on osm_point (cost=313.21... <- Bitmap Index Scan on osm_point_tags_idx = 3. What could I do to speed up such queries (first time, i.e. without caching) besides simply adding more memory? There didn't look like anything there that could really be done much faster, at the plan level. 
It's not uncommon for people to intentionally get a box with more memory than the size of their database, so everything is in memory. At the end of the day, if the blocks aren't in memory then PG has to get them from disk. If disk is slow, the query is going to be slow. Now, hopefully, you're hitting this table often enough with similar queries that important, common parts of the table and index are already in memory, but there's no magic PG can perform to ensure that. If there's a lot of updates/changes to this table, you might check if there's a lot of bloat (check_postgres works great for this..). Eliminating excessive bloat, if there is any, could help with all accesses to that table, of course, since it would reduce the amount of data which would need to be read. Thanks for the hint. But there are only periodic updates (currently once a night) and these are actually done by 1. truncating the database and 2. bulk loading all the stuff, then 3. reindexing. If one tries to completely fit the whole data into memory, then to me PostgreSQL features borrowed from in-memory databases become interesting. = Is there anything other than index-only scans (planned for 9.2?) which could be of interest here? Stefan
[PERFORM] hstore query: Any better idea than adding more memory?
Hi all I'd like to tune the following hstore-related query which selects all zoos from table osm_point: SELECT osm_id, name, tags FROM osm_point WHERE tags @> hstore('tourism','zoo') ORDER BY name; ... given the following table and indexes definition: CREATE TABLE osm_point ( osm_id integer, name text, tags hstore, way geometry ) CREATE INDEX osm_point_index ON osm_point USING gist (way); CREATE INDEX osm_point_name_idx ON osm_point USING btree (name) WITH (FILLFACTOR=100); ALTER TABLE osm_point CLUSTER ON osm_point_name_idx; CREATE INDEX osm_point_pkey ON osm_point USING btree (osm_id); CREATE INDEX osm_point_tags_idx ON osm_point USING gist (tags) WITH (FILLFACTOR=100); ... and the following statistics: * Live Tuples 9626138 (that's also what COUNT(*) returns) * Table Size 1029 MB * Toast Table Size 32 kB * Indexes Size 1381 MB (?) ** osm_point_index 1029 MB ** osm_point_name_idx 1029 MB ** osm_point_pkey 1029 MB ** osm_point_tags_idx 1029 MB PostgreSQL has version 9.0.4, runs on Ubuntu Linux 10.04 LTS (64-Bit) with 1 vCPU and 1 GB vRAM. Adding more memory (say to a total of 32 GB) would only postpone the problem. I already increased the PostgreSQL configuration of shared_buffers (using pgtune). Now EXPLAIN ANALYZE returns (if run several times): Sort (cost=30819.51..30843.58 rows=9626 width=65) (actual time=11.502..11.502 rows=19 loops=1) Sort Key: name Sort Method: quicksort Memory: 29kB -> Bitmap Heap Scan on osm_point (cost=313.21..30182.62 rows=9626 width=65) (actual time=10.727..11.473 rows=19 loops=1) Recheck Cond: (tags @> 'tourism=zoo'::hstore) -> Bitmap Index Scan on osm_point_tags_idx (cost=0.00..310.80 rows=9626 width=0) (actual time=10.399..10.399 rows=591 loops=1) Index Cond: (tags @> 'tourism=zoo'::hstore) Total runtime: 11 ms First time the query lasts about 10 times longer (~ 1010 ms) - but I'd like to get better results already in the first query. = 1. 
When I add the actual time from EXPLAIN above, I get 11 + 10 + 10ms which is three times greater than the 11ms reported. Why? = 2. Why does the planner choose to sort first instead of sorting the (smaller) result set at the end? = 3. What could I do to speed up such queries (first time, i.e. without caching) besides simply adding more memory? Yours, Stefan
Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
Merlin and Jeff, General remark again: It's hard for me to imagine that btree is superior for all the issues mentioned before. I still believe in hash indexes for primary keys and certain unique constraints where you need equality search and don't need ordering or range search. 2011/9/17 Jeff Janes jeff.ja...@gmail.com: (...) Also, that link doesn't address concurrency of selects at all, only of inserts. How would (or did) you test and benchmark concurrency of inserts and selects? Use pgbench with a custom config for a blackbox test? 2011/9/18 Merlin Moncure mmonc...@gmail.com: Here again, I think that any proposed improvement in the current hash index code should be measured against wrapping a btree index. You get wal logging and high concurrency for free if you decide to do that. As I understand it, this would be an enhancement of btree. That's ok for btree but doesn't really exploit all the advantages of a separate hash index, wouldn't it? Stefan 2011/9/18 Merlin Moncure mmonc...@gmail.com: On Sat, Sep 17, 2011 at 4:48 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Tue, Sep 13, 2011 at 5:04 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: On 14 September 2011 00:04, Stefan Keller sfkel...@gmail.com wrote: Has this been verified on a recent release? I can't believe that hash performs so badly over all these points. Theory tells me otherwise and http://en.wikipedia.org/wiki/Hash_table seems to be a success. My understanding is that a huge amount of work has gone into making btree what it is in PG, and not nearly as much work has gone into making hash indexes what they could be. Hash indexes have been improved since 2005 - their performance was improved quite a bit in 9.0. Here's a more recent analysis: http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/ They are 3 times faster to build. But if you rip the WAL logging out of btree, how much faster would those get? Also, that link doesn't address concurrency of selects at all, only of inserts. 
Of course hash indexes are faster to build than varlen string indexes :-). I use natural keys 50-80% of the time and hash indexing would remove some of the pain in cases where I don't need ordering and range operations. In fact, if they are made to properly support wal logging and uniqueness, I imagine they should supplant btree in a broad range of cases, so much so that it would be awful nice to be able to have syntax to choose hash for primary keys and unique constraints. @ Jeff: I think that adding WAL to hash indexes without first addressing the heavy-weight locking issue would be a mistake. Even if the WAL was fixed, the bad performance under concurrent selects would still make it at best a narrow niche thing. And fixing the locking *after* WAL is in place would probably be very much harder than the other order. Here again, I think that any proposed improvement in the current hash index code should be measured against wrapping a btree index. You get wal logging and high concurrency for free if you decide to do that. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
I'm simply referring to literature (like the intro by Ramakrishnan & Gehrke). I just know that Oracle and MySQL actually do have them too and use them without those current implementation-specific restrictions in Postgres. IMHO by design hash indexes (e.g. linear hashing) work best when: 1. only equality (=) tests are used (on whole values) 2. columns (key values) have very high cardinality And ideally, but not necessarily, when index values do not change and the number of rows is known ahead of time (avoiding O(N) worst case - but there are approaches to chaining with dynamic resizing). I just collected this to encourage ourselves that enhancing hash indexes could be worthwhile. Stefan 2011/9/18 Kevin Grittner kevin.gritt...@wicourts.gov: Stefan Keller wrote: It's hard for me to imagine that btree is superior for all the issues mentioned before. It would be great if you could show a benchmark technique which shows otherwise. -Kevin
[PERFORM] What about implementing a bitmap index? Any use cases?
Hi, I know there exist Bitmap Index Scan and Bitmap Heap Scan in Postgres. What about implementing a bitmap index for explicit use (CREATE INDEX ...)? Any use cases? Bitmap indexes work best on values with low cardinality (categorical data), would be efficient in space and ready for logic operations. Stefan P.S. Disclaimer (referring to my other thread about hash): I'm not a btree opposer :-) I'm just evaluating index alternatives.
[PERFORM] Index containing records instead of pointers to the data?
Hi, Sorry if this is an odd question: I assume that Postgres indexes don't store records but only pointers to the data. This means that there is always an additional access needed (real table I/O). Would an index containing data records make sense? Stefan
[PERFORM] PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)
2011/9/17 Tomas Vondra t...@fuzzy.cz wrote: (...) We've been asked by a local university for PostgreSQL-related topics of theses and seminary works. I'm also interested in such proposals or ideas! Here's some list of topics: * Adding WAL support to hash indexes in PostgreSQL (see ex-topic) * Time in PostgreSQL * Storing (Weather) Sensor Data in PostgreSQL * Fast Bulk Data Inserting in PostgreSQL with Unlogged Tables (incl. adding GiST support) * Performance Tuning of a Read-Only PostgreSQL Database * Materialized Views in PostgreSQL: Experiments around Jonathan Gardner's Proposal * more... ? Yours, Stefan
Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
2011/9/16 Tom Lane t...@sss.pgh.pa.us: I'm not entirely following this eagerness to junk that AM, anyway. We've put a lot of sweat into it over the years, in the hopes that it would eventually be good for something. It's on the edge of being good for something now, and there's doubtless room for more improvements, so why are the knives out? No knives from my side. Sorry for the exaggerated subject title. I'm also in favor of an enhanced hash index for cases where only = tests are processed and where only few inserts/deletes will occur. Stefan
Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
2011/9/14 Tom Lane t...@sss.pgh.pa.us: (...) I think that the current state of affairs is still what depesz said, namely that there might be cases where they'd be a win to use, except the lack of WAL support is a killer. I imagine somebody will step up and do that eventually. Should I open a ticket? Stefan 2011/9/14 Tom Lane t...@sss.pgh.pa.us: Peter Geoghegan pe...@2ndquadrant.com writes: On 14 September 2011 00:04, Stefan Keller sfkel...@gmail.com wrote: Has this been verified on a recent release? I can't believe that hash performs so bad over all these points. Theory tells me otherwise and http://en.wikipedia.org/wiki/Hash_table seems to be a success. Hash indexes have been improved since 2005 - their performance was improved quite a bit in 9.0. Here's a more recent analysis: http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/ Yeah, looking into the git logs shows several separate major changes committed during 2008, including storing only the hash code not the whole indexed value (big win on wide values, and lets you index values larger than one index page, which doesn't work in btree). I think that the current state of affairs is still what depesz said, namely that there might be cases where they'd be a win to use, except the lack of WAL support is a killer. I imagine somebody will step up and do that eventually. The big picture though is that we're not going to remove hash indexes, even if they're nearly useless in themselves, because hash index opclasses provide the foundation for the system's knowledge of how to do the datatype-specific hashing needed for hash joins and hash aggregation. And those things *are* big wins, even if hash indexes themselves never become so. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
2011/9/14 Tom Lane t...@sss.pgh.pa.us writes: (...) I think that the current state of affairs is still what depesz said, namely that there might be cases where they'd be a win to use, except the lack of WAL support is a killer. I imagine somebody will step up and do that eventually. How much work (in man-days) do you estimate this would mean for someone who can program but has to learn PG internals first? Stefan 2011/9/14 Tom Lane t...@sss.pgh.pa.us: Peter Geoghegan pe...@2ndquadrant.com writes: On 14 September 2011 00:04, Stefan Keller sfkel...@gmail.com wrote: Has this been verified on a recent release? I can't believe that hash performs so badly over all these points. Theory tells me otherwise and http://en.wikipedia.org/wiki/Hash_table seems to be a success. Hash indexes have been improved since 2005 - their performance was improved quite a bit in 9.0. Here's a more recent analysis: http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/ Yeah, looking into the git logs shows several separate major changes committed during 2008, including storing only the hash code not the whole indexed value (big win on wide values, and lets you index values larger than one index page, which doesn't work in btree). I think that the current state of affairs is still what depesz said, namely that there might be cases where they'd be a win to use, except the lack of WAL support is a killer. I imagine somebody will step up and do that eventually. The big picture though is that we're not going to remove hash indexes, even if they're nearly useless in themselves, because hash index opclasses provide the foundation for the system's knowledge of how to do the datatype-specific hashing needed for hash joins and hash aggregation. And those things *are* big wins, even if hash indexes themselves never become so. 
regards, tom lane
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
Interesting debate. 2011/9/13 Marti Raudsepp ma...@juffo.org: Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a SAVEPOINT it can roll back to in case of an error. Are you sure? In theory I always understood that there are no subtransactions. In fact when looking at the docs there is chapter 39.6.6, saying By default, any error occurring in a PL/pgSQL function aborts execution of the function, and indeed of the surrounding transaction as well. You can trap errors and recover from them by using a BEGIN block with an EXCEPTION clause. (http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html ) So the doc isn't totally explicit about this. But whatever: What would be the function of a subtransaction? To give the possibility to recover and continue within the surrounding transaction? Stefan 2011/9/13 Marti Raudsepp ma...@juffo.org: On Tue, Sep 13, 2011 at 19:34, Robert Klemme shortcut...@googlemail.com wrote: I don't think so. You only need to catch the error (see attachment). Or does this create a sub transaction? Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a SAVEPOINT it can roll back to in case of an error. Yes, I mentioned the speed issue. But regardless of the solution for MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned, you will have the locking problem anyhow if you plan to insert concurrently into the same table and be robust. In a mass-loading application you can often divide the work between threads in a manner that doesn't cause conflicts. For example, if the unique key is foobar_id and you have 4 threads, thread 0 will handle rows where (foobar_id%4)=0, thread 1 takes (foobar_id%4)=1 etc. Or potentially hash foobar_id before dividing the work. I already suggested this in my original post. 
Regards, Marti
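To make the subtransaction point concrete, here is a minimal sketch (table and function names invented for illustration): each BEGIN ... EXCEPTION block takes an implicit savepoint, so a trapped error rolls back only the block's own work, not the surrounding transaction.

```sql
CREATE TABLE t (id integer PRIMARY KEY);

CREATE OR REPLACE FUNCTION insert_or_ignore(k integer) RETURNS void AS $$
BEGIN
    INSERT INTO t (id) VALUES (k);
EXCEPTION WHEN unique_violation THEN
    -- the implicit savepoint taken at BEGIN is rolled back here:
    -- the failed INSERT is undone, but the outer transaction stays alive
    RAISE NOTICE 'key % already present, skipping', k;
END;
$$ LANGUAGE plpgsql;
```

This implicit savepoint is also the source of the speed issue mentioned above: in a bulk load, every row passed through such a block pays the subtransaction overhead.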
[PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
The doc at http://www.postgresql.org/docs/current/interactive/indexes-types.html says: Caution: Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. They are also not replicated over streaming or file-based replication. For these reasons, hash index use is presently discouraged. I found a thread here http://archives.postgresql.org/pgsql-general/2005-05/msg00370.php about Hash index vs. b-tree index (PostgreSQL 8.0) mentioning some issues, like they * are not faster than B-trees even for = comparisons * aren't WAL safe * have poor concurrency (require coarser locks) * are significantly slower to create than a b+-tree index. In fact these statements seem to rely on the docs back in version 7.2 (see http://www.postgresql.org/docs/7.2/static/indexes-types.html ). Has this been verified on a recent release? I can't believe that hash performs so badly over all these points. Theory tells me otherwise and http://en.wikipedia.org/wiki/Hash_table seems to be a success. Are there any plans to give hash indexes another chance (or to bury them with a reason)? Stefan
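For completeness, the syntax under discussion (a sketch; osm_point is the table from the poster's other threads, and the WAL caveat quoted above applies to these index builds):

```sql
-- A hash index supports only equality lookups:
CREATE INDEX osm_point_id_hash ON osm_point USING hash (osm_id);

-- This predicate can use the hash index; range or ORDER BY queries cannot.
SELECT name FROM osm_point WHERE osm_id = 12345;
```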
Fwd: [PERFORM] Summaries on SSD usage?
Shaun, 2011/9/2 Shaun Thomas stho...@peak6.com: Ironically, this is actually the topic of my presentation at Postgres Open. Do you think my problem would now be solved with an NVRAM PCI card? Stefan
Re: [PERFORM] Summaries on SSD usage?
2011/9/3 Jesper Krogh jes...@krogh.cc: On 2011-09-03 00:04, Stefan Keller wrote: It's not that hard to figure out.. take some of your typical queries. say the one above.. Change the search-term to something you'd expect the user to enter in a minute, but hasn't been run. (could be museum instead of zoo.. then you run it with \timing and twice.. if the two queries are close to each other in timing, then you only hit memory anyway and neither SSD, NVRAM or more RAM will buy you anything. Faster memory and faster CPU-cores will.. if you have a significant speedup to the second run, then more RAM, NVRAM, SSD is a good fix. Typically I have slow-query-logging turned on, permanently set to around 250ms. If I find queries in the log that I didn't expect to take above 250ms then I'd start to investigate if query-plans are correct .. and so on.. The above numbers are raw-data size and not how PG uses them.. or? And you haven't told anything about the size of your current system. It's definitely the case that the second query run is much faster (first ones go up to 30 seconds and more...). PG uses the raw data for Switzerland like this: 10 GB total disk space based on 2 GB raw XML input. Table osm_point is one of the four big tables and uses 984 MB for the table and 1321 MB for indexes (where the hstore one is the biggest, next to id, name and geometry). Stefan
Re: [PERFORM] Summaries on SSD usage?
2011/9/2 Scott Marlowe scott.marl...@gmail.com: On Tue, Aug 30, 2011 at 11:23 AM, Stefan Keller sfkel...@gmail.com wrote: How big is your DB? What kind of reads are most common, random access or sequential? How big of a dataset do you pull out at once with a query? SSDs are usually not a big winner for read-only databases. If the dataset is small (a dozen or so gigs), get more RAM to fit it in. If it's big and sequentially accessed, then build a giant RAID-10 or RAID-6. If it's big and randomly accessed, then buy a bunch of SSDs and RAID them.

My dataset is a mirror of OpenStreetMap updated daily. For Switzerland it's about 10 GB total disk space used (half for tables, half for indexes) based on 2 GB raw XML input. Europe would be about 70 times larger (130 GB) and the world has 250 GB raw input. It's both randomly (= index scan?) and sequentially (= seq scan?) accessed, with queries like: SELECT * FROM osm_point WHERE tags @> hstore('tourism','zoo') AND name ILIKE 'Zoo%'. You can try it yourself online, e.g. http://labs.geometa.info/postgisterminal/?xapi=node[tourism=zoo] So I'm still unsure what's better: SSD, NVRAM (PCI card) or plain RAM? And I'm eager to understand whether unlogged tables could help anyway. Yours, Stefan
Re: [PERFORM] Summaries on SSD usage?
You mean something like Unlogged Tables in PostgreSQL 9.1 (= in-memory database) or simply a large ramdisk? Yours, Stefan 2011/9/1 Jim Nasby j...@nasby.net: On Aug 30, 2011, at 12:23 PM, Stefan Keller wrote: I'm looking for summaries (or best practices) on SSD usage with PostgreSQL. My use case is mainly a read-only database. Are there any around? I'm not sure, but for read-only why not just put more memory in the server? It'll be a lot cheaper than SSDs. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
[PERFORM] Summaries on SSD usage?
Hi, I'm looking for summaries (or best practices) on SSD usage with PostgreSQL. My use case is mainly a read-only database. Are there any around? Yours, Stefan
Re: [PERFORM] hstore - Implementation and performance issues around its operators
Hi Robert Many thanks for your answers. 2011/7/19 Robert Haas robertmh...@gmail.com: Putting the elements in order wouldn't really help, would it? I mean, you'd need some kind of an index inside the hstore... which there isn't. Sorry for my imprecise question. In fact the elements of an hstore are stored in order of (keylength, key), with the key comparison done bytewise (not locale-dependent). See e.g. the function hstoreUniquePairs in http://doxygen.postgresql.org/ . This ordered property is used by some hstore functions but not all - and I'm still wondering why. Yours, Stefan 2011/7/19 Robert Haas robertmh...@gmail.com: On Sun, Jun 19, 2011 at 2:59 PM, Stefan Keller sfkel...@gmail.com wrote: 1. Obviously the '@>' operator has to be used in order to make use of the GiST index. Why is the '->' operator not supported by GiST ('->' is actually mentioned in all examples of the doc.)? Because it's not a comparison operator. 2. Currently the hstore elements are stored in the order in which they come from the insert statement / constructor. Why are the elements not ordered, i.e. why is the ordering not used in all hstore functions (like hstore_fetchval etc.)? Putting the elements in order wouldn't really help, would it? I mean, you'd need some kind of an index inside the hstore... which there isn't. 3. In the source code 'hstore_io.c' one finds the following enigmatic note: ... very large hstore values can't be output. this could be fixed, but many other data types probably have the same issue. What is the max. length of an hstore (i.e. the max. length of the sum of all elements in text representation)? I think that anything of half a gigabyte or more is at risk of falling down there. But probably it's not smart to use such big hstores anyway. 4. Last, I don't fully understand the following note in the hstore doc. (http://www.postgresql.org/docs/current/interactive/hstore.html ): Notice that the old names are reversed from the convention formerly followed by the core geometric data types! 
Why names? Why not rather 'operators' or 'functions'? It's referring to the operator names. What does this reversed from the convention mean concretely? That comment could be a little more clear, but I think what it's saying is that hstore's old @ is like the core geometric types' old ~, and vice versa. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
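The (keylength, key) ordering Stefan describes can be sketched in a few lines (Python used purely for illustration; the real logic lives in hstore's C code, e.g. hstoreUniquePairs):

```python
# Illustration of hstore's internal pair ordering: pairs are sorted by
# (key length, key), with keys compared bytewise rather than by locale.
# This mirrors the rule described above; it is not the actual C code.
def hstore_sort_key(pair):
    key, _value = pair
    kb = key.encode("utf-8")
    return (len(kb), kb)  # shorter keys first, ties broken bytewise

pairs = [("tourism", "zoo"), ("name", "Zoo Basel"), ("id", "42")]
print(sorted(pairs, key=hstore_sort_key))
# [('id', '42'), ('name', 'Zoo Basel'), ('tourism', 'zoo')]
```

Because the pairs sit in this canonical order, a lookup inside a single hstore value could binary-search the keys, which is presumably the point of the ordered property even though not every hstore function exploits it.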
[PERFORM] hstore - Implementation and performance issues around its operators
Hi, We did a benchmark comparing Key-Value-Pairs stored as an EAV db schema versus hstore. The results are promising in favor of hstore, but some questions remain. 1. Obviously the '@>' operator has to be used in order to make use of the GiST index. Why is the '->' operator not supported by GiST ('->' is actually mentioned in all examples of the doc.)? 2. Currently the hstore elements are stored in the order in which they come from the insert statement / constructor. Why are the elements not ordered, i.e. why is the ordering not used in all hstore functions (like hstore_fetchval etc.)? 3. In the source code 'hstore_io.c' one finds the following enigmatic note: ... very large hstore values can't be output. this could be fixed, but many other data types probably have the same issue. What is the max. length of an hstore (i.e. the max. length of the sum of all elements in text representation)? 4. Last, I don't fully understand the following note in the hstore doc. (http://www.postgresql.org/docs/current/interactive/hstore.html ): Notice that the old names are reversed from the convention formerly followed by the core geometric data types! Why names? Why not rather 'operators' or 'functions'? What does this reversed from the convention mean concretely? Yours, Stefan
Re: FW: [PERFORM] KVP table vs. hstore - hstore performance (Was: Postgres NoSQL emulation)
Hi all Thank you to all who answered. This worked: CREATE INDEX planet_osm_point_tags_amenity ON planet_osm_point ((tags->'amenity')) WHERE (tags->'amenity') IS NOT NULL; My problem is that in fact I don't know which tag to index, since I'm running a web admin application where users can enter arbitrary queries. Yours, Stefan 2011/5/25 Pierre C li...@peufeu.com: You wrote Try to create a btree index on (bench_hstore->'bench_id') WHERE (bench_hstore->'bench_id') IS NOT NULL. What do you mean exactly? = CREATE INDEX myhstore_kps_gin_idx ON myhstore USING gin(kvps) WHERE ??? IS NOT NULL; My table's def is: CREATE TABLE myhstore ( id bigint PRIMARY KEY, kvps hstore NOT NULL ); So I'm doing something like: CREATE INDEX myhstore_kps_gin_idx ON myhstore USING gin(kvps); Hello; I meant a plain old btree index like this: CREATE INDEX foo ON myhstore((kvps->'yourkeyname')) WHERE (kvps->'yourkeyname') IS NOT NULL; The idea is that: - The reason to use hstore is to have an arbitrary number of keys and use the keys you want, not a fixed set of columns like in a table - Therefore, no hstore key is present in all rows (if one was, you'd make it a table column, and maybe index it) - You'll probably only want to index some of the keys/values (avoiding indexing values that contain serialized data or other stuff that never appears in a WHERE clause) So, for each key that corresponds to a searchable attribute, I'd use a conditional index on that key, which only indexes the relevant rows. For keys that never appear in a WHERE, no index is needed. gist is good if you want the intersection of an hstore with another one (for instance); btree is good if you want simple search or range search.
Re: FW: [PERFORM] KVP table vs. hstore - hstore performance (Was: Postgres NoSQL emulation)
Salut Pierre You wrote Try to create a btree index on (bench_hstore->'bench_id') WHERE (bench_hstore->'bench_id') IS NOT NULL. What do you mean exactly? = CREATE INDEX myhstore_kps_gin_idx ON myhstore USING gin(kvps) WHERE ??? IS NOT NULL; My table's def is: CREATE TABLE myhstore ( id bigint PRIMARY KEY, kvps hstore NOT NULL ); So I'm doing something like: CREATE INDEX myhstore_kps_gin_idx ON myhstore USING gin(kvps); Stefan 2011/5/23 Pierre C li...@peufeu.com: Hi Merlin The analyze command gave the following result: On the KVP table: Index Scan using kvpidx on bench_kvp (cost=0.00..8.53 rows=1 width=180) (actual time=0.037..0.038 rows=1 loops=1) Index Cond: (bench_id = '20_20'::text) Total runtime: 0.057 ms And on the hstore table: Bitmap Heap Scan on bench_hstore (cost=32.22..3507.54 rows=1000 width=265) (actual time=145.040..256.173 rows=1 loops=1) Recheck Cond: (bench_hstore @> 'bench_id=>20_20'::hstore) -> Bitmap Index Scan on hidx (cost=0.00..31.97 rows=1000 width=0) (actual time=114.748..114.748 rows=30605 loops=1) Index Cond: (bench_hstore @> 'bench_id=>20_20'::hstore) Total runtime: 256.211 ms For hstore I'm using a GiST index. Try to create a btree index on (bench_hstore->'bench_id') WHERE (bench_hstore->'bench_id') IS NOT NULL.
Re: [PERFORM] KVP table vs. hstore - hstore performance (Was: Postgres NoSQL emulation)
Hi Jim You actually made me think about the schema Michel and I are using: And KVP is? ;) CREATE TABLE mykvpstore( id bigint PRIMARY KEY ); CREATE TABLE kvp ( id bigint REFERENCES mykvpstore(id), key text NOT NULL, value text ); -- with index on key And the table with the associative array type (hstore) is: CREATE TABLE myhstore ( id bigint PRIMARY KEY, kvps hstore NOT NULL ); -- with GiST index on kvps It seems to me that in the mykvpstore/kvp split there is also some overhead. And yes, we have no clue what keys to anticipate, except for some common ones like 'name': the use case is coming from OpenStreetMap (http://wiki.openstreetmap.org/wiki/Database_schema ). Yours, Stefan 2011/5/17 Jim Nasby j...@nasby.net: On May 16, 2011, at 8:47 AM, Merlin Moncure wrote: On Sat, May 14, 2011 at 5:10 AM, Stefan Keller sfkel...@gmail.com wrote: Hi, I am conducting a benchmark to compare a KVP table vs. hstore and got bad hstore performance results when the no. of records is greater than about 500'000. CREATE TABLE kvp ( id SERIAL PRIMARY KEY, key text NOT NULL, value text ); -- with index on key CREATE TABLE myhstore ( id SERIAL PRIMARY KEY, obj hstore NOT NULL ); -- with GIST index on obj Does anyone have experience with that? hstore is not really designed for large-ish sets like that. And KVP is? ;) IIRC hstore ends up just storing everything as text, with pointers to know where things start and end. There's no real indexing inside hstore, so basically the only thing it can do is scan the entire hstore. That said, I would strongly reconsider using KVP for anything except the most trivial of data sets. It is *extremely* inefficient. Do you really have absolutely no idea what *any* of your keys will be? Even if you need to support a certain amount of non-deterministic stuff, I would put everything you possibly can into real fields and only use KVP or hstore for things that you really didn't anticipate. 
Keep in mind that for every *value*, your overhead is 24 bytes for the heap header, 2+ varlena bytes in the heap, plus the length of the key. In the index you're looking at 6+ bytes of overhead, 1+ byte for varlena, plus the length of the key. The PK will cost you an additional 16-24 bytes, depending on alignment. So that's a *minimum* of ~50 bytes per value, and realistically the overhead will be closer to 65-70 bytes, *per value*. Unless your values are decent-sized strings, the overhead is going to be many times larger than the actual data! -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
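Jim's arithmetic can be made concrete with a back-of-envelope sketch (the byte counts are his approximations from the mail above, not exact on-disk accounting for any particular PostgreSQL version):

```python
# Rough per-value storage overhead for the EAV (KVP) layout, following
# Jim Nasby's figures: heap tuple header + varlena header + key bytes,
# the same key repeated in the index, plus the PK cost.
HEAP_TUPLE_HEADER = 24   # bytes per heap tuple header
HEAP_VARLENA = 2         # short varlena header in the heap (2+ bytes)
INDEX_ENTRY = 6          # per-entry index overhead (6+ bytes)
INDEX_VARLENA = 1        # varlena byte in the index (1+ byte)
PK_COST = 16             # 16-24 bytes depending on alignment

def eav_overhead(key_len: int) -> int:
    """Approximate bytes of overhead stored per key/value pair."""
    heap = HEAP_TUPLE_HEADER + HEAP_VARLENA + key_len
    index = INDEX_ENTRY + INDEX_VARLENA + key_len
    return heap + index + PK_COST

print(eav_overhead(0))   # 49 -> the "minimum of ~50 bytes" floor
print(eav_overhead(7))   # 63 -> a 7-byte key like 'amenity'
```

With realistic key lengths this lands squarely in the 65-70 bytes per value Jim quotes, dwarfing short values.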
[PERFORM] KVP table vs. hstore - hstore performance (Was: Postgres NoSQL emulation)
Hi, I am conducting a benchmark to compare KVP table vs. hstore and got bad hstore performance results when the no. of records is greater than about 500'000. CREATE TABLE kvp ( id SERIAL PRIMARY KEY, key text NOT NULL, value text ); -- with index on key CREATE TABLE myhstore ( id SERIAL PRIMARY KEY, obj hstore NOT NULL ); -- with GIST index on obj Does anyone have experience with that? Yours, Stefan
[PERFORM] How to configure a read-only database server?
I browsed the FAQ and looked at PostgreSQL performance books but I could not find the obvious: how to configure a read-only database server? I have a single-disk virtual Linux system and a read-only dataset which is exposed to the internet and completely replaced from time to time. This is what I found so far: * Disabling the autovacuum daemon. * Setting postgresql.conf parameters: fsync=off synchronous_commit=off full_page_writes=off * For the session: SET transaction_read_only TO TRUE; SET TRANSACTION READ ONLY; * What about wal_level and archive_mode? = Any comments on speeding up/optimizing such a database server? Yours, Stefan
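Collected into a postgresql.conf fragment, the settings discussed might look as follows (a sketch only; fsync=off risks the whole cluster on a crash and is defensible here solely because the dataset can be reloaded from scratch):

```
# Read-only, wholesale-replaced dataset: trade durability for speed.
fsync = off                 # acceptable only because data is reloadable
synchronous_commit = off
full_page_writes = off
autovacuum = off            # no UPDATE/DELETE churn to clean up
wal_level = minimal         # no archiving or replication consumers
archive_mode = off
```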
Re: [PERFORM] Dell PERC H700/H800
Joshua D. Drake wrote: On Thu, 2010-02-11 at 12:39 +, Matthew Wakeling wrote: Just a heads up - apparently the more recent Dell RAID controllers will no longer recognise hard discs that weren't sold through Dell. http://www.channelregister.co.uk/2010/02/10/dell_perc_11th_gen_qualified_hdds_only/ As one of the comments points out, that kind of makes them no longer SATA or SAS compatible, and they shouldn't be allowed to use those acronyms any more. That's interesting. I know that IBM, at least on some of their models, has done the same. Glad I use HP :)

All of the main vendors do that - IBM does, and so does HP (unless you count the toy boxes without a real RAID controller). The latter actually goes so far as to blacklist some of their own HDD firmware levels in more recent controller versions, which can cause quite some surprising results during maintenance operations. I find it quite strange that people seem to be surprised by Dell now starting with that as well (I actually find it really surprising they have not done it before). Stefan
Re: [PERFORM] Best suiting OS
Stefan Kaltenbrunner wrote: Devrim GÜNDÜZ wrote: On Mon, 2009-10-05 at 12:07 +0200, Jean-Michel Pouré wrote: Go for Debian: * It is a free community, very active. Well, we need to state that this is not a unique feature. * It is guaranteed to be upgradable. Depends. I had lots of issues with the upgrade process in the past -- but yeah, it is much better than most distros. * Very easy to administrate via apt-get. Right. apt is better than yum (in terms of speed). Choose Debian SID or testing, which will provide the latest fixes. One thing that I don't like about Debian is their update policy. If upstream is releasing a security update, I'd like to be able to find new packages as upstream announces updated sets. Yes, I'm talking about PostgreSQL here. This is exactly what Debian has done for a while now (at least for PostgreSQL). I.e.: Debian Etch aka oldstable has 8.1.18 and Debian Lenny (the current release) has 8.3.8... Stefan
Re: [PERFORM] PostgreSQL 8.4 performance tuning questions
Scott Carey wrote: On 7/30/09 11:24 AM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote: Kevin Grittner wrote: Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Since the dump to custom format ran longer than the full pg_dump piped directly to psql would have taken, the overall time to use this technique is clearly longer for our databases on our hardware. Hmmm ... AFAIR there isn't a good reason for dump to custom format to take longer than plain text dump, except for applying compression. Maybe -Z0 would be worth testing? Or is the problem that you have to write the data to a disk file rather than just piping it? I did some checking with the DBA who normally copies these around for development and test environments. He confirmed that when the source and target are on the same machine, a pg_dump piped to psql takes about two hours. If he pipes across the network, it runs more like three hours. My pg_dump to custom format ran for six hours. The single-transaction restore from that dump file took two hours, with both on the same machine. I can confirm with benchmarks, but this guy generally knows what he's talking about (and we do create a lot of development and test databases this way). Either the compression is tripling the dump time, or there is something inefficient about how pg_dump writes to the disk. seems about right - compression in pg_dump -Fc is a serious bottleneck, and unless we can significantly speed it up or make it use multiple cores (either for the dump itself - which would be awesome - or for the compression) I would recommend not using it at all. That's not an option when a compressed dump is 200GB and uncompressed is 1.3TB, for example. yeah, that was not meant as don't use compression at all but rather as use a different way to compress than what pg_dump provides internally. 
Stefan
Re: [PERFORM] PostgreSQL 8.4 performance tuning questions
Tom Lane wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Since the dump to custom format ran longer than the full pg_dump piped directly to psql would have taken, the overall time to use this technique is clearly longer for our databases on our hardware. Hmmm ... AFAIR there isn't a good reason for dump to custom format to take longer than plain text dump, except for applying compression. Maybe -Z0 would be worth testing? Or is the problem that you have to write the data to a disk file rather than just piping it?

I always dump with -Z0 (and compress afterwards, or even in a pipe to get two cores busy) because otherwise custom dump times are simply ridiculous. However, Kevin is on to something here - on the typical 4-8 core boxes I tested, I managed to get an around cores/2 speedup for the restore, which means that for a pure upgrade, or for testing similar to what Kevin is doing, custom dumps + parallel restore might result in no win or even a loss. On one of our datasets I did some benchmarking a while ago (for those who attended Bruce's pg_migrator talk @ pgcon, these are the same numbers):

* 150GB database (on disk - ~100GB as a plain text dump)
* time to dump (-Z0): 120min
* time to restore (single-threaded): 180min
* time to restore (-j 16): 59min

However, the problem is that this does not actually mean that parallel restore shaves ~120min off the dump/restore time, because you get the following real runtimes:

* plain text dump + single-threaded restore in a pipe: 188min
* custom dump to file + parallel restore: 179min

This is without compression; with the default, custom dump + parallel restore is way slower than the simple approach on reasonable hardware. Stefan
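The trade-off Stefan describes comes down to simple arithmetic on the wall-clock numbers he quotes: the plain-text pipe overlaps dump and restore, while the custom-format file serializes them, so the parallel restore first has to claw back the whole dump time.

```python
# Wall-clock totals for the two upgrade paths, using the runtimes quoted
# above (in minutes). The piped plain dump runs dump and restore
# concurrently; the custom dump must finish before pg_restore -j starts.
plain_pipe_total = 188         # pg_dump | psql, one overlapped step

dump_custom = 120              # pg_dump -Fc -Z0 to a file
restore_parallel = 59          # pg_restore -j 16
custom_total = dump_custom + restore_parallel

print(custom_total)                       # 179
print(plain_pipe_total - custom_total)    # 9 -> only ~9 minutes saved
```

So even with a 3x faster restore step, the end-to-end win is marginal for this dataset, which is exactly the point being made.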
Re: [PERFORM] PostgreSQL 8.4 performance tuning questions
Kevin Grittner wrote: Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Since the dump to custom format ran longer than the full pg_dump piped directly to psql would have taken, the overall time to use this technique is clearly longer for our databases on our hardware. Hmmm ... AFAIR there isn't a good reason for dump to custom format to take longer than plain text dump, except for applying compression. Maybe -Z0 would be worth testing? Or is the problem that you have to write the data to a disk file rather than just piping it? I did some checking with the DBA who normally copies these around for development and test environments. He confirmed that when the source and target are on the same machine, a pg_dump piped to psql takes about two hours. If he pipes across the network, it runs more like three hours. My pg_dump to custom format ran for six hours. The single-transaction restore from that dump file took two hours, with both on the same machine. I can confirm with benchmarks, but this guy generally knows what he's talking about (and we do create a lot of development and test databases this way). Either the compression is tripling the dump time, or there is something inefficient about how pg_dump writes to the disk. seems about right - compression in pg_dump -Fc is a serious bottleneck, and unless we can significantly speed it up or make it use multiple cores (either for the dump itself - which would be awesome - or for the compression) I would recommend not using it at all. Stefan
Re: [PERFORM] hyperthreaded cpu still an issue in 8.4?
Greg Smith wrote: On Wed, 29 Jul 2009, Stefan Kaltenbrunner wrote: Well the real problem is that pgbench itself does not scale too well to lots of concurrent connections and/or to high transaction rates so it seriously skews the result. Sure, but that's what the multi-threaded pgbench code aims to fix, which didn't show up until after you ran your tests. I got the 90K select TPS with a completely unoptimized postgresql.conf, so that's by no means the best it's possible to get out of the new pgbench code on this hardware. I've seen as much as a 40% improvement over the standard pgbench code in my limited testing so far, and the patch author has seen a 450% one. You might be able to see at least the same results you got from sysbench out of it. oh - the 90k tps are with the new multithreaded pgbench? missed that fact. As you can see from my results I managed to get 83k with the 8.4 pgbench on a slightly slower Nehalem which does not sound too impressive for the new code... Stefan
Re: [PERFORM] hyperthreaded cpu still an issue in 8.4?
Greg Smith wrote: On Tue, 28 Jul 2009, Scott Marlowe wrote: Just FYI, I ran the same basic test but with -c 10 since -c shouldn't really be greater than -s That's only true if you're running the TPC-B-like or other write tests, where access to the small branches table becomes a serious hotspot for contention. The select-only test has no such restriction as it only operates on the big accounts table. Often peak throughput is closer to a very small multiple of the number of cores though, possibly even clients=cores, presumably because it's more efficient to approximately peg one backend per core rather than switch among more than one on each - reduced L1 cache contention etc. That's the behavior you measured when your test showed better results with c=10 than c=16 on an 8 core system, rather than suffering from the c must be <= s contention limitation.

Well the real problem is that pgbench itself does not scale too well to lots of concurrent connections and/or to high transaction rates, so it seriously skews the result. If you look at http://www.kaltenbrunner.cc/blog/index.php?/archives/26-Benchmarking-8.4-Chapter-1Read-Only-workloads.html it is pretty clear that 90k (or the 83k I got due to the slower E5530) tps is actually a pgbench limit and that the backend really can go almost twice as fast (I only demonstrated ~140k tps using sysbench there, but I later managed to do ~160k tps with queries that are closer to what pgbench does in the lab). Stefan
Re: [PERFORM] 8.4 COPY performance regression on Solaris
Alan Li wrote: Hi, It seems that a COPY of 8M rows to a table on 8.4rc1 takes 30% longer than it does on 8.3.7 on Solaris. Here are the steps I've taken to reproduce this problem on two different Solaris boxes (Solaris 10 11/06 s10x_u3wos_10 X86 and Solaris 10 8/07 s10x_u4wos_12b X86). I've tried this on a Linux box, and I do not see the problem there.

I tried that on my box (though I increased the testset size by 10x to get more sensible runtimes) and I can reproduce it on Linux (CentOS 5.3/x86_64, Nehalem Xeon E5530) as well. I get ~45 rows/s on 8.3 and only ~33 rows/s on 8.4.

On 8.4 I get 3m59s/4m01s/3m56s runtime and a profile of:

samples  %        symbol name
636302   19.6577  XLogInsert
415510   12.8366  CopyReadLine
225347    6.9618  DoCopy
131143    4.0515  ParseDateTime
122043    3.7703  DecodeNumber
 81730    2.5249  DecodeDate
 81045    2.5038  DecodeDateTime
 80900    2.4993  pg_verify_mbstr_len
 80235    2.4787  pg_next_dst_boundary
 67571    2.0875  LWLockAcquire
 64548    1.9941  heap_insert
 64178    1.9827  LWLockRelease
 63609    1.9651  PageAddItem
 63402    1.9587  heap_form_tuple
 56544    1.7468  timestamp_in
 48697    1.5044  heap_fill_tuple
 45248    1.3979  pg_atoi
 42390    1.3096  IsSystemRelation
 41287    1.2755  BufferGetBlockNumber
 38936    1.2029  ValidateDate
 36619    1.1313  ExecStoreTuple
 35367    1.0926  DecodeTime

On 8.3.7 I get 2m58s/2m54s/2m55s and a profile of:

samples  %        symbol name
460966   16.2924  XLogInsert
307386   10.8643  CopyReadLine
301745   10.6649  DoCopy
153452    5.4236  pg_next_dst_boundary
119757    4.2327  DecodeNumber
105356    3.7237  heap_formtuple
 83456    2.9497  ParseDateTime
 83020    2.9343  pg_verify_mbstr_len
 72735    2.5708  DecodeDate
 70425    2.4891  LWLockAcquire
 65820    2.3264  LWLockRelease
 61823    2.1851  DecodeDateTime
 55895    1.9756  hash_any
 51305    1.8133  PageAddItem
 47440    1.6767  AllocSetAlloc
 47218    1.6689  heap_insert
 38912    1.3753  DecodeTime
 34871    1.2325  ReadBuffer_common
 34519    1.2200  date2j
 33093    1.1696  DetermineTimeZoneOffset
 31334    1.1075  MemoryContextAllocZero
 30951    1.0939  RelationGetBufferForTuple

If I do the same test utilizing WAL bypass the picture 
changes. 8.3 runtimes: 2m16s/2m14s/2m22s and profile:

samples  %        symbol name
445583   16.      CopyReadLine
332772   12.5300  DoCopy
156974    5.9106  pg_next_dst_boundary
131952    4.9684  heap_formtuple
119114    4.4850  DecodeNumber
 94340    3.5522  ParseDateTime
 81624    3.0734  pg_verify_mbstr_len
 75012    2.8245  DecodeDate
 74950    2.8221  DecodeDateTime
 64467    2.4274  hash_any
 62859    2.3669  PageAddItem
 62054    2.3365  LWLockAcquire
 57209    2.1541  LWLockRelease
 45812    1.7250  hash_search_with_hash_value
 41530    1.5637  DetermineTimeZoneOffset
 40790    1.5359  heap_insert
 39694    1.4946  AllocSetAlloc
 38855    1.4630  ReadBuffer_common
 36056    1.3576  MemoryContextAllocZero
 36030    1.3567  DecodeTime
 29057    1.0941  UnpinBuffer
 28291    1.0653  PinBuffer

8.4 runtimes: 2m01s/2m00s/1m59s and profile:

samples  %        symbol name
404775   17.9532  CopyReadLine
208482    9.2469  DoCopy
148898    6.6042  ParseDateTime
118645    5.2623  DecodeNumber
 80972    3.5914  DecodeDate
 79005    3.5042  pg_verify_mbstr_len
 73645    3.2664  PageAddItem
 72167    3.2009  DecodeDateTime
 65264    2.8947  heap_form_tuple
 52680    2.3365  timestamp_in
 46264    2.0520  pg_next_dst_boundary
 45819    2.0322  ExecStoreTuple
 45745    2.0290  heap_fill_tuple
 43690    1.9378  heap_insert
 38453    1.7055  InputFunctionCall
 37050    1.6433  LWLockAcquire
 36853    1.6346  BufferGetBlockNumber
 36428    1.6157  heap_compute_data_size
 33818    1.5000  DetermineTimeZoneOffset
 33468    1.4844  DecodeTime
 30896    1.3703  tm2timestamp
 30888    1.3700  GetCurrentTransactionId

Stefan
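Reducing the runtimes quoted above to ratios makes the point of the WAL-bypass experiment explicit (a small sketch with a hypothetical helper; the seconds are converted from the times in the message):

```python
# The 8.4 regression shows up only when WAL is written, which together
# with XLogInsert's larger profile share points at WAL insertion rather
# than COPY parsing as the culprit.
def avg_seconds(times):
    return sum(times) / len(times)

wal_83 = avg_seconds([178, 174, 175])    # 2m58s, 2m54s, 2m55s
wal_84 = avg_seconds([239, 241, 236])    # 3m59s, 4m01s, 3m56s
nowal_83 = avg_seconds([136, 134, 142])  # 2m16s, 2m14s, 2m22s
nowal_84 = avg_seconds([121, 120, 119])  # 2m01s, 2m00s, 1m59s

print(round(wal_84 / wal_83 - 1, 2))     # 0.36 -> ~36% slower with WAL
print(round(nowal_84 / nowal_83 - 1, 2)) # negative -> 8.4 faster w/o WAL
```

In other words, with WAL bypassed 8.4 actually comes out ahead of 8.3 on this box, isolating the slowdown to the WAL path.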
Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware
Greg Stark wrote: On Mon, May 11, 2009 at 5:05 PM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote: Good to know!!! I imagine that on a PS3 it would be _really_ fast... :-) well not really - while it is fairly easy to get postgresql running on a PS3, it is not a fast platform. While the main CPU there is a pretty fast Power based core, it only has 256MB of RAM and a single SATA disk available (though you could add some USB disks). The nice thing about it is that TPC-C and other benchmarks all specify their bottom-line number in some unit like transactions per second PER DOLLAR. So using a PS3 should be able to get ridiculously good results compared to expensive server hardware... I kinda doubt that - the PS3 is certainly not server grade hardware, so you can only compare it to a desktop, and I would bet that the typical desktop you get for the 400€ price of a PS3 (you can get 4GB RAM and a quad-core CPU for that) is going to outperform it significantly for almost every workload... Stefan
Re: [PERFORM] Any better plan for this query?..
Dimitri wrote: Folks, before you start to think what a dumb guy doing a dumb thing :-)) I'll explain a few details: for more than 10 years I've been using the db_STRESS kit (http://dimitrik.free.fr/db_STRESS.html) to check database performance and scalability. Until now I was very happy with the results it gave me, as it stresses each database engine's internals very well and brings to light some things I would probably miss on other workloads. What can I say - with time, a fast query that executed in 500ms before now runs within 1-2ms; not only has the hardware improved, database engines have also increased their performance a lot! :-))

I was attempting to look into that benchmark kit a bit, but I find the information on that page a bit lacking :( A few notes:

* is the source code for the benchmark actually available? The kit seems to contain a few precompiled binaries and some source/header files, but there are no build instructions, no Makefile or even a README, which makes it really hard to verify exactly what the benchmark is doing, or whether the benchmark client might actually be the problem here.
* there is very little information on how the toolkit talks to the database - some of the binaries seem to contain a static copy of libpq or such?
* how many queries per session is the toolkit actually using - some earlier comments seem to imply you are doing a connect/disconnect cycle for every query; is that actually true?

Stefan
Re: [PERFORM] Any better plan for this query?..
Dimitri wrote: Hi Stefan, sorry, I did not have time to bring all the details into the toolkit - but at least I published it instead of just telling a nice story about it :-) fair point and appreciated. But it seems important that benchmarking results can be verified by others as well... The client process is a binary compiled with libpq. The client interprets a scenario script and publishes via SHM the time spent on each SQL request. I did not publish the sources yet as that would also require explaining how to compile them :-)) So for the moment it's shipped as freeware, but with time everything will be available (BTW, you're the first to ask for the sources (well, except the IBM guys who asked to get it on POWER boxes, but that's another story :-)) well there is no licence tag (or a copyright notice) or anything else associated with the download, which makes it a bit harder than it really needs to be. The reason why I was actually looking for the source is that none of my available benchmark platforms are among the ones you are providing binaries for, which kinda reduces its usefulness. What is good is that each client publishes its internal stats *live*, so we're able to get live data and follow any kind of waves in performance. Each session is a single process, so there is no contention between clients as you may see with some other tools. The current scenario script contains 2 selects (representing a Read transaction) and delete/insert/update (representing a Write transaction). According to the start parameters, each client executes a given number of Reads per Write. It connects at the beginning and disconnects at the end of the test. well I have seen clients getting bottlenecked internally (like wasting more time on absorbing/getting rid of the actual result than it took the server to generate the answer...). How sure are you that your live publishing of data does not affect the benchmark results (because it kinda generates an artificial think time), for example?
But what I get from your answer is that you are basically doing one connect/disconnect per client, and the testcase you are talking about has 256 clients? Stefan
Re: [PERFORM] Any better plan for this query?..
Matthew Wakeling wrote: On Tue, 12 May 2009, Simon Riggs wrote: won't connect operations all be handled by a single thread - the parent postmaster? No, we spawn, then authenticate. But you still have a single thread doing the accept() and spawn. At some point (maybe not now, but in the future) this could become a bottleneck given very short-lived connections. well the main cost is backend startup, and that one is extremely expensive (compared to the cost of a simple query, and also depending on the OS). We have more overhead there than other databases (most notably MySQL), hence my question on how the benchmark was operating. For any kind of workload that contains frequent connection establishment one wants to use a connection pooler like pgbouncer (as mentioned elsewhere in the thread already). Stefan
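For workloads like this, a minimal pgbouncer setup might look as follows; every value here is an illustrative placeholder, not a setting recommended anywhere in the thread:

```ini
; pgbouncer.ini -- illustrative sketch only
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling absorbs frequent connect/disconnect cycles,
; so backend startup cost is paid only for the small server pool
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
```

Clients then connect to port 6432 instead of 5432 and reuse the pooled backends, which sidesteps the per-connection backend startup cost discussed above.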
Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware
Paolo Rizzi wrote: Are you saying that PostgreSQL+PostGIS can actually run on a smartphone??? Intriguing... Did anyone ever actually try that??? If it's a supported CPU type and you've got a suitable build toolchain, sure. Seven or eight years ago we were getting a good laugh out of the fact that you could run PG on a PlayStation 2. Good to know!!! I imagine that on a PS3 it would be _really_ fast... :-) well not really - while it is fairly easy to get postgresql running on a PS3, it is not a fast platform. While the main CPU there is a pretty fast Power-based core, it only has 256MB of RAM and a single SATA disk available (though you could add some USB disks). Stefan
Re: [PERFORM] How to unique-ify HUGE table?
Scott Marlowe wrote: On Tue, Dec 23, 2008 at 11:14 AM, George Pavlov gpav...@mynewplace.com wrote: You don't say what PG version you are on, but just for kicks you may try using GROUP BY instead of DISTINCT. Yes, the two should perform the same, but with 8.1 (or maybe 8.0) I had seen situations where GROUP BY was faster (admittedly this happened with more complex queries). So, try this: Even in 8.3 it looks like GROUP BY is faster. Tested it on a decent-sized table and GROUP BY used a hash aggregate and ran in ~600 ms, while DISTINCT used a sort and ran in 1300 ms. That was on 500k rows. On a much larger table, one with about 10M rows, a similar statement runs in 1500 ms with GROUP BY and in 2390 ms with DISTINCT. Not surprising - this is a known limitation in all released versions of PostgreSQL (GROUP BY can use hashing and sorting - DISTINCT only sorting). 8.4 is going to improve that, though. Stefan
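The rewrite being suggested is mechanical; with a placeholder table and column it looks like:

```sql
-- DISTINCT: pre-8.4 planners can only sort to remove duplicates
SELECT DISTINCT col FROM big_table;

-- GROUP BY: the planner may choose a HashAggregate instead of a sort,
-- which is what produced the faster timings quoted above
SELECT col FROM big_table GROUP BY col;
```

Running each form under EXPLAIN ANALYZE shows whether the planner picked a Sort/Unique or a HashAggregate node.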
Re: [PERFORM] Explain Analyze - Total runtime very differentes
[EMAIL PROTECTED] wrote: Hello friends ... I'm evaluating the performance of algorithms for optimization of queries. I am comparing results between the algorithm of Dynamic Programming and an implementation of Kruskal's algorithm. When submitting a query that references only 2 tables of my database, logically the same query plan is shown. But the total runtime displayed by EXPLAIN ANALYZE shows a very large variation: Dynamic Programming Total runtime: 1204.220 ms; Kruskal Total runtime: 3744.879 ms. No change of data (insert, delete, update) in the tables was made during the tests. The same query was submitted several times (with the Kruskal and Dynamic Programming algorithms) and the variation in results persists. EXPLAIN ANALYZE only reports the time to *execute* the query. With the same query plan, I do not understand why this variation occurs. The query plans are attached. are you sure it is not something as simple as a caching effect - i.e. you ran the slow variant first and pg and/or the OS buffered the data, and the repeated executions just benefited from that? Try running all variations a few dozen times, both in cached and uncached state, and you should see the difference level out. Stefan
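The suggested test procedure can be sketched like this (the query itself is elided; only the repetition pattern matters):

```sql
-- run each variant several times and discard the first timings:
-- the first execution pays the cost of reading cold data, later
-- ones hit shared_buffers and/or the OS page cache
EXPLAIN ANALYZE SELECT ... ;  -- cold cache: discard
EXPLAIN ANALYZE SELECT ... ;  -- warm cache: compare these
EXPLAIN ANALYZE SELECT ... ;  -- warm cache: compare these
```

Only timings taken in the same cache state are comparable between the two optimizer variants.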
Re: [PERFORM] pg_dump error - out of memory, Failed on request of size 536870912
Marcin Citowicki wrote: Hello, I forgot to add - all those 'out of memory' errors happen when the backup db is trying to create an index. Every 'CREATE INDEX' operation is followed by an 'out of memory' error. are you sure that your OS (or ulimit) is able to support a maintenance_work_mem setting that large? - try reducing it to, say, 128MB for a start and try again. Stefan
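The suggested starting point is just a one-line change in postgresql.conf; 128MB is the value proposed above, not a tuned recommendation:

```ini
# postgresql.conf -- start low and raise again only within what the
# OS / ulimit actually allows for a single backend
maintenance_work_mem = 128MB
```

After a reload (`pg_ctl reload` or SELECT pg_reload_conf() on newer versions), retry the CREATE INDEX and see whether the out-of-memory errors disappear.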
[PERFORM] query plan, index scan cost
hi list, i have a problem with a time-consuming query. first of all my table structure:

CREATE TABLE nw_tla_2008_4_deu (
  ID bigint NOT NULL,
  NET2CLASS smallint,
  FOW smallint,
  CONSTRAINT nw_tla_2008_4_deu_pkey PRIMARY KEY (ID)
) WITHOUT OIDS;
CREATE INDEX nw_tla_2008_4_deu_fow_idx ON nw_tla_2008_4_deu USING btree (FOW);
CREATE INDEX nw_tla_2008_4_deu_net2class_idx ON nw_tla_2008_4_deu USING btree (NET2CLASS);
CREATE INDEX nw_tla_2008_4_deu_the_geom_gist ON nw_tla_2008_4_deu USING gist (the_geom gist_geometry_ops);
ALTER TABLE nw_tla_2008_4_deu CLUSTER ON nw_tla_2008_4_deu_the_geom_gist;

when i run the following query with explain analyze i get the following result:

EXPLAIN ANALYZE SELECT nw.ID AS id FROM nw_tla_2008_4_deu AS nw WHERE expand(st_pointfromtext('POINT(13.7328934 51.049476)',4326), 0.2476961598054) && nw.the_geom AND nw.FOW IN (1,2,3,4,10,17) AND nw.NET2CLASS IN (0,1,2,3)

Bitmap Heap Scan on nw_tla_2008_4_deu nw (cost=35375.52..77994.15 rows=11196 width=8) (actual time=13307.830..13368.969 rows=15425 loops=1)
  Recheck Cond: (NET2CLASS = ANY ('{0,1,2,3}'::integer[]))
  Filter: (('010320E610010005C06BF82A4000A0A066494000C06BF82A4000C009A6494000E00FF62B4000C009A6494000E00FF62B4000A0A066494000C06BF82A4000A0A0664940'::geometry && the_geom) AND (FOW = ANY ('{1,2,3,4,10,17}'::integer[])))
  ->  BitmapAnd (cost=35375.52..35375.52 rows=12614 width=0) (actual time=13307.710..13307.710 rows=0 loops=1)
        ->  Bitmap Index Scan on nw_tla_2008_4_deu_the_geom_gist (cost=0.00..1759.12 rows=55052 width=0) (actual time=22.452..22.452 rows=52840 loops=1)
              Index Cond: ('010320E610010005C06BF82A4000A0A066494000C06BF82A4000C009A6494000E00FF62B4000C009A6494000E00FF62B4000A0A066494000C06BF82A4000A0A0664940'::geometry && the_geom)
        ->  Bitmap Index Scan on nw_tla_2008_4_deu_net2class_idx (cost=0.00..33610.55 rows=1864620 width=0) (actual time=13284.121..13284.121 rows=2021814 loops=1)
              Index Cond: (NET2CLASS = ANY ('{0,1,2,3}'::integer[]))
Total runtime: 13332 ms

running the next query, which is only
slightly different and has one instead of two AND conditions, leads to the following result:

EXPLAIN ANALYZE SELECT nw.ID AS id FROM nw_tla_2008_4_deu AS nw WHERE expand(st_pointfromtext('POINT(13.7328934 51.049476)',4326), 0.2476961598054) && nw.the_geom AND nw.FOW IN (1,2,3,4,10,17)

Bitmap Heap Scan on nw_tla_2008_4_deu nw (cost=1771.34..146161.54 rows=48864 width=8) (actual time=23.285..99.493 rows=47723 loops=1)
  Filter: (('010320E610010005C06BF82A4000A0A066494000C06BF82A4000C009A6494000E00FF62B4000C009A6494000E00FF62B4000A0A066494000C06BF82A4000A0A0664940'::geometry && the_geom) AND (FOW = ANY ('{1,2,3,4,10,17}'::integer[])))
  ->  Bitmap Index Scan on nw_tla_2008_4_deu_the_geom_gist (cost=0.00..1759.12 rows=55052 width=0) (actual time=22.491..22.491 rows=52840 loops=1)
        Index Cond: ('010320E610010005C06BF82A4000A0A066494000C06BF82A4000C009A6494000E00FF62B4000C009A6494000E00FF62B4000A0A066494000C06BF82A4000A0A0664940'::geometry && the_geom)
Total runtime: 109 ms

so both queries have AND conditions: two in the first query and one in the second. unfortunately i am not an expert in reading the PostgreSQL query plan. basically i am wondering why a second index scan is done in the first query whereas in the second query it is not. the second query runs a hundred times faster than the first one, which is surprising to me. any ideas? regards, stefan
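One experiment worth trying (purely a sketch, not something tested in the thread) is to check whether the plan improves when the planner cannot use the unselective NET2CLASS index as a second bitmap input, or when one btree index covers both non-spatial conditions:

```sql
-- per-session experiment: see what plan is chosen without bitmap scans
SET enable_bitmapscan = off;
EXPLAIN ANALYZE SELECT ... ;  -- re-run the slow query here
RESET enable_bitmapscan;

-- alternative: a combined index, so one btree scan can serve
-- both the FOW and the NET2CLASS condition (hypothetical name)
CREATE INDEX nw_tla_2008_4_deu_fow_net2class_idx
    ON nw_tla_2008_4_deu USING btree (FOW, NET2CLASS);
```

Note that disabling bitmap scans also affects the gist index, so this is a diagnostic step rather than a fix; the plan output shows whether the NET2CLASS scan (which matched ~2M rows) was the real cost.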
Re: [PERFORM] Planning a new server - help needed
Laszlo Nagy wrote: [...] The RAID controller that I have selected can only handle 8 disks. I guess I need to find a different one with 16 channels and use more disks. So are you saying that with all disks in a bigger RAID 6 array, I will get the most out of the hardware? In that case, I'll try to get a bit more money from the management and build RAID 6 with 12 disks. Here is a good line of SATA controllers for 4/8/12/16 disks: http://www.tekram.com/product2/product_detail.asp?pid=51 Stefan
Re: [PERFORM] URI to kind of a benchmark
Harald Armin Massa wrote: reading postgres benchmarks for beginners advises one to stop reading at the word default (i.e. an unchanged postgresql.conf); but the real test is given right after: http://www.kaltenbrunner.cc/blog/index.php?/archives/21-guid.html That confirms my first impression (on a different workload) that the speed has doubled. If reality confirms it, 8.2 to 8.3 will be a small step in version numbers, and a great step for databases. yeah, but keep in mind that this one only tested a very specific scenario (update-heavy, workload fits easily in the buffer cache and benefits from HOT) - it is a fairly neat improvement though ... Stefan ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] DRBD and Postgres: how to improve the perfomance?
Joshua D. Drake wrote: Gregory Stark wrote: Simon Riggs [EMAIL PROTECTED] writes: You're right, but the distinction is a small one. What are the chances of losing two independent servers within a few milliseconds of each other? If they're on the same power bus? That chance is minuscule, or at least should be. Of course we are assuming some level of conditioned power that is independent of the power bus, e.g. a UPS. how is that making any difference in practice? - if both are on the same UPS they are effectively on the same power bus ... If the UPS fails (or the generator does not kick in, which happens way more often than people would believe) they could still fail at the very same time. Stefan
Re: [PERFORM] DRBD and Postgres: how to improve the perfomance?
Joshua D. Drake wrote: Stefan Kaltenbrunner wrote: Joshua D. Drake wrote: Gregory Stark wrote: Simon Riggs [EMAIL PROTECTED] writes: You're right, but the distinction is a small one. What are the chances of losing two independent servers within a few milliseconds of each other? If they're on the same power bus? That chance is minuscule, or at least should be. Of course we are assuming some level of conditioned power that is independent of the power bus, e.g. a UPS. how is that making any difference in practice? - if both are on the same UPS they are effectively on the same power bus ... Well I was thinking of the bus that is in the wall. I would assume that people were smart enough to have independent UPS systems for each server: city power -> line conditioning -> generator -> panel -> plug -> UPS -> server; wash, rinse, repeat. the typical datacenter version of this is actually more like: city power -> UPS (with generator in parallel) -> panel -> plug, or city power -> flywheel -> (maybe UPS) -> panel -> plug. it is not really that common to have, say, two different UPS feeds in your rack (at least not for normal housing or the average corporate datacenter) - mostly you get two feeds from different power distribution panels (so different breakers), but that's about it. Having a local UPS attached is usually not that helpful either, because those have limited capacity, need space and are an additional thing that can (and will) fail. Stefan
Re: [PERFORM] [HACKERS] Proposal: Pluggable Optimizer Interface
Julius Stroffek wrote: Hi All, Tomas Kovarik and I presented at PGCon 2007 in Ottawa the ideas about other possible optimizer algorithms to be used in PostgreSQL. We are quite new to the PostgreSQL project, so it took us some time to go through the sources and explore the possibilities of how things could be implemented. There is a proposal attached to this mail about the interface we would like to implement for switching between different optimizers. Please review it and provide feedback to us. Thank you. hmm - how is that proposal different from what got implemented with: http://archives.postgresql.org/pgsql-committers/2007-05/msg00315.php Stefan
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Marc Mamin wrote: Postgres configuration for 64 CPUs, 128 GB RAM... there are probably not that many installations out there that large - comments below. Hello, We have the opportunity to benchmark our application on a large server. I have to prepare the Postgres configuration and I'd appreciate some comments on it, as I am not experienced with servers of such a scale. Moreover the configuration should be fail-proof as I won't be able to attend the tests. Our application (java + perl) and Postgres will run on the same server, whereas the application activity is low when Postgres has large transactions to process. There is a large gap between our current production server (Linux, 4GB RAM, 4 cpus) and the benchmark server; one of the targets of this benchmark is to verify the scalability of our application. [...] Postgres version: 8.2.1 - upgrade to 8.2.4. File system: http://en.wikipedia.org/wiki/ZFS - way more important is what kind of disk-I/O subsystem you have attached ... Planned configuration: # we don't expect more than 150 parallel connections, # but I suspect a leak in our application that lets some idle connections open max_connections=2000 ssl = off #maximum allowed shared_buffers= 262143 - this is probably on the low side for a 128GB box # on our current best production server with 4GB RAM (not dedicated to Postgres), work_mem is set to 600 MB # this limitation is probably the bottleneck for our application as the files in pgsql_tmp grow up to 15 GB # during large aggregations (we have a locking mechanism to avoid parallel processing of such transactions) work_mem = 31457280 # (30 GB) - this is simply ridiculous: work_mem is PER SORT, so if your query requires 8 sorts it will feel free to use 8x30GB, and that needs to be multiplied by the number of concurrent connections. # index creation time is also an issue for us; the process is locking other large processes too.
# our largest table so far is 13 GB + 11 GB indexes maintenance_work_mem = 31457280 # (30 GB) - this is ridiculous too: testing has shown that there is not much point in going beyond 1GB or so # more than the max number of tables + indexes expected during the benchmark max_fsm_relations = 10 max_fsm_pages = 180 - this is probably way too low for a database the size of yours; watch the output of VACUUM VERBOSE on a database-wide vacuum for some stats on that. # don't know if I should modify this. # seems to be sufficient on our production servers max_stack_depth = 2MB # vacuum will be done by hand between each test session autovacuum = off # required to analyse the benchmark log_min_duration_statement = 1000 max_prepared_transactions = 100 # seems to be required to drop schema/roles containing large numbers of objects max_locks_per_transaction = 128 # I use the default for the bgwriter as I couldn't find recommendations on those #bgwriter_delay = 200ms # 10-10000ms between rounds #bgwriter_lru_percent = 1.0 # 0-100% of LRU buffers scanned/round #bgwriter_lru_maxpages = 5 # 0-1000 buffers max written/round #bgwriter_all_percent = 0.333 # 0-100% of all buffers scanned/round #bgwriter_all_maxpages = 5 # 0-1000 buffers max written/round #WAL fsync = on #use default #wal_sync_method # we are using 32 on our production system wal_buffers=64 - values up to 512 or so have been reported to help on systems with very high concurrency. what is missing here are your settings for effective_cache_size and random_page_cost. Stefan
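The worst-case arithmetic behind the work_mem warning is easy to sketch; the sort and connection counts below are illustrative, taken from the numbers mentioned in the reply:

```sql
-- work_mem is a per-sort (and per-hash) limit, not a per-server cap,
-- so the worst case multiplies out:
SELECT 30        -- proposed work_mem, in GB
     * 8         -- sorts a single complex query might run
     * 150       -- expected parallel connections
  AS worst_case_gb;  -- 36000 GB, vastly more than the 128 GB of RAM
```

Even if real queries rarely hit the worst case, the setting leaves nothing in reserve; keeping work_mem modest and raising it per-session for the known large aggregations is the usual compromise.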
Re: [PERFORM] PostgreSQL publishes first real benchmark
Jignesh K. Shah wrote: Can you list others that seemed out of place? well to me the ones that look most questionable are: work_mem=100MB - so this benchmark is really low concurrency (which does not fit with max_connections=1000) and with trivial queries? enable_seqscan = off - why? effective_cache_size = 40GB - on a box with 16GB this seems wrong, especially since there are some indications out there suggesting that while overestimating effective_cache_size was not a problem in versions before 8.2, it might well be one in 8.2 and up. wal_buffers = 2300 - there have been some numbers reported that going over the default of 8 helps, but it is generally considered that going beyond 500 or maybe 1000 does not help at all ... and one more is that you claim you used -fast -O4 -xtarget=ultraT1, which is something we explicitly advise against in our own FAQ (http://www.postgresql.org/docs/faqs.FAQ_Solaris.html): Do not use any flags that modify behavior of floating point operations and errno processing (e.g., -fast). These flags could cause some nonstandard PostgreSQL behavior, for example in the date/time computations. Stefan
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance wrote: Now I am at the difficult part: what parameters to calculate and how to calculate them. Everything below has to do with PostgreSQL version 8.2. The parameters I would think we should calculate are: max_connections shared_buffers work_mem maintenance_work_mem effective_cache_size random_page_cost Any other variables? I am open to suggestions. we should also scale max_fsm_pages according to the database size and workload answers - I also note that the configuration file it generates seems to look like one for PostgreSQL 7.x or so - I think we should just include the specific parameters to change. Stefan
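As a sketch of the kind of rules such a tool might encode, here is what the listed parameters could work out to for a hypothetical 8 GB box with 200 connections; the ratios are common folklore, not official recommendations:

```sql
-- crude 8.2-era starting points, all values in MB for an 8192 MB box
SELECT 8192 / 4          AS shared_buffers_mb,        -- ~25% of RAM
       8192 * 3 / 4      AS effective_cache_size_mb,  -- ~75% of RAM
       8192 / (200 * 4)  AS work_mem_mb;              -- modest: multiplies
                                                      -- per sort and per
                                                      -- connection
```

A real tool would also clamp maintenance_work_mem (around 1GB is usually enough) and scale max_fsm_pages from the expected database size, as noted above.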
Re: [PERFORM] upgraded to pgsql 8.2.4, getting worse performance then 7.4.x
Michael Fuhr wrote: On Sat, Jun 02, 2007 at 09:13:32AM -0400, Douglas J Hunley wrote: Our 'esteemed' Engr group recently informed a customer that in their testing, upgrading to 8.2.x improved the performance of our J2EE application approximately 20%, so of course, the customer then tasked me with upgrading them. We dumped their db, removed pgsql, installed the 8.2.4 rpms from postgresql.org, did an initdb, and then pg_restored their data. It's been about a week now, and the customer is complaining that in their testing, they are seeing a 30% /decrease/ in general performance. After the restore, did you ANALYZE the entire database to update the planner's statistics? Have you enabled autovacuum or are you otherwise vacuuming and analyzing regularly? What kind of queries are slower than desired? If you post an example query and the EXPLAIN ANALYZE output then we might be able to see if the slowness is due to query plans. A few differences between the configuration files stand out. The 7.4 file has the following settings: shared_buffers = 25000 sort_mem = 15000 effective_cache_size = 196608 The 8.2 config has: #shared_buffers = 32MB #work_mem = 1MB #effective_cache_size = 128MB To be equivalent to the 7.4 config the 8.2 config would need: shared_buffers = 195MB work_mem = 15000kB effective_cache_size = 1536MB With 8GB of RAM you might try increasing shared_buffers to 400MB - 800MB (less if the entire database isn't that big) and effective_cache_size to 5GB - 6GB. You might have to increase the kernel's shared memory settings before increasing shared_buffers. some testing here has shown that while it is usually a good idea to set effective_cache_size rather optimistically in versions before 8.2, it is advisable to make it accurate or even a bit lower than that in 8.2 and up. Stefan
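The unit conversions above follow directly from the default 8 kB page size used by 7.4's page-count settings:

```sql
-- 7.4 expressed shared_buffers and effective_cache_size in 8 kB pages;
-- 8.2 accepts memory units directly, so the equivalents are:
SELECT 25000  * 8 / 1024 AS shared_buffers_mb,       -- ~195 MB
       196608 * 8 / 1024 AS effective_cache_size_mb; -- 1536 MB
```

(sort_mem was already in kB, so sort_mem = 15000 maps straight to work_mem = 15000kB.)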
Re: [PERFORM] Domains versus Check Constraints
Jim C. Nasby wrote: On Tue, May 22, 2007 at 12:56:21PM -0400, Chander Ganesan wrote: Are there any performance improvements that come from using a domain over a check constraint (aside from the ease of management component)? No. Plus support for domain constraints isn't universal (plpgsql doesn't honor them, for example). since 8.2 domain constraints are enforced everywhere ... Stefan
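For reference, the two forms being compared look like this (table and domain names are illustrative):

```sql
-- a CHECK constraint attached to one particular table
CREATE TABLE orders_a (
    qty integer CHECK (qty > 0)
);

-- the same rule expressed once as a reusable domain; per the note
-- above, since 8.2 the domain's constraint is enforced everywhere
-- the type is used, including plpgsql variables
CREATE DOMAIN positive_int AS integer CHECK (VALUE > 0);
CREATE TABLE orders_b (
    qty positive_int
);
```

The choice is therefore about manageability (one definition reused across tables) rather than performance.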
Re: [PERFORM] more on high load on postgres 7.4.16
Geoffrey wrote: We are trying to attack this problem from multiple avenues, thus I'm starting a separate thread. This is with regard to the problem posted via thread: http://archives.postgresql.org/pgsql-performance/2007-04/msg00120.php One thing we are seeing with this move to the new hardware (and RHAS 4) is database connection processes that are left over by users who have exited the application. I've attached to these processes via gdb and find they all have the same backtrace. Any insights into what might be causing this issue would be appreciated. Understand, we did not have this problem on the previous hardware running RHES 3. Here is the backtrace:

#0 0x00ba47a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x0019f1de in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
#2 0x0019ca7a in _L_mutex_lock_23 () from /lib/tls/libpthread.so.0
#3 0xbfed9438 in ?? ()
#4 0x00c96a4e in pthread_cond_destroy@@GLIBC_2.3.2 () from /lib/tls/libc.so.6
#5 0x00c96a4e in pthread_cond_destroy@@GLIBC_2.3.2 () from /lib/tls/libc.so.6
#6 0x0015243f in critSec::~critSec () from /usr/local/pcm170/libdalkutil.so
#7 0x003a48b8 in Comp_ZipFiles () from /usr/local/pcm170/libcompress.so

/usr/local on RHEL should only contain software installed directly from source - what exactly is pcm170/libdalkutil? besides that - is pg actually compiled with debugging symbols on that platform? Stefan
Re: [PERFORM] SCSI vs SATA
Joshua D. Drake wrote: Good point. On another note, I am wondering why nobody's brought up the command-queuing perf benefits (yet). Is this because sata vs scsi are at par here? SATAII has similar features. I'm finding conflicting information on this -- some calling sata's ncq mostly crap, others stating the real-world results are negligible. I'm inclined to believe SCSI's pretty far ahead here but am having trouble finding recent articles on this. What I find is, a bunch of geeks sit in a room and squabble about a few percentages one way or the other. One side feels very l33t because their white paper looks like the latest swimsuit edition. Real-world specs and real-world performance show that SATAII performs very, very well. It is kind of like X86. No chip engineer that I know has ever said X86 is elegant, but guess which chip design is conquering all others in the general and enterprise marketplace? SATAII brute-forces some of its performance, for example with a 16MB write cache on each drive. sure, but for any serious usage one either wants to disable that cache (and rely on tagged command queuing, or whatever that is called in the SATAII world) or rely on the OS/RAID controller implementing some sort of FUA/write-barrier feature (which Linux, for example, only does in pretty recent kernels). Stefan