Re: [PERFORM] utilizing multiple disks for i/o performance

2006-05-22 Thread Scott Marlowe
On Fri, 2006-05-19 at 21:37, Kenji Morishige wrote:
 Where can I find any documentation to partition the tablespace disk files onto
 different physical arrays for improved performance?

There have been quite a few posts to this list in the past about this,
so searching it might be a good start.

First, you need to define "improved performance".  Are we talking
transactional throughput (OLTP), batch updates (ETL), or report
generation (OLAP)?  Or some other scenario?

For write performance, the general rules are:

You can only commit 1 transaction per rotation of a properly fsynced
disc that holds the WAL file (i.e. the pg_xlog directory).  So, putting
that on its own fast-spinning disc is step one for improved
performance.

A battery backed cache unit (BBU) on a RAID controller is a must.

RAID 1+0 is a good choice for your data partition.

For many hardware RAID controllers with the above-mentioned BBU,
moving pg_xlog to another partition is no real help.

Cheap RAID controllers are often worse than no RAID controller.  If you
can't afford a good RAID controller, you're probably better off with
software RAID than using a cheapie.

For READ performance:

Often settings in postgresql.conf are far more important than the drive
layout.  

Lots of RAM is a good thing.

Assuming you've got lots of RAM, making shared_buffers anywhere from 10
to 25% of it is a pretty good size.

work_mem usually works well at around 16 meg or so.

drop random_page_cost to about 2 for most systems.
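
Pulling the read-tuning suggestions above into one place, a hedged sketch of
the relevant postgresql.conf lines (the numbers are illustrative, not
prescriptive; they assume roughly 4GB of RAM on an 8.x server, where these
settings are expressed in 8kB pages and kB respectively):

```
# postgresql.conf sketch -- values are illustrative only
shared_buffers = 50000        # ~400MB, i.e. ~10% of 4GB RAM (8kB pages)
work_mem = 16384              # ~16MB per sort/hash step, in kB
random_page_cost = 2          # favour index scans on typical hardware
```

Adjust shared_buffers upward toward 25% of RAM if testing shows benefit.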

Lastly, read this:

http://www.varlena.com/GeneralBits/Tidbits/perf.html

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] How can I make this query faster (resend)

2006-05-22 Thread Jim C. Nasby
On Fri, May 19, 2006 at 03:56:49PM -0700, Cstdenis wrote:
 (It's been an hour and I don't see my message on the list, so I'm sending it
 again. I've moved the queries and analyze output out of the email in case it
 was rejected for being too long.)
 
 query: http://pastebin.ca/57218
 
 In the pictures table all the ratings have a shared index 
  
 CREATE INDEX idx_rating ON pictures USING btree  (rating_nudity, 
 rating_violence, rating_sex, rating_racism, rating_spoilers, rating_yaoi, 
 rating_yuri, rating_profanity);
  
 and approved and date_submitted and user_id also have their own btree indexes.
  
 In the picture_categories table pid and cat_id have their own btree indices 
 plus one together. 
 
 Full table definition: http://pastebin.ca/57219
 
 the cat_id and rating values vary from query to query. The one listed above 
 took 54 seconds in a test run just now. Here is explain analyze: 
 http://pastebin.ca/57220
 
pictures is the interesting table here. It looks like the planner would
do better to choose something other than a nested loop on it. Try
running EXPLAIN ANALYZE on the query with enable_nestloop=off and see
what you get (you'll need to compare it to what you get with
enable_nestloop on to see what the change is).
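
The experiment Jim describes fits in one session; a hedged sketch (the actual
query lives in the pastebin, so the SELECT below is a placeholder):

```sql
-- Disable nested loops for this session only, then re-run the query.
SET enable_nestloop = off;
EXPLAIN ANALYZE SELECT ...;   -- the query from http://pastebin.ca/57218
-- Restore the default and compare plans:
SET enable_nestloop = on;
EXPLAIN ANALYZE SELECT ...;
```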

 Both pictures and picture categories have about 287,000 rows
  
 This query needs to run in under about a second or it kills my site by 
 clogging apache slots (apache maxes out at 256 and I can have several hundred 
 people on my site at a time). How can I make it run faster?
  
  
 Server is a dual Xeon with a gig of RAM dedicated mostly to PostgreSQL.
 Here are the changed lines in my postgresql.conf: http://pastebin.ca/57222

I suspect the low work_mem may be why it's using a nested loop. In
addition to the test above, it would be interesting to see what happens
to the plan if you set work_mem to 1.
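
That theory is easy to test in a single session; a sketch (the value is
illustrative, and on 8.1 the setting is in kB):

```sql
SET work_mem = 16384;         -- ~16MB, for this session only
EXPLAIN ANALYZE SELECT ...;   -- the query from http://pastebin.ca/57218
```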

To be honest, you're pushing things by expecting a machine with only 1G
of RAM to serve 300 active connections. How large is the database itself?

 I know hyperthreading is considered something that can slow down a server but
 with my very high concurrency (averages about 400-500 concurrent users during
 peak hours) I am hoping the extra virtual CPUs will help. Anyone have
 experience that says different at high concurrency?

Best bet is to try it and see. Generally, people find HT hurts, but I
recently saw it double the performance of pgbench on a Windows XP
machine, so it's possible that Windows is just more clever about how to
use it than Linux is.
-- 
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461



Re: [PERFORM] utilizing multiple disks for i/o performance

2006-05-22 Thread Jim C. Nasby
On Fri, May 19, 2006 at 07:37:45PM -0700, Kenji Morishige wrote:
 Where can I find any documentation to partition the tablespace disk files onto
 different physical arrays for improved performance?

Other than CREATE TABLESPACE??
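
For concreteness, a minimal sketch of what Jim is pointing at (the path and
names here are made up for illustration):

```sql
-- Create a tablespace on a separate physical array,
-- then place a table on it.
CREATE TABLESPACE fastdisk LOCATION '/mnt/array2/pgdata';
CREATE TABLE busy_table (id int, payload text) TABLESPACE fastdisk;

-- An existing table can be moved with:
ALTER TABLE busy_table SET TABLESPACE fastdisk;
```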



Re: [PERFORM] Performs WAY better with enable_seqscan = off

2006-05-22 Thread Jim C. Nasby
On Sun, May 21, 2006 at 02:01:14PM -0600, Brendan Duddridge wrote:
 When the query planner uses the indexes with the concatenated values  
 and the where clause, the query can be sub-second response times (but  
 not always depending on the category_id value). By just doing a  
 regular join as you suggested, it's always slower. The trick is  
 getting Postgres to use the proper index all the time. And so far the  
 only way I can do that is by turning off sequential scans, but that's  
 something I didn't want to do because I don't know how it would  
 affect the performance of the rest of my application.
 
You can always disable them for just that query...
BEGIN;
SET LOCAL enable_seqscan=off;
SELECT ...
COMMIT;

 Just a note, I have random_page_cost set to 1 to try and get it to  
 favour index scans. The database machine has 8GB of RAM and I have  
 effective_cache_size set to 2/3 of that.

That's rather low for that much memory; I'd set it to 7GB.
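
On 8.1, effective_cache_size is expressed in 8kB pages, so Jim's 7GB
suggestion would look roughly like this (the arithmetic assumes the 8GB
machine described above):

```
# postgresql.conf -- 7GB / 8kB pages = 917504
effective_cache_size = 917504
```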



Re: [PERFORM] How can I make this query faster (resend)

2006-05-22 Thread Markus Schaber
Hi, Cstdenis,

Cstdenis wrote:

 Server is a dual Xeon with a gig of RAM dedicated mostly to PostgreSQL.
 Here are the changed lines in my postgresql.conf: http://pastebin.ca/57222

3M is really low for a production server.

Try using pgpool and limiting it to about 30 or so backend connections,
and then give them at least 30 megs of RAM each.

This should also cut down the connection creation overhead.

HTH,
Markus
-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] Performs WAY better with enable_seqscan = off

2006-05-22 Thread Jim C. Nasby
The problem is that the planner is guessing horribly at what the nodes
will return, and I'm betting the reason for that is your join criteria.
Why are you joining on fields that are concatenated together, instead of
just joining on the fields themselves? That's a sure-fire way to confuse
the planner, and greatly limit your options.
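
To illustrate Jim's point, the EXISTS clause could join on the underlying
columns directly rather than on their concatenation. This is only a sketch
against the column names visible in the quoted query below, not a tested
rewrite:

```sql
-- Join on the raw columns so the planner can use ordinary statistics
-- and indexes on each column, instead of an opaque concatenated text.
... AND EXISTS (
    SELECT 1
    FROM product_attribute_value pav
    JOIN category_product cp ON cp.product_id = pav.product_id
    WHERE pav.attribute_id = ac.attribute_id
      AND pav.status_code IS NULL
      AND cp.category_id = 1001402
      AND cp.is_visible = 'true'
) ...
```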

On Sun, May 21, 2006 at 02:21:55AM -0600, Brendan Duddridge wrote:
 Hi,
 
 I have a query that performs WAY better when I have enable_seqscan =  
 off:
 
 explain analyze select ac.attribute_id, la.name, ac.sort_order from  
 attribute_category ac, localized_attribute la where ac.category_id =  
 1001402 and la.locale_id = 101 and ac.is_browsable = 'true' and  
 la.attribute_id = ac.attribute_id and exists ( select 'x' from  
 product_attribute_value pav, category_product cp where  
 (pav.product_id || '.' || pav.attribute_id) = (cp.product_id || '.'  
 || ac.attribute_id) and pav.status_code is null and (cp.category_id  
 || '.' || cp.is_visible) = '1001402.true') order by (ac.sort_order is  
 null), ac.sort_order, la.name asc;
 
 QUERY PLAN
 --------------------------------------------------------------------------
 Sort  (cost=47.97..47.98 rows=7 width=34) (actual time=33368.721..33368.721 rows=2 loops=1)
   Sort Key: (ac.sort_order IS NULL), ac.sort_order, la.name
   ->  Nested Loop  (cost=2.00..47.87 rows=7 width=34) (actual time=13563.049..33368.679 rows=2 loops=1)
         ->  Index Scan using attribute_category__category_id_fk_idx on attribute_category ac  (cost=0.00..26.73 rows=7 width=8) (actual time=13562.918..33368.370 rows=2 loops=1)
               Index Cond: (category_id = 1001402)
               Filter: (((is_browsable)::text = 'true'::text) AND (subplan))
               SubPlan
                 ->  Nested Loop  (cost=0.02..278217503.21 rows=354763400 width=0) (actual time=4766.821..4766.821 rows=0 loops=7)
                       ->  Seq Scan on category_product cp  (cost=0.00..158150.26 rows=18807 width=4) (actual time=113.595..4585.461 rows=12363 loops=7)
                             Filter: ((((category_id)::text || '.'::text) || (is_visible)::text) = '1001402.true'::text)
                       ->  Index Scan using product_attribute_value__prod_id_att_id_status_is_null_ids on product_attribute_value pav  (cost=0.02..14171.84 rows=18863 width=8) (actual time=0.012..0.012 rows=0 loops=86538)
                             Index Cond: ((((pav.product_id)::text || '.'::text) || (pav.attribute_id)::text) = ((("outer".product_id)::text || '.'::text) || ($0)::text))
         ->  Bitmap Heap Scan on localized_attribute la  (cost=2.00..3.01 rows=1 width=30) (actual time=0.129..0.129 rows=1 loops=2)
               Recheck Cond: (la.attribute_id = "outer".attribute_id)
               Filter: (locale_id = 101)
               ->  Bitmap Index Scan on localized_attribute__attribute_id_fk_idx  (cost=0.00..2.00 rows=1 width=0) (actual time=0.091..0.091 rows=1 loops=2)
                     Index Cond: (la.attribute_id = "outer".attribute_id)
 Total runtime: 33369.105 ms
 
 Now when I disable sequential scans:
 
 set enable_seqscan = off;
 
 explain analyze select ac.attribute_id, la.name, ac.sort_order from  
 attribute_category ac, localized_attribute la where ac.category_id =  
 1001402 and la.locale_id = 101 and ac.is_browsable = 'true' and  
 la.attribute_id = ac.attribute_id and exists ( select 'x' from  
 product_attribute_value pav, category_product cp where  
 (pav.product_id || '.' || pav.attribute_id) = (cp.product_id || '.'  
 || ac.attribute_id) and pav.status_code is null and (cp.category_id  
 || '.' || cp.is_visible) = '1001402.true') order by (ac.sort_order is  
 null), ac.sort_order, la.name asc;
 
 QUERY PLAN
 --------------------------------------------------------------------------
 Sort  (cost=48.09..48.11 rows=7 width=34) (actual time=1675.944..1675.945 rows=2 loops=1)
   Sort Key: (ac.sort_order IS NULL), ac.sort_order, la.name
   ->  Nested Loop  (cost=2.00..48.00 rows=7 width=34) (actual time=687.600..1675.831 rows=2 loops=1)
         ->  Index Scan using attribute_category__category_id_fk_idx on attribute_category ac  (cost=0.00..26.86 rows=7 width=8) (actual time=687.441..1675.584 rows=2 loops=1)
               Index Cond: (category_id = 1001402)
               Filter: (((is_browsable)::text = 'true'::text) AND (subplan))

[PERFORM] Query hanging/not finishing inconsistently

2006-05-22 Thread Meetesh Karia
Hi all,

We've recently started having a problem where a query that normally executes in ~15ms starts to take upwards of 20s to complete. When the connection that ran the query is returned to the connection pool, it appears as though a transaction is still in progress, so the connection pool tries to cancel the transaction and close the connection. This fails and the connection is removed from the connection pool. At this point, the situation rapidly degrades and we run out of connections to the postgres server.
An inspection of the pg_stat_activity table shows that practically every connection is running the above-mentioned query and some of those queries have been active for many minutes! We've looked at the pg_locks table as well and the only exclusive locks are on transactions that are open. All other locks are AccessShareLocks. Also, as far as we can tell (from looking at the Hibernate stats), every db session that is opened is closed.
When this happens, if I kill one of the running postgres processes (just by picking the last process returned from ps -ef | grep postgres), the other queries will immediately finish and the system will respond. However, within 15 minutes, we'll be back in the same state as before. At that point, I've cycled Apache, Tomcat and Postgres and the system then seems to come back.
This problem appears to be unrelated to load and in fact, the majority of the time there is very little load on the site when this occurs. We've run load tests on our dev boxes but we've been unable to reproduce the problem. We're currently working on playing back the clicks on the site previous to the weird state the site gets in and at the same time, we were wondering if anyone has experienced a problem like this or has any suggestions.
The query in question is:

select distinct s.screening_id, f.film_id, f.title, s.period_start, f.runtime,
    c.value, v.short_name, s.parent_id,
    stats.avg_rating, coalesce(stats.num_adds, 0) as num_adds,
    coalesce(stats.unique_visits, 0) as unique_visits,
    f.*, s.*
from lte_screening s
inner join lte_film f on s.film_id = f.film_id
inner join lte_venue v on s.venue_id = v.venue_id
inner join lte_film_classifier c on c.film_id = f.film_id
left join lte_film_stats stats on stats.context = :context
    and stats.film_id = s.film_id
where c.name = ? and s.period_start is not null and s.festival_id = ?
    and s.period_start between ? + ? and ? + ?
order by s.period_start, f.title;

And the result of explain analyze:

QUERY PLAN
Unique  (cost=1117.42..1118.71 rows=11 width=866) (actual time=18.306..18.386 rows=15 loops=1)
  ->  Sort  (cost=1117.42..1117.44 rows=11 width=866) (actual time=18.300..18.316 rows=15 loops=1)
        Sort Key: s.period_start, f.title, s.screening_id, f.film_id, f.runtime, c.value, v.short_name, s.parent_id, stats.avg_rating, COALESCE(stats.num_adds, 0), COALESCE(stats.unique_visits, 0::bigint), f.film_id, f.sku, f.title, f.template, f.release_date, f.runtime, f.language, f.country, f.mpaa_rating, f.synopsis, f.owner, f.ext_sales_rank, f.small_image_url, f.medium_image_url, f.large_image_url, f.detail_page, f.to_delete, f.coalesce_to, (subplan), (subplan), s.screening_id, s.period_start, s.period_end, s.ticket_price, s.tickets_avail, s.tickets_sold, s.type, s.venue_id, s.festival_id, s.film_id, s.parent_id, s.ext_id, s.purchase_url, s.status, s.status_update_time
        ->  Nested Loop Left Join  (cost=2.62..1117.23 rows=11 width=866) (actual time=2.656..17.773 rows=15 loops=1)
              ->  Nested Loop  (cost=2.62..976.00 rows=11 width=846) (actual time=2.347..16.162 rows=15 loops=1)
                    ->  Hash Join  (cost=2.62..929.09 rows=10 width=831) (actual time=2.217..15.480 rows=15 loops=1)
                          Hash Cond: ("outer".venue_id = "inner".venue_id)
                          ->  Nested Loop  (cost=0.00..926.32 rows=10 width=818) (actual time=1.915..14.974 rows=15 loops=1)
                                ->  Seq Scan on lte_screening s  (cost=0.00..886.67 rows=10 width=159) (actual time=1.830..14.314 rows=15 loops=1)
                                      Filter: ((period_start IS NOT NULL) AND (festival_id = 316372) AND (period_start >= '2006-05-19 05:00:00'::timestamp without time zone) AND (period_start <= '2006-05-20 04:59:59'::timestamp without time zone))
                                ->  Index Scan using lte_film_pkey on lte_film f  (cost=0.00..3.95 rows=1 width=659) (actual time=0.026..0.028 rows=1 loops=15)
                                      Index Cond: ("outer".film_id = f.film_id)
                          ->  Hash  (cost=2.50..2.50 rows=50 width=21) (actual time=0.215..0.215 rows=0 loops=1)
                                ->  Seq Scan on lte_venue v  (cost=0.00..2.50 rows=50 width=21) (actual time=0.012..0.126 rows=52 loops=1)
                    ->  Index Scan using idx_classifier_film on lte_film_classifier c  (cost=0.00..4.67 rows=2 width=23) (actual time=0.026..0.028 rows=1 loops=15)
                          Index Cond: (c.film_id = "outer".film_id)
                          Filter: ((name)::text = 'FestivalCategory'::text)
              ->  Index Scan using lte_film_stats_pkey on lte_film_stats stats  (cost=0.00..4.34 rows=1 width=28) (actual time=0.034..0.037 rows=1 loops=15)
                    Index Cond: ((stats.context = 316372) AND (stats.film_id = "outer".film_id))
              SubPlan
                ->  Index Scan using idx_collateral_film on

Re: [PERFORM] Query hanging/not finishing inconsistently

2006-05-22 Thread Craig A. James

Meetesh Karia wrote:

Hi all,

We've recently started having a problem where a query that normally 
executes in ~15ms starts to take upwards of 20s to complete.  When the 
connection that ran query is returned to the connection pool, it appears 
as though a transaction is still in progress so the connection pool 
tries to cancel the transaction and close the connection.  This fails 
and the connection is removed from the connection pool.  At this point, 
the situation rapidly degrades and we run out of connections to the 
postgres server.


An inspection of the pg_stat_activity table shows that practically every 
connection is running the above-mentioned query and some of those 
queries have been active for many minutes!  We've looked at the pg_locks 
table as well and the only exclusive locks are on transactions that are 
open.  All other locks are AccessShareLocks.  Also, as far as we can 
tell (from looking at the Hibernate stats), every db session that is 
opened is closed.


When this happens, if I kill one of the running postgres processes (just 
by picking the last process returned from ps -ef | grep postgres), the 
other queries will immediately finish and the system will respond.  
However, within 15 minutes, we'll be back in the same state as before.  
At that point, I've cycled Apache, Tomcat and Postgres and the system 
then seems to come back.


This sounds suspiciously like a question I asked a few weeks ago, on April 4.
I have a process that just gets stuck.  After some questions from various
experts in this forum, I used gdb(1) to attach to one of the frozen
Postgres backend processes, and here's what I found:

On 5/12/2006, I wrote:

Thanks, good advice.  You're absolutely right, it's stuck on a
mutex.  After doing what you suggest, I discovered that the query
in progress is a user-written function (mine).  When I log in as
root, and use gdb -p <pid> to attach to the process, here's
what I find.  Notice the second function in the stack, a mutex
lock:

(gdb) bt
#0  0x0087f7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x0096cbfe in __lll_mutex_lock_wait () from /lib/tls/libc.so.6
#2  0x008ff67b in _L_mutex_lock_3220 () from /lib/tls/libc.so.6
#3  0x4f5fc1b4 in ?? ()
#4  0x00dc5e64 in std::string::_Rep::_S_empty_rep_storage () from 
/usr/local/pgsql/lib/libchmoogle.so
#5  0x009ffcf0 in ?? () from /usr/lib/libz.so.1
#6  0xbfe71c04 in ?? ()
#7  0xbfe71e50 in ?? ()
#8  0xbfe71b78 in ?? ()
#9  0x009f7019 in zcfree () from /usr/lib/libz.so.1
#10 0x009f7019 in zcfree () from /usr/lib/libz.so.1
#11 0x009f8b7c in inflateEnd () from /usr/lib/libz.so.1
#12 0x00c670a2 in ~basic_unzip_streambuf (this=0xbfe71be0) at 
zipstreamimpl.h:332
#13 0x00c60b61 in OpenBabel::OBConversion::Read (this=0x1, pOb=0xbfd923b8, 
pin=0xffea) at istream:115
#14 0x00c60fd8 in OpenBabel::OBConversion::ReadString (this=0x8672b50, 
pOb=0xbfd923b8) at obconversion.cpp:780
#15 0x00c19d69 in chmoogle_ichem_mol_alloc () at stl_construct.h:120
#16 0x00c1a203 in chmoogle_ichem_normalize_parent () at stl_construct.h:120
#17 0x00c1b172 in chmoogle_normalize_parent_sdf () at vector.tcc:243
#18 0x0810ae4d in ExecMakeFunctionResult ()
#19 0x0810de2e in ExecProject ()
#20 0x08115972 in ExecResult ()
#21 0x08109e01 in ExecProcNode ()
#22 0x0020 in ?? ()
#23 0xbed4b340 in ?? ()
#24 0xbf92d9a0 in ?? ()
#25 0xbed4b0c0 in ?? ()
#26 0x in ?? ()

It looks to me like my code is trying to read the input parameter
(a fairly long string, maybe 2K) from a buffer that was gzip'ed
by Postgres for the trip between the client and server... somewhere
along the way, a mutex gets set, and then ... it's stuck forever.

ps(1) shows that this thread had been running for about 7 hours,
and the job status showed that this function had been
successfully called about 1 million times, before this mutex lock
occurred.


This is not an issue that's been resolved.  Nobody had ever seen this before.  
Tom Lane suggested it might be a libc/c++ bug, but unfortunately in my case 
this lockup occurs so rarely (every few days) that it will be very difficult to 
know if we've fixed the problem.

If gdb(1) reveals that your process is stuck in a mutex, then you might have a 
better chance testing this hypothesis, since your problem happens within 15 
minutes or so.

Did this start recently, perhaps right after a kernel update?

Craig



Re: [PERFORM] Query hanging/not finishing inconsistently

2006-05-22 Thread Meetesh Karia
Hi Craig,

Thanks for your response. This did start recently, and it wasn't after a
kernel update, but it was after we moved the db from Machine B to Machine A
(which have slightly different kernel versions). However, the problem took
about a week to show up after we moved from one machine to the other.
Unfortunately, the problem only reappears after 15 mins once it occurs the
first time. If it occurs again today I'll attach gdb to it and see whether
it's stuck on a mutex.

Meetesh

[PERFORM] slow query using sub select

2006-05-22 Thread Tim Jones
Hi,
  I am having a problem with a sub select query being kinda slow.  The
query is as follows:
 
select batterycode, batterydescription, observationdate from Battery t1
where patientidentifier=611802158 and observationdate = (select
max(observationdate) from Battery t2 where t2.batterycode=t1.batterycode
and patientidentifier=611802158) order by batterydescription.
 
explain analyze:


'Sort  (cost=1697.16..1697.16 rows=1 width=31) (actual time=910.721..910.729 rows=22 loops=1)'
'  Sort Key: batterydescription'
'  ->  Index Scan using ix_battery_patient on battery t1  (cost=0.00..1697.15 rows=1 width=31) (actual time=241.836..910.580 rows=22 loops=1)'
'        Index Cond: (patientidentifier = 611802158)'
'        Filter: (observationdate = (subplan))'
'        SubPlan'
'          ->  Aggregate  (cost=26.25..26.26 rows=1 width=8) (actual time=9.666..9.667 rows=1 loops=94)'
'                ->  Bitmap Heap Scan on battery t2  (cost=22.23..26.25 rows=1 width=8) (actual time=9.606..9.620 rows=7 loops=94)'
'                      Recheck Cond: ((patientidentifier = 611802158) AND ((batterycode)::text = ($0)::text))'
'                      ->  BitmapAnd  (cost=22.23..22.23 rows=1 width=0) (actual time=9.596..9.596 rows=0 loops=94)'
'                            ->  Bitmap Index Scan on ix_battery_patient  (cost=0.00..2.20 rows=58 width=0) (actual time=0.039..0.039 rows=94 loops=94)'
'                                  Index Cond: (patientidentifier = 611802158)'
'                            ->  Bitmap Index Scan on ix_battery_code  (cost=0.00..19.78 rows=2794 width=0) (actual time=9.514..9.514 rows=27323 loops=94)'
'                                  Index Cond: ((batterycode)::text = ($0)::text)'
'Total runtime: 910.897 ms'

Basically I am just trying to display the batterycode with its most
recent date.  Is there a better way to do this query ?

thanks
 
 
Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555



Re: [PERFORM] slow query using sub select

2006-05-22 Thread Tom Lane
Tim Jones [EMAIL PROTECTED] writes:
   I am having a problem with a sub select query being kinda slow.  The
 query is as follows:
 
 select batterycode, batterydescription, observationdate from Battery t1
 where patientidentifier=611802158 and observationdate = (select
 max(observationdate) from Battery t2 where t2.batterycode=t1.batterycode
 and patientidentifier=611802158) order by batterydescription.

Yeah, this is essentially impossible for the planner to optimize,
because it doesn't see any way to de-correlate the subselect, so it does
it over again for every row.  You might find it works better if you cast
the thing as a SELECT DISTINCT ON problem (look at the weather report
example in the SELECT reference page).
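
A sketch of the DISTINCT ON rewrite Tom suggests, reusing the table and
column names from the original post (untested against the real schema):

```sql
-- DISTINCT ON keeps the first row of each batterycode group; the inner
-- ORDER BY makes that first row the one with the newest observationdate.
SELECT batterycode, batterydescription, observationdate
FROM (
    SELECT DISTINCT ON (batterycode)
           batterycode, batterydescription, observationdate
    FROM battery
    WHERE patientidentifier = 611802158
    ORDER BY batterycode, observationdate DESC
) AS latest
ORDER BY batterydescription;
```

The subquery is only there so the final sort can be on batterydescription,
since DISTINCT ON requires its expression to lead the ORDER BY.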

regards, tom lane



Re: [PERFORM] slow query using sub select

2006-05-22 Thread Jonathan Blitz


 -Original Message-
 From: Tim Jones [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, May 23, 2006 12:11 AM
 To: pgsql-performance@postgresql.org
 Subject: [PERFORM] slow query using sub select
 
 Hi,
   I am having a problem with a sub select query being kinda slow.  The
 query is as follows:
 
 select batterycode, batterydescription, observationdate from Battery t1
 where patientidentifier=611802158 and observationdate = (select
 max(observationdate) from Battery t2 where t2.batterycode=t1.batterycode
 and patientidentifier=611802158) order by batterydescription.


How about changing it into a standard join:


select t1.batterycode, t1.batterydescription, t2.observationdate
from battery t1
join (select batterycode, max(observationdate) as observationdate
      from battery
      where patientidentifier = 611802158
      group by batterycode) as t2
  on t1.batterycode = t2.batterycode
 and t1.observationdate = t2.observationdate
where t1.patientidentifier = 611802158
order by t1.batterydescription;

Jonathan Blitz
AnyKey Limited
Israel


