Re: [PERFORM] Delete Cascade FK speed issue

2007-07-03 Thread Michael Fuhr
On Tue, Jul 03, 2007 at 08:05:27AM +0200, Patric de Waha wrote:
 Is there a way I can find out what postgres does, and where it hangs 
 around, so I know where the FK might not be indexed. (The dbase is
 too big to analyze it by hand).

You could query the system catalogs to look for foreign key constraints
that don't have an index on the referencing column(s).  Something like
the following should work for single-column foreign keys:

select n1.nspname,
   c1.relname,
   a1.attname,
   t.conname,
   n2.nspname as fnspname,
   c2.relname as frelname,
   a2.attname as fattname
  from pg_constraint t
  join pg_attribute  a1 on a1.attrelid = t.conrelid and a1.attnum = t.conkey[1]
  join pg_class  c1 on c1.oid = t.conrelid
  join pg_namespace  n1 on n1.oid = c1.relnamespace
  join pg_class  c2 on c2.oid = t.confrelid
  join pg_namespace  n2 on n2.oid = c2.relnamespace
  join pg_attribute  a2 on a2.attrelid = t.confrelid and a2.attnum = t.confkey[1]
 where t.contype = 'f'
   and not exists (
 select 1
   from pg_index i
  where i.indrelid = t.conrelid
and i.indkey[0] = t.conkey[1]
   )
 order by n1.nspname,
  c1.relname,
  a1.attname;
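
If that query turns up a referencing column with no index, the usual fix is
simply to create one. A hypothetical example (the table and column names here
are made up - substitute whatever the query reports):

```sql
-- Index the referencing column so cascaded deletes can find child rows
-- with an index scan instead of one sequential scan per deleted parent.
CREATE INDEX orders_customer_id_idx ON orders (customer_id);
ANALYZE orders;  -- refresh planner statistics for the table
```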

-- 
Michael Fuhr

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] Filesystem Direct I/O and WAL sync option

2007-07-03 Thread Dimitri

All,

I'm very curious to know whether we may expect or guarantee any data
consistency with WAL sync=OFF but using a file system mounted in Direct
I/O mode (meaning every write() system call made by PG really writes
to disk before returning)...

So may we expect data consistency:
  - none?
  - per checkpoint basis?
  - full?...

Thanks a lot for any info!

Rgds,
-Dimitri



Re: [PERFORM] Filesystem Direct I/O and WAL sync option

2007-07-03 Thread Heikki Linnakangas

Dimitri wrote:

I'm very curious to know if we may expect or guarantee any data
consistency with WAL sync=OFF but using file system mounted in Direct
I/O mode (means every write() system call called by PG really writes
to disk before return)...


You'd have to turn that mode on on the data drives as well to get 
consistency, because fsync=off disables checkpoint fsyncs of the data 
files as well.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [PERFORM] Filesystem Direct I/O and WAL sync option

2007-07-03 Thread Dimitri

Yes, the disk drives also have their caches disabled, or the cache sits on
the controller and is battery-protected (in the case of more high-level
storage) - but is that enough to expect data consistency?... (I was
surprised about the checkpoint sync - but doesn't it always call write()
anyway? Because in that case it should work even without fsync)...

On 7/3/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:

Dimitri wrote:
 I'm very curious to know if we may expect or guarantee any data
 consistency with WAL sync=OFF but using file system mounted in Direct
 I/O mode (means every write() system call called by PG really writes
 to disk before return)...

You'd have to turn that mode on on the data drives as well to get
consistency, because fsync=off disables checkpoint fsyncs of the data
files as well.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com





Re: [PERFORM] Delete Cascade FK speed issue

2007-07-03 Thread Mark Lewis
On Tue, 2007-07-03 at 08:05 +0200, Patric de Waha wrote:
 Hi,
 I've dbase with about 80 relations.
 On deleting a user, this cascades through all the tables.
 This is very slow, for 20 users it takes 4 hours, with exclusive 
 access to the dbase.
 No other users connected to the dbase.
 
 Ok I know there will be somewhere a relation with a FK without 
 index, which
 is being scanned sequentially. But how can I find out what postgres 
 is doing
 while it is handling the transaction?
 
 Is there a way I can find out what postgres does, and where it hangs 
 around, so I know
 where the FK might not be indexed. (The dbase is to big to analyze 
 it by hand).
 
 The  way I do it now is to check the pg_locks relation, but this is 
 not very representative.
 
 Is there profiling method for triggers/constraints, or a method 
 which gives me a hint
 why it is taking so long?

In 8.1 and later, an EXPLAIN ANALYZE of the delete will show you the
amount of time spent in each trigger.  Remember that it will still
perform the delete, so if you want to be able to re-run the DELETE over
and over as you add missing indexes, run it in a transaction and
rollback each time.  That will tell you which foreign key constraint
checks are taking up time.  The output will not be nearly as useful if
you don't name your foreign key constraints, but is still better than
nothing.
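
In other words, something like this (the table name and id are invented for
illustration):

```sql
BEGIN;
-- On 8.1+, the EXPLAIN ANALYZE output includes per-trigger timings,
-- so the slow (unindexed) FK constraint checks show up by name.
EXPLAIN ANALYZE DELETE FROM users WHERE id = 42;
ROLLBACK;  -- undo the delete so the test can be re-run after adding indexes
```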

Alternatively, you can just dump the schema to a text file and spend 30
minutes and some text searching to reconstruct your foreign key
dependency graph rooted at the table in question and check each column
for proper indexes.  We recently did this for a 150 relation database,
it's not as painful as you seem to think it is.  An 80 relation database
is by no means too big to analyze :)

-- Mark Lewis



Re: [PERFORM] PostgreSQL 8.0 occasionally slow down

2007-07-03 Thread Ho Fat Tsang

Hi Greg.

2007/6/28, Greg Smith [EMAIL PROTECTED]:


On Thu, 28 Jun 2007, Ho Fat Tsang wrote:

 I have tuned the checkpoint_timeout to 30 seconds, which is ten times less
 than the default, and the issue is still reproduced.


Doing a checkpoint every 30 seconds is crazy; no wonder your system is

pausing so much.  Put the timeout back to the default.  What you should do
here is edit your config file and set checkpoint_warning to its maximum of
3600.  After that, take a look at the log files; you'll then get a warning
message every time a checkpoint happens.  If those line up with when
you're getting the slowdowns, then at least you'll have narrowed the cause
of your problem, and you can get some advice here on how to make the
overhead of checkpoints less painful.
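
In postgresql.conf terms, the change Greg describes would look like this
(3600 is the maximum he suggests; the value is in seconds):

```
checkpoint_warning = 3600    # warn in the log about every checkpoint that occurs
```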

The hint it will give is probably the first thing to try: increase
checkpoint_segments from the default to something much larger (if it's at
3 now, try 10 instead to start), and see if the problem stops happening as
frequently.  Your problem looks exactly like a pause at every checkpoint,
and I'm not sure what Richard was thinking when he suggested having them
more often would improve things.



Yes, thank you for your suggestion. I have found that the slowdown times do
not line up with checkpoints after I turned on the warning. The issue is
related to what Richard mentioned - something outside PG doing many write
operations to pages.

--

* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD




Re: [PERFORM] PostgreSQL 8.0 occasionally slow down

2007-07-03 Thread Ho Fat Tsang

2007/6/29, Richard Huxton [EMAIL PROTECTED]:


Ho Fat Tsang wrote:
 Hi Richard,

I've tested again according to your suggestion. I noticed that each
 time pgsql slows down, there is a short period where a process called
 pdflush
 eats up a lot of I/O. I've googled and know it is a process for writing
 dirty pages back to disk by the Linux kernel. I will investigate this
 process further with my limited knowledge of the Linux kernel.

Well, pdflush is responsible for flushing dirty pages to disk on behalf
of all processes.

If it's doing it every 3 minutes while checkpoints are happening every
30 seconds then I don't see how it's PG that's responsible.

There are three possibilities:
1. PG isn't actually checkpointing every 30 seconds.
2. There is a burst of query activity every 3 minutes that causes a lot
of writing.
3. Some other process is responsible.



Exactly! You are right. I have finally found that the root cause for this
is the application that uses PG. There is a memory leak using MappedByteBuffer
(written in Java); it leads to high I/O load and finally reaches the ratio
at which pdflush is kicked off by the kernel.

Thank you for helping a lot in digging out this issue! Learned much from you
guys!


  Correct me if I am wrong. It seems postgresql 8.0 does not bundle
 auto-vacuum by default, so all vacuum and analyse are done manually? What
 I have tested related to vacuum is running auto-vacuum (an executable
 located in /bin) in parallel under normal production load, but it seems it
 won't help.

Can't remember whether 8.0 had autovacuum bundled and turned off or not
bundled at all. If it's not running it can't be causing this problem
though.

--
   Richard Huxton
   Archonet Ltd



Re: [PERFORM] PostgreSQL 8.0 occasionally slow down

2007-07-03 Thread Ho Fat Tsang

2007/7/3, Greg Smith [EMAIL PROTECTED]:


On Fri, 29 Jun 2007, Ho Fat Tsang wrote:

 I noticed that each time pgsql slows down, there is a short
 period where a process called pdflush eats up a lot of I/O. I've googled
 and know it is a process for writing dirty pages back to the disk by the
 Linux kernel.

The pdflush documentation is really spread out, you may find my paper at
http://www.westnet.com/~gsmith/content/linux-pdflush.htm a good place to
start looking into that.



When I found that the pdflush process was the major clue to the PG slowdown, I
googled and found this article!
It is a really good one for tuning pdflush! Thanks a lot!

--

* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD




[PERFORM] Query is taking 5 HOURS to Complete on 8.1 version

2007-07-03 Thread smiley2211

This query is taking less than 5 minutes on 7.4 but over 5 hours on 8.1...

PostgreSQL 8.1.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.0 (SUSE Linux)
Total runtime: 20448310.101 ms = 5.6800862 hours
(132 rows)

--postgresql.conf:

shared_buffers = 114688 # min 16 or max_connections*2, 8KB
each
#temp_buffers = 2   # min 100, 8KB each
#max_prepared_transactions = 5  # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared
memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 10240# size in KB
maintenance_work_mem = 64384# min 1024, size in KB
max_stack_depth = 4096  # min 100, size in KB

# - Free Space Map -

max_fsm_pages = 50  # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1000# min 100, ~70 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000   # min 25
#preload_libraries = ''

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0  # 0-1000 milliseconds
#vacuum_cost_page_hit = 1   # 0-1 credits
#vacuum_cost_page_miss = 10 # 0-1 credits
#vacuum_cost_page_dirty = 20# 0-1 credits
#vacuum_cost_limit = 200# 0-1 credits

# - Background writer -

#bgwriter_delay = 200   # 10-1 milliseconds between
rounds
#bgwriter_lru_percent = 1.0 # 0-100% of LRU buffers
scanned/round
#bgwriter_lru_maxpages = 5  # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333   # 0-100% of all buffers
scanned/round
#bgwriter_all_maxpages = 5  # 0-1000 buffers max written/round


#---
# WRITE AHEAD LOG
#---

# - Settings -

#fsync = on # turns forced synchronization on or
off
#wal_sync_method = fsync# the default is the first option
# supported by the operating system:
#   open_datasync
#   fdatasync
#   fsync
#   fsync_writethrough
#   open_sync
#full_page_writes = on  # recover from partial page writes
#wal_buffers = 8# min 4, 8KB each
#commit_delay = 0   # range 0-10, in microseconds
#commit_siblings = 5# range 1-1000

# - Checkpoints -

checkpoint_segments = 12# in logfile segments, min 1, 16MB
each
#checkpoint_timeout = 300   # range 30-3600, in seconds
#checkpoint_warning = 30# in seconds, 0 is off

# - Archiving -

#archive_command = ''   # command to use to archive a
logfile
# segment


#---
# QUERY TUNING
#---

# - Planner Method Configuration -

enable_bitmapscan = off
enable_hashagg = on
enable_hashjoin = on
enable_indexscan = on
enable_mergejoin = on
enable_nestloop = on
enable_seqscan = off
enable_sort = on
enable_tidscan = on

# - Planner Cost Constants -

effective_cache_size = 80000# typically 8KB each
random_page_cost = 4# units are one sequential page
fetch
# cost
#cpu_tuple_cost = 0.01  # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025 # (same)
#---
# LOCK MANAGEMENT
#---

#deadlock_timeout = 1000# in milliseconds
#max_locks_per_transaction = 64 # min 10
# note: each lock table slot uses ~220 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.

-- 
View this message in context: 
http://www.nabble.com/Query-is-taking-5-HOURS-to-Complete-on-8.1-version-tf4019778.html#a11416966
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.




Re: [PERFORM] Query is taking 5 HOURS to Complete on 8.1 version

2007-07-03 Thread Bill Moran
In response to smiley2211 [EMAIL PROTECTED]:
 
 This query is taking less than 5 minutes on 7.4 but over 5 hours on 8.1...
 
 PostgreSQL 8.1.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
 4.1.0 (SUSE Linux)
 Total runtime: 20448310.101 ms = 5.6800862 hour
 (132 rows)

When was the last time you vacuum analyzed the database?

Also, you don't even provide the query.  I can't imagine how you'd expect
anyone to help you.  If vacuum analyze doesn't fix the problem, please
provide the query, explain output of the query, and the schema of any
tables involved, including information on indexes.

 
 --postgresql.conf:
 
 [snip: postgresql.conf settings quoted in full above]

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

[EMAIL PROTECTED]
Phone: 412-422-3463x4023


Re: [PERFORM] Query is taking 5 HOURS to Complete on 8.1 version

2007-07-03 Thread smiley2211

Here is the EXPLAIN after I changed some settings in the conf file - now I am
running another EXPLAIN ANALYZE, which may take 5 or more hours to complete :,(

effective_cache_size = 170000
enable_seqscan = on
enable_bitmapscan = on


 QUERY PLAN
-------------------------------------------------------------------------------
 Limit  (cost=27674.12..27674.21 rows=1 width=8)
   ->  Subquery Scan people_consent  (cost=27674.12..27978.41 rows=3121 width=8)
         ->  Unique  (cost=27674.12..27947.20 rows=3121 width=816)
               ->  Sort  (cost=27674.12..27681.92 rows=3121 width=816)
                     Sort Key: id, firstname, lastname, homephone, workphone, altphone, eligibilityzipcode, address1, address2, city, state, zipcode1, zipcode2, email, dayofbirth, monthofbirth, yearofbirth, ethnic_detail, external_id, highestlevelofeducation_id, ethnicgroup_id, ethnicotherrace, entered_at, entered_by, besttimetoreach_id, language_id, otherlanguage, gender_id, hispaniclatino_id, canscheduleapt_id, mayweleaveamessage_id, ethnictribe, ethnicasian, ethnicislander
                     ->  Append  (cost=13595.19..27492.98 rows=3121 width=816)
                           ->  Nested Loop  (cost=13595.19..13602.61 rows=2 width=816)
                                 ->  Unique  (cost=13595.19..13595.20 rows=2 width=8)
                                       ->  Sort  (cost=13595.19..13595.19 rows=2 width=8)
                                             Sort Key: temp_consent2.id
                                             ->  Unique  (cost=13595.14..13595.16 rows=2 width=16)
                                                   ->  Sort  (cost=13595.14..13595.15 rows=2 width=16)
                                                         Sort Key: temp_consent.daterecorded, temp_consent.id
                                                         ->  Subquery Scan temp_consent  (cost=13595.09..13595.13 rows=2 width=16)
                                                               ->  Unique  (cost=13595.09..13595.11 rows=2 width=36)
                                                                     ->  Sort  (cost=13595.09..13595.10 rows=2 width=36)
                                                                           Sort Key: id, daterecorded, answer
                                                                           ->  Append  (cost=13506.81..13595.08 rows=2 width=36)
                                                                                 ->  HashAggregate  (cost=13506.81..13506.83 rows=1 width=36)
                                                                                       ->  Nested Loop  (cost=58.47..13506.81 rows=1 width=36)
                                                                                             ->  Nested Loop  (cost=58.47..13503.10 rows=1 width=36)
                                                                                                   ->  Nested Loop  (cost=58.47..13499.67 rows=1 width=24)
                                                                                                         ->  Nested Loop  (cost=58.47..13496.64 rows=1 width=24)
                                                                                                               Join Filter: ("inner".question_answer_id = "outer".id)
                                                                                                               ->  Nested Loop  (cost=58.47..78.41 rows=1 width=28)
                                                                                                                     ->  Index Scan using answers_answer_un on answers a  (cost=0.00..4.01 rows=1 width=28)
                                                                                                                           Index Cond: ((answer)::text = 'Yes'::text)
                                                                                                                     ->  Bitmap Heap Scan on questions_answers qa  (cost=58.47..74.30 rows=8 width=16)
                                                                                                                           Recheck Cond: ((qa.answer_id = "outer".id) AND (((qa.question_tag)::text = 'consentTransfer'::text) OR ((qa.question_tag)::text = 'shareWithEval'::text)))

Re: [PERFORM] Query is taking 5 HOURS to Complete on 8.1 version

2007-07-03 Thread Richard Huxton

smiley2211 wrote:

Here is the EXPLAIN after I changed some conf file - now I am running another
EXPLAIN ANALYZE which may take 5 or more hours to complete :,(

effective_cache_size = 170000


Why has effective_cache_size changed from 80,000 to 170,000 - have you 
stopped running some other application?



enable_seqscan = on
enable _bitmapscan = on


Why were these disabled before? What were you trying to achieve? What 
has now changed?


 QUERY PLAN  


You still haven't supplied the query. However, looking at the explain 
I'd guess there's a lot of sorting going on? You might want to increase 
work_mem just for this query:


SET work_mem = ...;
SELECT ...

However, that's just a blind guess because you haven't supplied the 
absolutely vital information:

1. The query
2. An idea of how many rows are in the relevant tables
3. The "I have vacuumed and analysed recently" disclaimer
4. The explain analyse (which you are running - good; make sure you save 
a copy of it somewhere).


Even then it'll be difficult to get a quick answer because it looks like 
a large query. So - you can speed things along by looking for oddities 
yourself.


The explain analyse will have two values for rows on each line, the 
predicted and the actual - look for where they are wildly different. If 
the planner is expecting 2 matches and seeing 2000 it might make the 
wrong choice. You can usually cut down the large query to test just this 
section. Then you might want to read up about ALTER TABLE ... SET 
STATISTICS - that might give the planner more to work with.
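
Such a statistics bump might look like the following (the table and column are
just taken from the posted plan for illustration - the right target is
whichever column's row estimate is badly off):

```sql
-- Raise the statistics target for a skewed column, then re-analyze
-- so the planner sees a larger sample of its value distribution.
ALTER TABLE questions_answers ALTER COLUMN question_tag SET STATISTICS 100;
ANALYZE questions_answers;
```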


The other thing to look for is the time. The explain analyse has two 
figures for actual time. These are startup and total time for that 
node (if loops is > 1 then multiply the time by the number of loop 
iterations). It might be that there are one or two nodes taking a 
long time, and we can then find out why.


HTH

--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Query is taking 5 HOURS to Complete on 8.1 version

2007-07-03 Thread smiley2211

Here are the VIEWS in question: query = (explain analyze select id from
people_consent LIMIT 1;)

CREATE OR REPLACE VIEW temp_consent AS 
 SELECT p.id, max(en.enrolled_at) AS daterecorded, a.answer
   FROM people p, enrollments en, encounters ec,
encounters_questions_answers eqa, questions_answers qa, answers a
  WHERE (qa.question_tag::text = 'consentTransfer'::text OR
qa.question_tag::text = 'shareWithEval'::text) AND eqa.question_answer_id =
qa.id AND ec.id = eqa.encounter_id AND ec.enrollment_id = en.id AND p.id =
en.person_id AND qa.answer_id = a.id
  GROUP BY p.id, a.answer
UNION 
 SELECT p.id, max(c.entered_at) AS daterecorded, a.answer
   FROM people p, ctccalls c, ctccalls_questions_answers cqa,
questions_answers qa, answers a
  WHERE (qa.question_tag::text = 'consentTransfer'::text OR
qa.question_tag::text = 'shareWithEval'::text) AND cqa.question_answer_id =
qa.id AND c.id = cqa.call_id AND p.id = c.person_id AND qa.answer_id = a.id
  GROUP BY p.id, a.answer;


CREATE OR REPLACE VIEW temp_consent2 AS 
 SELECT DISTINCT temp_consent.id, temp_consent.daterecorded
   FROM temp_consent
  WHERE temp_consent.answer::text = 'Yes'::text
  ORDER BY temp_consent.daterecorded DESC, temp_consent.id;


CREATE OR REPLACE VIEW people_consent AS 
 SELECT people.id, people.firstname, people.lastname, people.homephone,
people.workphone, people.altphone, people.eligibilityzipcode,
people.address1, people.address2, people.city, people.state,
people.zipcode1, people.zipcode2, people.email, people.dayofbirth,
people.monthofbirth, people.yearofbirth, people.ethnic_detail,
people.external_id, people.highestlevelofeducation_id,
people.ethnicgroup_id, people.ethnicotherrace, people.entered_at,
people.entered_by, people.besttimetoreach_id, people.language_id,
people.otherlanguage, people.gender_id, people.hispaniclatino_id,
people.canscheduleapt_id, people.mayweleaveamessage_id, people.ethnictribe,
people.ethnicasian, people.ethnicislander
   FROM people
  WHERE (people.id IN ( SELECT temp_consent2.id
   FROM temp_consent2))
UNION 
 SELECT people.id, '***MASKED***' AS firstname, '***MASKED***' AS lastname,
'***MASKED***' AS homephone, '***MASKED***' AS workphone, '***MASKED***' AS
altphone, '***MASKED***' AS eligibilityzipcode, '***MASKED***' AS address1,
'***MASKED***' AS address2, '***MASKED***' AS city, '***MASKED***' AS state,
'***MASKED***' AS zipcode1, '***MASKED***' AS zipcode2, people.email,
'***MASKED***' AS dayofbirth, '***MASKED***' AS monthofbirth, '***MASKED***'
AS yearofbirth, people.ethnic_detail, people.external_id,
people.highestlevelofeducation_id, people.ethnicgroup_id,
people.ethnicotherrace, people.entered_at, people.entered_by,
people.besttimetoreach_id, people.language_id, people.otherlanguage,
people.gender_id, people.hispaniclatino_id, people.canscheduleapt_id,
people.mayweleaveamessage_id, people.ethnictribe, people.ethnicasian,
people.ethnicislander
   FROM people
  WHERE NOT (people.id IN ( SELECT temp_consent2.id
   FROM temp_consent2));

-- 
View this message in context: 
http://www.nabble.com/Query-is-taking-5-HOURS-to-Complete-on-8.1-version-tf4019778.html#a11418991
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.




Re: [PERFORM] Query is taking 5 HOURS to Complete on 8.1 version

2007-07-03 Thread Chris Hoover

On 7/3/07, smiley2211 [EMAIL PROTECTED] wrote:


CREATE OR REPLACE VIEW temp_consent2 AS
SELECT DISTINCT temp_consent.id, temp_consent.daterecorded
   FROM temp_consent
  WHERE temp_consent.answer::text = 'Yes'::text
  ORDER BY temp_consent.daterecorded DESC, temp_consent.id;




Get rid of the ORDER BY on this view.  It is a waste of resources.  If you
need it ordered elsewhere, order it on the fly, i.e. select * from
temp_consent2 order by ...

CREATE OR REPLACE VIEW people_consent AS

SELECT people.id, people.firstname, people.lastname, people.homephone,
people.workphone, people.altphone, people.eligibilityzipcode,
people.address1, people.address2, people.city, people.state,
people.zipcode1, people.zipcode2, people.email, people.dayofbirth,
people.monthofbirth, people.yearofbirth, people.ethnic_detail,
people.external_id, people.highestlevelofeducation_id,
people.ethnicgroup_id, people.ethnicotherrace, people.entered_at,
people.entered_by, people.besttimetoreach_id, people.language_id,
people.otherlanguage, people.gender_id, people.hispaniclatino_id,
people.canscheduleapt_id, people.mayweleaveamessage_id, people.ethnictribe
,
people.ethnicasian, people.ethnicislander
   FROM people
  WHERE (people.id IN ( SELECT temp_consent2.id
   FROM temp_consent2))
UNION
SELECT people.id, '***MASKED***' AS firstname, '***MASKED***' AS lastname,
'***MASKED***' AS homephone, '***MASKED***' AS workphone, '***MASKED***'
AS
altphone, '***MASKED***' AS eligibilityzipcode, '***MASKED***' AS
address1,
'***MASKED***' AS address2, '***MASKED***' AS city, '***MASKED***' AS
state,
'***MASKED***' AS zipcode1, '***MASKED***' AS zipcode2, people.email,
'***MASKED***' AS dayofbirth, '***MASKED***' AS monthofbirth,
'***MASKED***'
AS yearofbirth, people.ethnic_detail, people.external_id,
people.highestlevelofeducation_id, people.ethnicgroup_id,
people.ethnicotherrace, people.entered_at, people.entered_by,
people.besttimetoreach_id, people.language_id, people.otherlanguage,
people.gender_id, people.hispaniclatino_id, people.canscheduleapt_id,
people.mayweleaveamessage_id, people.ethnictribe, people.ethnicasian,
people.ethnicislander
   FROM people
  WHERE NOT (people.id IN ( SELECT temp_consent2.id
   FROM temp_consent2));




Try linking the people and temp_consent2 like this
where people.id not in (select temp_consent2.id from temp_consent2 where
temp_consent2.id = people.id)

That will help a lot.

HTH,

Chris


Re: [PERFORM] Query is taking 5 HOURS to Complete on 8.1 version

2007-07-03 Thread smiley2211

TOP shows CPU at 100% while executing the EXPLAIN ANALYZE...what does this
mean?

17519 postgres  25   0 3470m  43m  39m R  100  0.3  28:50.53 postmaster 
-- 
View this message in context: 
http://www.nabble.com/Query-is-taking-5-HOURS-to-Complete-on-8.1-version-tf4019778.html#a11419885
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.




Re: [PERFORM] Query is taking 5 HOURS to Complete on 8.1 version

2007-07-03 Thread Richard Huxton

smiley2211 wrote:

TOP shows CPU at 100% while executing the EXPLAIN ANALYZE...what does this
mean?

17519 postgres  25   0 3470m  43m  39m R  100  0.3  28:50.53 postmaster 


It means it's busy. Probably sorting/eliminating duplicates (see my 
answer posted just before this one).


Keep an eye on vmstat too and see if there's much disk activity.

--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Query is taking 5 HOURS to Complete on 8.1 version

2007-07-03 Thread Richard Huxton

smiley2211 wrote:

Here are the VIEWS in question: query = (explain analyze select id from
people_consent LIMIT 1;)


First thing I notice - you don't have any ordering, so the LIMIT isn't 
returning a well-defined record. Might not matter in your particular 
context.


CREATE OR REPLACE VIEW temp_consent AS 
 SELECT p.id, max(en.enrolled_at) AS daterecorded, a.answer

   FROM people p, enrollments en, encounters ec,
encounters_questions_answers eqa, questions_answers qa, answers a
  WHERE (qa.question_tag::text = 'consentTransfer'::text OR
qa.question_tag::text = 'shareWithEval'::text) AND eqa.question_answer_id =
qa.id AND ec.id = eqa.encounter_id AND ec.enrollment_id = en.id AND p.id =
en.person_id AND qa.answer_id = a.id
  GROUP BY p.id, a.answer
UNION 


I think you might be able to make this UNION ALL - a UNION will check 
for duplicates and eliminate them. That's a match on 
(id,daterecorded,answer) from both sub-queries - can that happen and do 
you care?



 SELECT p.id, max(c.entered_at) AS daterecorded, a.answer
   FROM people p, ctccalls c, ctccalls_questions_answers cqa,
questions_answers qa, answers a
  WHERE (qa.question_tag::text = 'consentTransfer'::text OR
qa.question_tag::text = 'shareWithEval'::text) AND cqa.question_answer_id =
qa.id AND c.id = cqa.call_id AND p.id = c.person_id AND qa.answer_id = a.id
  GROUP BY p.id, a.answer;


CREATE OR REPLACE VIEW temp_consent2 AS 
 SELECT DISTINCT temp_consent.id, temp_consent.daterecorded

   FROM temp_consent
  WHERE temp_consent.answer::text = 'Yes'::text
  ORDER BY temp_consent.daterecorded DESC, temp_consent.id;


Not sure what the DISTINCT is doing for us here. You've eliminated 
duplicates in the previous view and so you can't have more than one 
(id,daterecorded) for any given answer. (Assuming you keep the previous 
UNION in)


CREATE OR REPLACE VIEW people_consent AS 
 SELECT people.id, people.firstname, people.lastname, people.homephone,

people.workphone, people.altphone, people.eligibilityzipcode,
people.address1, people.address2, people.city, people.state,
people.zipcode1, people.zipcode2, people.email, people.dayofbirth,
people.monthofbirth, people.yearofbirth, people.ethnic_detail,
people.external_id, people.highestlevelofeducation_id,
people.ethnicgroup_id, people.ethnicotherrace, people.entered_at,
people.entered_by, people.besttimetoreach_id, people.language_id,
people.otherlanguage, people.gender_id, people.hispaniclatino_id,
people.canscheduleapt_id, people.mayweleaveamessage_id, people.ethnictribe,
people.ethnicasian, people.ethnicislander
   FROM people
  WHERE (people.id IN ( SELECT temp_consent2.id
   FROM temp_consent2))
UNION 
 SELECT people.id, '***MASKED***' AS firstname, '***MASKED***' AS lastname,

'***MASKED***' AS homephone, '***MASKED***' AS workphone, '***MASKED***' AS
altphone, '***MASKED***' AS eligibilityzipcode, '***MASKED***' AS address1,
'***MASKED***' AS address2, '***MASKED***' AS city, '***MASKED***' AS state,
'***MASKED***' AS zipcode1, '***MASKED***' AS zipcode2, people.email,
'***MASKED***' AS dayofbirth, '***MASKED***' AS monthofbirth, '***MASKED***'
AS yearofbirth, people.ethnic_detail, people.external_id,
people.highestlevelofeducation_id, people.ethnicgroup_id,
people.ethnicotherrace, people.entered_at, people.entered_by,
people.besttimetoreach_id, people.language_id, people.otherlanguage,
people.gender_id, people.hispaniclatino_id, people.canscheduleapt_id,
people.mayweleaveamessage_id, people.ethnictribe, people.ethnicasian,
people.ethnicislander
   FROM people
  WHERE NOT (people.id IN ( SELECT temp_consent2.id
   FROM temp_consent2));


OK, well the UNION here can certainly be UNION ALL.
1. You're using ***MASKED*** for a bunch of fields, so unless they're 
occurring naturally in people you won't get duplicates.

2. Your WHERE clauses are the complement of each other.

One other point: NOT (people.id IN (...)) would perhaps more usually be 
written as people.id NOT IN (...). The planner should realise they're the 
same though.


However, there's one obvious thing you can do. As it stands you're 
testing against temp_consent2 twice. You could rewrite the query 
something like:


SELECT
  people.id,
  CASE WHEN temp_consent2.id IS NULL
THEN '***MASKED***'
ELSE people.firstname
  END AS firstname
  ...
FROM
  people LEFT JOIN temp_consent2 ON people.id=temp_consent2.id
;

You might want to try these tweaks, but I'd start by working with 
temp_consent and seeing how long that takes to execute. Then work outwards.


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Filesystem Direct I/O and WAL sync option

2007-07-03 Thread Gregory Stark

Dimitri [EMAIL PROTECTED] writes:

 Yes, disk drives are also having cache disabled or having cache on
 controllers and battery protected (in case of more  high-level
 storage) - but is it enough to expect data consistency?... (I was
 surprised about checkpoint sync, but does it always calls write()
 anyway? because in this way it should work without fsync)...

Well if everything is mounted in sync mode then I suppose you have the same
guarantee as if fsync were called after every single write. If that's true
then surely that's at least as good. I'm curious how it performs though.

Actually it seems like in that configuration fsync should be basically
zero-cost. In other words, you should be able to leave fsync=on and get the
same performance (whatever that is) and not have to worry about any risks.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

