Re: [GENERAL] Help with slow table update

2015-04-19 Thread Tim Uckun
On Sat, Apr 18, 2015 at 10:24 AM, Pawel Veselov pawel.vese...@gmail.com
wrote:

 I found some dangling prepared transactions



How do you find and remove these?


Re: [GENERAL] Help with slow table update

2015-04-19 Thread Jim Nasby

On 4/19/15 9:53 PM, Tim Uckun wrote:


On Sat, Apr 18, 2015 at 10:24 AM, Pawel Veselov pawel.vese...@gmail.com wrote:

I found some dangling prepared transactions



How do you find and remove these?


SELECT * FROM pg_prepared_xacts;
ROLLBACK PREPARED xid;
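
For example, a rough sketch (the one-day interval and the gid literal below
are placeholders): list the prepared transactions with their age, then roll
back a specific one by its gid string.

  SELECT gid, prepared, owner, database
    FROM pg_prepared_xacts
   WHERE prepared < now() - interval '1 day';

  ROLLBACK PREPARED 'some_gid';  -- substitute a gid value from the query above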
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Help with slow table update

2015-04-17 Thread Pawel Veselov

 [skipped]


  But remember that if you update or delete a row, removing it from an
 index, the data will stay in that index until vacuum comes along.

 Also, there's no point in doing a REINDEX after a VACUUM FULL;
 vacuum full rebuilds all the indexes for you.


 I was being desperate :)

 I still think there is something very wrong with this particular table.
 First, I have production systems that employ this function on a way larger
 data set, and there is no problem (so far, but still). This machine is
 part of a test deployment; there is no constant load, and the only data
 being written now is from these tests. Vacuuming should prune all that
 dead stuff, and if it's absent, it's unclear where the time is spent
 navigating/updating a table with 24 rows :)


 I think you definitely have a problem with dead rows, as evidenced by the
 huge improvement VACUUM FULL made.


 But it's not clear why it wouldn't improve past the current point (and that
 doesn't seem reasonable, IMHO).


What I should've done is 'VACUUM FULL VERBOSE'. Once I did, it told me
there were 800k dead rows that couldn't be removed. After digging around I
found some dangling prepared transactions going back months. Once I threw
those away and re-vacuumed, things got back to normal.

Thanks for all your help and advice.


Re: [GENERAL] Help with slow table update

2015-04-15 Thread Igor Neyman


From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Pawel Veselov
Sent: Tuesday, April 14, 2015 8:01 PM
To: Jim Nasby
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Help with slow table update

[skipped]

This is where using sets becomes really tedious, as Postgres severely lacks an 
upsert-like statement.
I don't think joins are allowed in an UPDATE statement, so I will need to
use a WITH query, right?
Also, I'm not sure how LEFT JOIN will help me isolate and insert missed 
entries...

Would it be OK to replace the upsert part with merging into a temp table, then
deleting and inserting from the temp table? Is there any penalty for
insert/delete compared to update?

[skipped]

Yes, you can do UPDATE with joins 
(http://www.postgresql.org/docs/9.4/static/sql-update.html) like this:

UPDATE table1 A SET col1 = B.col2
  FROM table2 B
  WHERE A.col3 = B.col4;

Regards,
Igor Neyman


Re: [GENERAL] Help with slow table update

2015-04-15 Thread Pawel Veselov

 [skipped]



 This is where using sets becomes really tedious, as Postgres severely
 lacks an upsert-like statement.

 I don't think joins are allowed in an UPDATE statement, so I will need
 to use a WITH query, right?

 Also, I'm not sure how LEFT JOIN will help me isolate and insert missed
 entries...



   [skipped]



 Yes, you can do UPDATE with joins (
 http://www.postgresql.org/docs/9.4/static/sql-update.html) like this:



 UPDATE table1 A SET col1 = B.col2

   FROM table2 B

   WHERE A.col3 = B.col4;



I meant using the JOIN operator in the UPDATE itself. But it's still possible,
though through a WITH query.
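
Something along these lines, I suppose (a rough sketch reusing the made-up
table1/table2 names from the example above, plus a hypothetical table3 just
to show the JOIN; not my actual query):

  WITH j AS (
      SELECT b.col4 AS join_key, b.col2 AS new_val
        FROM table2 b
        JOIN table3 c ON c.id = b.ref_id   -- the JOIN lives inside the CTE
  )
  UPDATE table1 a
     SET col1 = j.new_val
    FROM j
   WHERE a.col3 = j.join_key;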


Re: [GENERAL] Help with slow table update

2015-04-14 Thread Pawel Veselov
On Mon, Apr 13, 2015 at 6:03 PM, David G. Johnston 
david.g.johns...@gmail.com wrote:

 On Mon, Apr 13, 2015 at 5:01 PM, Pawel Veselov pawel.vese...@gmail.com
 wrote:


 r_agrio_hourly - good, r_agrio_total - bad.

  Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329) (actual
 time=2.248..2.248 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
  (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207 rows=1
 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid =
 1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND
 (device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 2.281 ms
  Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321)
 (actual time=106.766..106.766 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total
  (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626
 rows=1 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid =
 1002::numeric) AND (device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 106.793 ms


 What is it you expect to see here?

 ​What are the results (count and times) for:

 SELECT count(*) FROM r_agrio_total WHERE tagid = 1002 and unitid = 1002;


Result: 8 (the whole table is 24 rows). It returns with a bit of a stumble,
but relatively quickly.
db=> explain analyze SELECT count(*) FROM r_agrio_total WHERE tagid = 1002
and unitid = 1002;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=4.45..4.46 rows=1 width=0) (actual time=327.194..327.195
rows=1 loops=1)
   ->  Index Scan using tag_r_agrio_total on r_agrio_total
 (cost=0.42..4.45 rows=1 width=0) (actual time=0.039..327.189 rows=8
loops=1)
         Index Cond: (tagid = 1002::numeric)
         Filter: (unitid = 1002::numeric)
 Total runtime: 327.228 ms


 SELECT count(*) FROM r_agrio_hourly WHERE tagid = 1002 and unitid = 1002;


Result is 2869. It returns somewhat quickly. Explain analyze is crazy though:
db=> explain analyze SELECT count(*) FROM r_agrio_hourly WHERE tagid = 1002
and unitid = 1002;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=68134.68..68134.69 rows=1 width=0) (actual
time=15177.211..15177.211 rows=1 loops=1)
   ->  Index Scan using adunit_r_agrio_hourly on r_agrio_hourly
 (cost=0.42..67027.10 rows=443035 width=0) (actual time=0.096..15175.730
rows=2869 loops=1)
         Index Cond: (unitid = 1002::numeric)
         Filter: (tagid = 1002::numeric)
 Total runtime: 15177.240 ms

​More queries along this line might be needed.  The underlying question is
 how many index rows need to be skipped over on total to get the final
 result - or rather are the columns in the index in descending order of
 cardinality?


The idea is: both tables have unique multi-field indices, each update hits
exactly one row from that index, no more, and all fields from the index are
constrained by an equality condition in the update. All of the updates (within
a transaction) would always work on a small subset of rows (max a few
hundred, ever; in this case, it's maybe around 10). I expect it to be
possible for the server to keep the active working set in the cache at all
times. Since the index is unique, there shouldn't be a reason to re-scan
the table if a cached row is found, no?


 Any chance you can perform a REINDEX - maybe there is some bloat
 present?  There are queries to help discern if that may be the case; I do
 not know them off the top of my head, but just doing it might be acceptable
 and is definitely quicker if so.


That's the thing - I've done both vacuum full, and re-index. The very first
time I did vacuum full things improved (60 seconds to 7 seconds). Re-index
didn't improve anything (but it was done after vacuum full).


 ​I'm still not really following your presentation but maybe my thoughts
 will spark something.​


Thank you! I hope I clarified this some :)


Re: [GENERAL] Help with slow table update

2015-04-14 Thread Pawel Veselov
On Mon, Apr 13, 2015 at 7:37 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 4/13/15 7:01 PM, Pawel Veselov wrote:

 Cursors tend to make things slow. Avoid them if you can.


 Is there an alternative to iterating over a number of rows, where a
 direct update query is not an option?

 I really doubt that the actual processing logic, including the use of
 types, has anything to do with my problem. This is based on the fact that,
 out of the tables that are being changed, only one is exhibiting the
 problem. All of the involved tables have nearly the same structure, and
 have the same logical operations performed on them. I thought maybe the
 bad table is slow because it was first in the list, and Postgres was
 caching the function's results, but I moved things around, and the pattern
 is the same.


 I'm guessing that you're essentially processing a queue. Take a look at
 http://www.postgresql.org/message-id/552c750f.2010...@bluetreble.com for
 some ideas. Basically, not only do cursors have non-trivial overhead, doing
 a ton of single-row queries is going to have a non-trivial overhead itself.


Thank you for the pointers. PgQ sounds interesting; it has to be remote for
RDS (I use RDS), but I'll try implementing a solution based on it.
However, for all the time that is being spent during this update, the
breakdown is:

update total table: 10.773033
update hourly table: 00.179711
update daily table: 01.082467
update some other table (actually, it has cardinality similar to total
table): 00.168287
clean the queue table: 00.021733
overhead: 00.014922

The overhead is time taken to run the whole procedure, minus all these
other times that have been counted.

(some notes about the daily table below)


  As for your specific question, I suggest you modify the plpgsql
 function so that it's doing an EXPLAIN ANALYZE on the slow table.
 EXPLAIN ANALYZE actually returns a recordset the same way a SELECT
 would, with a single column of type text. So you just need to do
 something with that output. The easiest thing would be to replace
 this in your function:

 UPDATE slow_table SET ...

 to this (untested)

 RETURN QUERY EXPLAIN ANALYZE UPDATE slow_table SET ...

 and change the function so it returns SETOF text instead of whatever
 it returns now.


 Thank you, that made it a lot easier to see into what's really going on.
 But the outcome is somewhat the same. The bad table analysis shows a
 very high cost, and thousands of rows, where the table contains only 24
 rows. This time, however, the actual run time is shown, and one can see
 where the time is spent (I was using just a sum of clock_time()s around
 the update statements to see where the problem is).

 r_agrio_hourly - good, r_agrio_total - bad.

   Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329)
 (actual time=2.248..2.248 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
   (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207
 rows=1 loops=1)
   Index Cond: ((tagid = 1002::numeric) AND (unitid =
 1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND
 (device_type = 3::numeric) AND (placement = 2::numeric))
   Total runtime: 2.281 ms
   Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321)
 (actual time=106.766..106.766 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total
   (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626
 rows=1 loops=1)
   Index Cond: ((tagid = 1002::numeric) AND (unitid =
 1002::numeric) AND (device_type = 3::numeric) AND (placement =
 2::numeric))
   Total runtime: 106.793 ms


 Keep in mind that the estimated cost is not terribly useful; it's the
 actual times that matter.

 I suspect what's happening here is a combination of things. First, the
 hourly table is basically living in cache, but the total table is not. That
 means that when you go to find a row in the total table you're actually
 hitting the disk instead of pulling the data from memory.



 Second, you may have a lot of dead rows in the total table. I suspect this
 because of the very large amount of time the index scan is taking. Unless
 you're running on an old 10MB MFM drive you'd be pretty hard pressed for
 even 2 IO operations (one for the index leaf page and one for the heap
 page) to take 32ms. I suspect the index scan is having to read many dead
 rows in before it finds a live one, and incurring multiple IOs. Switching to
 EXPLAIN (analyze, buffers) would help confirm that.


That looks most likely to me as well. Most of the updates in a single
batch for the total table would be on the same record, while for the hourly
table it's a lot fewer. Logically, the tables contain identical data, except
that the hourly table breaks it down per hour, and the total table contains
the data for all time. The daily table contains the same data per day.

So, if I compared the tables, the total table has the 

Re: [GENERAL] Help with slow table update

2015-04-14 Thread Pawel Veselov
On Tue, Apr 14, 2015 at 3:29 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 4/14/15 4:44 PM, Pawel Veselov wrote:

 On Tue, Apr 14, 2015 at 1:15 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 4/14/15 1:28 PM, Pawel Veselov wrote:


 I wonder if what I need to do, considering that I update a lot
 of the
 same rows as I process this queue, is to create a temp table,
 update
 the rows there, and then update the actual tables once at the
 end...


 That's what I'd do.


 Well, in short, I changed (repeat the body of the loop for however many
 tables there are)

 LOOP (item)
UPDATE table with item
IF not found INSERT item INTO table; END IF;
 END LOOP;

 to:

 CREATE TEMP TABLE xq_table (like table) on commit drop;
 LOOP (item)
LOOP
  UPDATE xq_table with item;
  exit when found;
  INSERT INTO xq_table select * from table for update;
  continue when found;
  INSERT item INTO xq_table;
  exit;
END LOOP;
 END LOOP;
 UPDATE table a set (rows) = (xq.rows)
FROM xq_table xq
WHERE (a.keys) = (xq.keys)

 That works significantly faster. The final update statement is very
 fast. The process is somewhat slow in the beginning as it sucks in
 records from total into xq_total, but once all of that is moved into
 the temp table, it rushes through the rest.


 Databases like to think in sets. It will generally be more efficient to do
 set operations instead of a bunch of row-by-row stuff.

 Since you're pulling all of this from some other table your best bet is
 probably something like:

 CREATE TEMP TABLE raw AS DELETE FROM queue WHERE ... RETURNING *;

 CREATE TEMP VIEW hourly_v AS SELECT ... FROM raw GROUP BY;
 UPDATE ar_hourly SET ... FROM hourly_v JOIN ...;
 INSERT INTO ar_hourly SELECT FROM hourly_v LEFT JOIN ar_hourly ON ...;

 -- Same thing for daily
 -- Same thing for total


In my previous post, there was a problem with that pseudo-code: it's
missing the inserts into the final table at the end of the loop, for those
records that need to be inserted rather than updated.

This is where using sets becomes really tedious, as Postgres severely lacks
an upsert-like statement.
I don't think joins are allowed in an UPDATE statement, so I will need
to use a WITH query, right?
Also, I'm not sure how LEFT JOIN will help me isolate and insert missed
entries...

Would it be OK to replace the upsert part with merging into a temp table,
then deleting and inserting from the temp table? Is there any penalty for
insert/delete compared to update?
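
If I understand the LEFT JOIN suggestion, the shape would be roughly this
(a sketch with placeholder names ar_total and agg_v, not the real tables):
update the rows that already exist, then insert the ones the LEFT JOIN shows
have no match.

  UPDATE ar_total t
     SET unserved = t.unserved + v.unserved
    FROM agg_v v
   WHERE t.tagid = v.tagid AND t.unitid = v.unitid;

  INSERT INTO ar_total (tagid, unitid, unserved)
  SELECT v.tagid, v.unitid, v.unserved
    FROM agg_v v
    LEFT JOIN ar_total t
           ON t.tagid = v.tagid AND t.unitid = v.unitid
   WHERE t.tagid IS NULL;  -- no match means the row wasn't updated, so insert it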

[skipped]


  But remember that if you update or delete a row, removing it from an
 index, the data will stay in that index until vacuum comes along.

 Also, there's no point in doing a REINDEX after a VACUUM FULL;
 vacuum full rebuilds all the indexes for you.


 I was being desperate :)

 I still think there is something very wrong with this particular table.
 First, I have production systems that employ this function on a way larger
 data set, and there is no problem (so far, but still). This machine is
 part of a test deployment; there is no constant load, and the only data
 being written now is from these tests. Vacuuming should prune all that
 dead stuff, and if it's absent, it's unclear where the time is spent
 navigating/updating a table with 24 rows :)


 I think you definitely have a problem with dead rows, as evidenced by the
 huge improvement VACUUM FULL made.


But it's not clear why it wouldn't improve past the current point (and that
doesn't seem reasonable, IMHO).


Re: [GENERAL] Help with slow table update

2015-04-14 Thread Pawel Veselov
On Tue, Apr 14, 2015 at 1:15 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 4/14/15 1:28 PM, Pawel Veselov wrote:


 I wonder if what I need to do, considering that I update a lot of the
 same rows as I process this queue, is to create a temp table, update
 the rows there, and then update the actual tables once at the end...


 That's what I'd do.


Well, in short, I changed (repeat the body of the loop for however many
tables there are)

LOOP (item)
  UPDATE table with item
  IF not found INSERT item INTO table; END IF;
END LOOP;

to:

CREATE TEMP TABLE xq_table (like table) on commit drop;
LOOP (item)
  LOOP
UPDATE xq_table with item;
exit when found;
INSERT INTO xq_table select * from table for update;
continue when found;
INSERT item INTO xq_table;
exit;
  END LOOP;
END LOOP;
UPDATE table a set (rows) = (xq.rows)
  FROM xq_table xq
  WHERE (a.keys) = (xq.keys)

That works significantly faster. The final update statement is very fast.
The process is somewhat slow in the beginning as it sucks in records from
total into xq_total, but once all of that is moved into the temp table,
it rushes through the rest.
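
For what it's worth, the final statement spelled out, with placeholder key
and payload columns (tagid/unitid as the keys, unserved/unconfirmed as the
payload; not the full column list), looks like this:

  UPDATE r_agrio_total a
     SET (unserved, unconfirmed) = (xq.unserved, xq.unconfirmed)
    FROM xq_total xq
   WHERE a.tagid = xq.tagid
     AND a.unitid = xq.unitid;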


 The other option would be to use a constraint trigger paired with a
 per-row trigger on the hourly table to drive the daily table, and on the
 daily table to drive the total table. The way that would work is the
 per-row table would simply keep track of all the unique records that were
 changed in a statement (presumably by putting them in a temp table). Once
 the statement is done, the constraint trigger would fire; it would
 summarize all the changed data and do a much smaller number of updates to
 the table being summarized into.


I'm not sure how I would be able to avoid the same number of changes on the
total table; the trigger would fire on each update, wouldn't it? So, the same
problem with a lot of changes on a table...


 BTW, you also made a comment about not having to hit the table if you look
 at something in an index. You can only do that if all the data you need is
 in the index, AND the page with the record is marked as being all-visible
 (google for Postgres Visibility Map). If that's not the case then you still
 have to pull the row in the table in, in order to determine visibility. The
 only case where you can still avoid hitting the table is something like a
 NOT EXISTS; if you can't find any entries in the index for something then
 they definitely won't be in the table.


What I was saying is that if a table has a unique index, and there is a
cached fact that a particular index value points to a particular row, there
shouldn't be a need to re-scan the index again to search for any more
matching values (which would be necessary if the index was not unique).
Again, all considering the size of the index, the number of different index
values being queried, etc.


 But remember that if you update or delete a row, removing it from an
 index, the data will stay in that index until vacuum comes along.

 Also, there's no point in doing a REINDEX after a VACUUM FULL; vacuum full
 rebuilds all the indexes for you.


I was being desperate :)

I still think there is something very wrong with this particular table.
First, I have production systems that employ this function on a way larger
data set, and there is no problem (so far, but still). This machine is part
of a test deployment; there is no constant load, and the only data being
written now is from these tests. Vacuuming should prune all that dead stuff,
and if it's absent, it's unclear where the time is spent navigating/updating
a table with 24 rows :)


Re: [GENERAL] Help with slow table update

2015-04-14 Thread Jim Nasby

On 4/14/15 1:28 PM, Pawel Veselov wrote:


I wonder if what I need to do, considering that I update a lot of the
same rows as I process this queue, is to create a temp table, update
the rows there, and then update the actual tables once at the end...


That's what I'd do.

The other option would be to use a constraint trigger paired with a 
per-row trigger on the hourly table to drive the daily table, and on the 
daily table to drive the total table. The way that would work is the 
per-row table would simply keep track of all the unique records that 
were changed in a statement (presumably by putting them in a temp 
table). Once the statement is done, the constraint trigger would fire; 
it would summarize all the changed data and do a much smaller number of 
updates to the table being summarized into.
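
A very rough sketch of that shape (the names changed_keys, note_hourly_change
and summarize_into_daily are made up here, summarize_into_daily isn't shown,
and a real version needs care around the temp table's lifetime and the fact
that a constraint trigger still fires once per changed row):

  CREATE OR REPLACE FUNCTION note_hourly_change() RETURNS trigger AS $$
  BEGIN
      -- remember the key of the row this statement touched; changed_keys is
      -- assumed to be a temp table created earlier in the same session
      INSERT INTO changed_keys (tagid, unitid) VALUES (NEW.tagid, NEW.unitid);
      RETURN NULL;  -- return value is ignored for AFTER row triggers
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER hourly_note
      AFTER INSERT OR UPDATE ON r_agrio_hourly
      FOR EACH ROW EXECUTE PROCEDURE note_hourly_change();

  -- fires after the statement completes (or at commit if made DEFERRABLE),
  -- summarizing changed_keys into the daily table
  CREATE CONSTRAINT TRIGGER hourly_summarize
      AFTER INSERT OR UPDATE ON r_agrio_hourly
      FOR EACH ROW EXECUTE PROCEDURE summarize_into_daily();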


BTW, you also made a comment about not having to hit the table if you 
look at something in an index. You can only do that if all the data you 
need is in the index, AND the page with the record is marked as being 
all-visible (google for Postgres Visibility Map). If that's not the case 
then you still have to pull the row in the table in, in order to 
determine visibility. The only case where you can still avoid hitting 
the table is something like a NOT EXISTS; if you can't find any entries 
in the index for something then they definitely won't be in the table. 
But remember that if you update or delete a row, removing it from an 
index, the data will stay in that index until vacuum comes along.


Also, there's no point in doing a REINDEX after a VACUUM FULL; vacuum 
full rebuilds all the indexes for you.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Help with slow table update

2015-04-14 Thread Jim Nasby

On 4/14/15 4:44 PM, Pawel Veselov wrote:

On Tue, Apr 14, 2015 at 1:15 PM, Jim Nasby jim.na...@bluetreble.com wrote:

On 4/14/15 1:28 PM, Pawel Veselov wrote:


I wonder if what I need to do, considering that I update a lot
of the
same rows as I process this queue, is to create a temp table,
update
the rows there, and then update the actual tables once at the end...


That's what I'd do.


Well, in short, I changed (repeat the body of the loop for however many
tables there are)

LOOP (item)
   UPDATE table with item
   IF not found INSERT item INTO table; END IF;
END LOOP;

to:

CREATE TEMP TABLE xq_table (like table) on commit drop;
LOOP (item)
   LOOP
 UPDATE xq_table with item;
 exit when found;
 INSERT INTO xq_table select * from table for update;
 continue when found;
 INSERT item INTO xq_table;
 exit;
   END LOOP;
END LOOP;
UPDATE table a set (rows) = (xq.rows)
   FROM xq_table xq
   WHERE (a.keys) = (xq.keys)

That works significantly faster. The final update statement is very
fast. The process is somewhat slow in the beginning as it sucks in
records from total into xq_total, but once all of that is moved into
the temp table, it rushes through the rest.


Databases like to think in sets. It will generally be more efficient to 
do set operations instead of a bunch of row-by-row stuff.


Since you're pulling all of this from some other table your best bet is 
probably something like:


CREATE TEMP TABLE raw AS DELETE FROM queue WHERE ... RETURNING *;

CREATE TEMP VIEW hourly_v AS SELECT ... FROM raw GROUP BY;
UPDATE ar_hourly SET ... FROM hourly_v JOIN ...;
INSERT INTO ar_hourly SELECT FROM hourly_v LEFT JOIN ar_hourly ON ...;

-- Same thing for daily
-- Same thing for total
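
A slightly more concrete rendering of the first step, assuming the queue
table is the xq_agr table mentioned earlier (the batch condition and the
commented-out aggregation are placeholders); CREATE TABLE AS can't wrap a
DELETE directly, so the DELETE ... RETURNING goes into a data-modifying CTE:

  CREATE TEMP TABLE raw (LIKE xq_agr) ON COMMIT DROP;

  WITH moved AS (
      DELETE FROM xq_agr
       WHERE id <= 914830      -- placeholder batch condition
   RETURNING *
  )
  INSERT INTO raw SELECT * FROM moved;

  -- then aggregate raw once per granularity, e.g.
  -- CREATE TEMP VIEW hourly_v AS SELECT <keys>, sum(...) FROM raw GROUP BY <keys>;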


The other option would be to use a constraint trigger paired with a
per-row trigger on the hourly table to drive the daily table, and on
the daily table to drive the total table. The way that would work is
the per-row table would simply keep track of all the unique records
that were changed in a statement (presumably by putting them in a
temp table). Once the statement is done, the constraint trigger
would fire; it would summarize all the changed data and do a much
smaller number of updates to the table being summarized into.


I'm not sure how I would be able to avoid the same number of changes on
the total table; the trigger would fire on each update, wouldn't it? So, the
same problem with a lot of changes on a table...


The difference is that you'd be doing plain INSERTs into a temp table 
and then summarizing that. That's going to be a LOT more efficient than 
a slew of updates on an existing table.



BTW, you also made a comment about not having to hit the table if
you look at something in an index. You can only do that if all the
data you need is in the index, AND the page with the record is
marked as being all-visible (google for Postgres Visibility Map). If
that's not the case then you still have to pull the row in the table
in, in order to determine visibility. The only case where you can
still avoid hitting the table is something like a NOT EXISTS; if you
can't find any entries in the index for something then they
definitely won't be in the table.


What I was saying is that if a table has a unique index, and there is a
cached fact that a particular index value points to a particular row,
there shouldn't be a need to re-scan the index again to search for any
more matching values (which would be necessary if the index was not
unique). Again, all considering the size of the index, the number of
different index values being queried, etc.


It still has to rescan because of visibility concerns.


But remember that if you update or delete a row, removing it from an
index, the data will stay in that index until vacuum comes along.

Also, there's no point in doing a REINDEX after a VACUUM FULL;
vacuum full rebuilds all the indexes for you.


I was being desperate :)

I still think there is something very wrong with this particular table.
First, I have production systems that employ this function on a way larger
data set, and there is no problem (so far, but still). This machine is
part of a test deployment; there is no constant load, and the only data
being written now is from these tests. Vacuuming should prune all that
dead stuff, and if it's absent, it's unclear where the time is spent
navigating/updating a table with 24 rows :)


I think you definitely have a problem with dead rows, as evidenced by 
the huge improvement VACUUM FULL made.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Help with slow table update

2015-04-13 Thread David G. Johnston
On Mon, Apr 13, 2015 at 5:01 PM, Pawel Veselov pawel.vese...@gmail.com
wrote:


 r_agrio_hourly - good, r_agrio_total - bad.

  Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329) (actual
 time=2.248..2.248 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
  (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207 rows=1
 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric)
 AND ((rowdate)::text = '2015-04-09T23'::text) AND (device_type =
 3::numeric) AND (placement = 2::numeric))
  Total runtime: 2.281 ms
  Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321)
 (actual time=106.766..106.766 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total
  (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626
 rows=1 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric)
 AND (device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 106.793 ms


What is it you expect to see here?

​What are the results (count and times) for:

SELECT count(*) FROM r_agrio_total WHERE tagid = 1002 and unitid = 1002;
SELECT count(*) FROM r_agrio_hourly WHERE tagid = 1002 and unitid = 1002;

​More queries along this line might be needed.  The underlying question is
how many index rows need to be skipped over on total to get the final
result - or rather are the columns in the index in descending order of
cardinality?

Any chance you can perform a REINDEX - maybe there is some bloat
present?  There are queries to help discern if that may be the case; I do
not know them off the top of my head, but just doing it might be acceptable
and is definitely quicker if so.

​I'm still not really following your presentation but maybe my thoughts
will spark something.​

​David J.
​


Re: [GENERAL] Help with slow table update

2015-04-13 Thread Pawel Veselov
On Sun, Apr 12, 2015 at 5:40 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 4/9/15 6:18 PM, Pawel Veselov wrote:

 Hi.

 I have a plpgsql procedure that updates a few similar tables.
 for some reason, updates on one of the tables take a lot longer than the
 updates on the other ones. The difference is, say, 7 seconds vs. 80
 milliseconds.

 the procedure uses cursors and record variables to do the updates. For
 example:

  update r_agrio_total set
    unserved = unserved + (agrow->'unserved')::numeric(38),
    r_brkconn = mush_brk_conn(r_brkconn, q_item.r_brkconn),
  where
    tagid = _tagid and
    unitid = (akey->'unitid')::numeric and
    placement = (akey->'placement')::numeric and
    device_type = (akey->'device_type')::numeric;

 There is another table (xq_agr) that is read record by record, and for
 each of those records, such update is executed.

 I was trying to select analyze the updates to see where the time could
 be spent.
 There are only 24 rows in the bad table, and 3,400 rows in the good
 table. So, for the bad table, most of the updates will be on the same
 rows. The times were measured on processing 100 original records.

 When I'm analyzing pure update statements, I don't see anything strange.

 bad table: explain analyze update r_agrio_total set unconfirmed =
 unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
 and device_type = 100;

 RESULT:
   Update on r_agrio_total  (cost=0.42..4.46 rows=1 width=321) (actual
 time=0.253..0.253 rows=0 loops=1)
 ->  Index Scan using tag_r_agrio_total on r_agrio_total
   (cost=0.42..4.46 rows=1 width=321) (actual time=0.037..0.041 rows=1
 loops=1)
   Index Cond: (tagid = 1000::numeric)
   Filter: ((unitid = 1000::numeric) AND (placement = 0::numeric)
 AND (device_type = 100::numeric))
   Rows Removed by Filter: 7
   Total runtime: 0.282 ms

 good table: explain analyze update r_agrio_hourly set unconfirmed =
 unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
 and device_type = 100 and rowdate = '2015-02-23T13';

 RESULT:
   Update on r_agrio_hourly  (cost=0.42..17.36 rows=6 width=329) (actual
 time=0.102..0.102 rows=0 loops=1)
 ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
   (cost=0.42..17.36 rows=6 width=329) (actual time=0.047..0.048 rows=1
 loops=1)
   Index Cond: ((tagid = 1000::numeric) AND (unitid =
 1000::numeric) AND ((rowdate)::text = '2015-02-23T13'::text) AND
 (device_type = 100::numeric) AND (placement = 0::numeric))
   Total runtime: 0.135 ms

 When I try doing it with WITH statement (really, to apply the actual
 data that the plpgsql function uses), there is something strange in the
 bad table.

 explain analyze
 with SRC as (select * from xq_agr where id = 914830)
  update r_agrio_total set
unconfirmed = unconfirmed +
 (SRC.r_agrio->'unconfirmed')::numeric(38)
  from SRC
  where
    tagid = (SRC.r_agrio->'key'->'tagid')::numeric and
    unitid = (SRC.r_agrio->'key'->'unit')::numeric and
    placement = (SRC.r_agrio->'key'->'placement')::numeric and
    device_type = (SRC.r_agrio->'key'->'device_type')::numeric;

 RESULT:
   Update on r_agrio_total  (cost=8.91..32777.51 rows=19331 width=409)
 (actual time=0.107..0.107 rows=0 loops=1)
 CTE src
   ->  Index Scan using xq_agr_pkey on xq_agr  (cost=0.42..8.44
 rows=1 width=379) (actual time=0.026..0.027 rows=1 loops=1)
       Index Cond: (id = 914830)
 ->  Nested Loop  (cost=0.46..32769.07 rows=19331 width=409) (actual
 time=0.107..0.107 rows=0 loops=1)
   ->  CTE Scan on src  (cost=0.00..0.02 rows=1 width=88) (actual
 time=0.032..0.033 rows=1 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total
   (*cost=0.46..32285.78 rows=19331* width=321) (actual time=0.001..0.001
 rows=0 loops=1)
       Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->
 'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text)
 -> 'unit'::text))::numeric) AND (device_type = (((src.r_agrio ->
 'key'::text) -> 'device_type'::text))::numeric) AND (placement =
 (((src.r_agrio -> 'key'::text) -> 'placement'::text))::numeric))
   Total runtime: 0.155 ms

 explain analyze
 with SRC as (select * from xq_agr where id = 914830)
  update r_agrio_hourly set
unconfirmed = unconfirmed +
 (SRC.r_agrio->'unconfirmed')::numeric(38)
  from SRC
  where
    tagid = (SRC.r_agrio->'key'->'tagid')::numeric and
    unitid = (SRC.r_agrio->'key'->'unit')::numeric and
    placement = (SRC.r_agrio->'key'->'placement')::numeric and
    device_type = (SRC.r_agrio->'key'->'device_type')::numeric and
    rowdate = (SRC.r_agrio->'key'->'rowdate');

 RESULT:
   Update on r_agrio_hourly  (cost=8.91..52.91 rows=20 width=417) (actual
 time=0.123..0.123 rows=0 loops=1)
 CTE src
   ->  Index Scan using xq_agr_pkey on xq_agr 

Re: [GENERAL] Help with slow table update

2015-04-13 Thread Jim Nasby

On 4/13/15 7:01 PM, Pawel Veselov wrote:

Cursors tend to make things slow. Avoid them if you can.


Is there an alternative to iterating over a number of rows, where a
direct update query is not an option?

I really doubt that the actual processing logic, including the use of
types, has anything to do with my problem. This is based on the fact that,
out of the tables that are being changed, only one is exhibiting the
problem. All of the involved tables have nearly the same structure, and
have the same logical operations performed on them. I thought maybe the
bad table is slow because it was first in the list, and Postgres was
caching the function's results, but I moved things around, and the pattern
is the same.


I'm guessing that you're essentially processing a queue. Take a look at 
http://www.postgresql.org/message-id/552c750f.2010...@bluetreble.com for 
some ideas. Basically, not only do cursors have non-trivial overhead, 
doing a ton of single-row queries is going to have a non-trivial 
overhead itself.



As for your specific question, I suggest you modify the plpgsql
function so that it's doing an EXPLAIN ANALYZE on the slow table.
EXPLAIN ANALYZE actually returns a recordset the same way a SELECT
would, with a single column of type text. So you just need to do
something with that output. The easiest thing would be to replace
this in your function:

UPDATE slow_table SET ...

to this (untested)

RETURN QUERY EXPLAIN ANALYZE UPDATE slow_table SET ...

and change the function so it returns SETOF text instead of whatever
it returns now.


Thank you, that made it a lot easier to see into what's really going on.
But the outcome is somewhat the same. The bad table analysis shows a
very high cost, and thousands of rows, where the table contains only 24
rows. This time, however, the actual run time is shown, and one can see
where the time is spent (I was using just a sum of clock_time()s around
the update statements to see where the problem is).

r_agrio_hourly - good, r_agrio_total - bad.

  Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329)
(actual time=2.248..2.248 rows=0 loops=1)
  ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
  (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207
rows=1 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid =
1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND
(device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 2.281 ms
  Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321)
(actual time=106.766..106.766 rows=0 loops=1)
  ->  Index Scan using u_r_agrio_total on r_agrio_total
  (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626
rows=1 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid =
1002::numeric) AND (device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 106.793 ms


Keep in mind that the estimated cost is not terribly useful; it's the 
actual times that matter.


I suspect what's happening here is a combination of things. First, the 
hourly table is basically living in cache, but the total table is not. 
That means that when you go to find a row in the total table you're 
actually hitting the disk instead of pulling the data from memory.


Second, you may have a lot of dead rows in the total table. I suspect 
this because of the very large amount of time the index scan is taking. 
Unless you're running on an old 10MB MFM drive you'd be pretty hard 
pressed for even 2 IO operations (one for the index leaf page and one 
for the heap page) to take 32ms. I suspect the index scan is having to 
read many dead rows in before it finds a live one, and incurring 
multiple IOs. Switching to EXPLAIN (analyze, buffers) would help confirm 
that.
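
For example, something along these lines (borrowing the constants from the
plans shown earlier; the shared hit/read buffer counts in the output are
what to look at):

  EXPLAIN (ANALYZE, BUFFERS)
  UPDATE r_agrio_total
     SET unconfirmed = unconfirmed + 0
   WHERE tagid = 1002 AND unitid = 1002
     AND device_type = 3 AND placement = 2;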


Third, I think something odd is happening with the update itself. I'm 
pretty sure that the index scan itself is visiting the heap pages, so 
each page should be in shared buffers by the time each tuple hits the 
update node. That makes me wonder what on earth is taking 60ms to update 
the tuple. I suspect it's going into either finding a free buffer to put 
the new tuple on, or waiting to try and extend the relation. Selecting 
ctid from the freshly updated rows and comparing the first number to the 
total number of pages in the heap would show if the new tuples are all 
ending up at the end of the heap.
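
Something like this would show it (relpages in pg_class is the page count as
of the last VACUUM/ANALYZE, so it can lag slightly):

  UPDATE r_agrio_total SET unconfirmed = unconfirmed + 0
   WHERE tagid = 1002 AND unitid = 1002
  RETURNING ctid;

  SELECT relpages FROM pg_class WHERE relname = 'r_agrio_total';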

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Help with slow table update

2015-04-12 Thread Jim Nasby

On 4/9/15 6:18 PM, Pawel Veselov wrote:

Hi.

I have a plpgsql procedure that updates a few similar tables.
for some reason, updates on one of the tables take a lot longer than the
updates on the other ones. The difference is, say, 7 seconds vs. 80
milliseconds.

the procedure uses cursors and record variables to do the updates. For
example:

 update r_agrio_total set
   unserved = unserved + (agrow->'unserved')::numeric(38),
   r_brkconn = mush_brk_conn(r_brkconn, q_item.r_brkconn),
 where
   tagid = _tagid and
   unitid = (akey->'unitid')::numeric and
   placement = (akey->'placement')::numeric and
   device_type = (akey->'device_type')::numeric;

There is another table (xq_agr) that is read record by record, and for
each of those records, such update is executed.

I was trying to select analyze the updates to see where the time could
be spent.
There are only 24 rows in the bad table, and 3,400 rows in the good
table. So, for the bad table, most of the updates will be on the same
rows. The times were measured on processing 100 original records.

When I'm analyzing pure update statements, I don't see anything strange.

bad table: explain analyze update r_agrio_total set unconfirmed =
unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
and device_type = 100;

RESULT:
  Update on r_agrio_total  (cost=0.42..4.46 rows=1 width=321) (actual
time=0.253..0.253 rows=0 loops=1)
   ->  Index Scan using tag_r_agrio_total on r_agrio_total
  (cost=0.42..4.46 rows=1 width=321) (actual time=0.037..0.041 rows=1
loops=1)
  Index Cond: (tagid = 1000::numeric)
  Filter: ((unitid = 1000::numeric) AND (placement = 0::numeric)
AND (device_type = 100::numeric))
  Rows Removed by Filter: 7
  Total runtime: 0.282 ms

good table: explain analyze update r_agrio_hourly set unconfirmed =
unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
and device_type = 100 and rowdate = '2015-02-23T13';

RESULT:
  Update on r_agrio_hourly  (cost=0.42..17.36 rows=6 width=329) (actual
time=0.102..0.102 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
  (cost=0.42..17.36 rows=6 width=329) (actual time=0.047..0.048 rows=1
loops=1)
  Index Cond: ((tagid = 1000::numeric) AND (unitid =
1000::numeric) AND ((rowdate)::text = '2015-02-23T13'::text) AND
(device_type = 100::numeric) AND (placement = 0::numeric))
  Total runtime: 0.135 ms

When I try doing it with WITH statement (really, to apply the actual
data that the plpgsql function uses), there is something strange in the
bad table.

explain analyze
with SRC as (select * from xq_agr where id = 914830)
 update r_agrio_total set
   unconfirmed = unconfirmed +
(SRC.r_agrio->'unconfirmed')::numeric(38)
 from SRC
 where
   tagid = (SRC.r_agrio->'key'->'tagid')::numeric and
   unitid = (SRC.r_agrio->'key'->'unit')::numeric and
   placement = (SRC.r_agrio->'key'->'placement')::numeric and
   device_type = (SRC.r_agrio->'key'->'device_type')::numeric;

RESULT:
  Update on r_agrio_total  (cost=8.91..32777.51 rows=19331 width=409)
(actual time=0.107..0.107 rows=0 loops=1)
CTE src
  ->  Index Scan using xq_agr_pkey on xq_agr  (cost=0.42..8.44
rows=1 width=379) (actual time=0.026..0.027 rows=1 loops=1)
      Index Cond: (id = 914830)
->  Nested Loop  (cost=0.46..32769.07 rows=19331 width=409) (actual
time=0.107..0.107 rows=0 loops=1)
  ->  CTE Scan on src  (cost=0.00..0.02 rows=1 width=88) (actual
time=0.032..0.033 rows=1 loops=1)
  ->  Index Scan using u_r_agrio_total on r_agrio_total
  (*cost=0.46..32285.78 rows=19331* width=321) (actual time=0.001..0.001
rows=0 loops=1)
      Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->
'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text)
-> 'unit'::text))::numeric) AND (device_type = (((src.r_agrio ->
'key'::text) -> 'device_type'::text))::numeric) AND (placement =
(((src.r_agrio -> 'key'::text) -> 'placement'::text))::numeric))
  Total runtime: 0.155 ms

explain analyze
with SRC as (select * from xq_agr where id = 914830)
 update r_agrio_hourly set
   unconfirmed = unconfirmed +
(SRC.r_agrio->'unconfirmed')::numeric(38)
 from SRC
 where
   tagid = (SRC.r_agrio->'key'->'tagid')::numeric and
   unitid = (SRC.r_agrio->'key'->'unit')::numeric and
   placement = (SRC.r_agrio->'key'->'placement')::numeric and
   device_type = (SRC.r_agrio->'key'->'device_type')::numeric and
   rowdate = (SRC.r_agrio->'key'->'rowdate');

RESULT:
  Update on r_agrio_hourly  (cost=8.91..52.91 rows=20 width=417) (actual
time=0.123..0.123 rows=0 loops=1)
CTE src
  ->  Index Scan using xq_agr_pkey on xq_agr  (cost=0.42..8.44
rows=1 width=379) (actual time=0.023..0.024 rows=1 loops=1)
      Index Cond: (id = 914830)
->  Nested Loop  (cost=0.47..44.47 rows=20 width=417)