Re: [PERFORM] 10x faster sort performance on Skylake CPU vs Ivy Bridge

2017-08-27 Thread Felix Geisendörfer

> On Aug 25, 2017, at 17:07, Tom Lane  wrote:
> 
> Felix Geisendörfer  writes:
>> I recently came across a performance difference between two machines that 
>> surprised me:
>> ...
>> As you can see, Machine A spends 5889ms on the Sort Node vs 609ms on Machine 
>> B when looking at the "Exclusive" time with explain.depesz.com [3][4]. I.e. 
>> Machine B is ~10x faster at sorting than Machine A (for this particular 
>> query).
> 
> I doubt this is a hardware issue, it's more likely that you're comparing
> apples and oranges.  The first theory that springs to mind is that the
> sort keys are strings and you're using C locale on the faster machine but
> some non-C locale on the slower.  strcoll() is pretty darn expensive
> compared to strcmp() :-(

You're right, that seems to be it.

Machine A was using strcoll() (lc_collate=en_US.UTF-8)
Machine B was using strcmp() (lc_collate=C)
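
(For anyone reproducing this: a quick way to check which collation a database
is using, and to force a binary, strcmp-style comparison for a single
expression without re-creating the database. This is a sketch, not from the
original mail; per-expression COLLATE requires 9.1 or later.

  SHOW lc_collate;                              -- collation of the current database
  SELECT datname, datcollate FROM pg_database;  -- collation of every database

  -- force C-locale ordering for one sort key only
  SELECT g
  FROM (VALUES ('b'::text), ('a'), ('c')) AS t(g)
  ORDER BY g COLLATE "C";
)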

After switching Machine A to use lc_collate=C, I get:

CTE Scan on zulu  (cost=40673.620..40742.300 rows=3434 width=56) (actual 
time=1368.610..1368.698 rows=58 loops=1)
  CTE zulu
  ->  HashAggregate  (cost=40639.280..40673.620 rows=3434 width=56) (actual 
time=1368.607..1368.659 rows=58 loops=1)
  Group Key: mike.two, ((mike.golf)::text)
->  Unique  (cost=37656.690..40038.310 rows=34341 width=104) 
(actual time=958.493..1168.128 rows=298104 loops=1)
  ->  Sort  (cost=37656.690..38450.560 rows=317549 width=104) 
(actual time=958.491..1055.635 rows=316982 loops=1)
  Sort Key: mike.two, ((mike.lima)::text) COLLATE 
"papa", mike.echo DESC, mike.quebec
  Sort Method: quicksort  Memory: 56834kB
->  Seq Scan on mike  (cost=0.000..8638.080 rows=317549 
width=104) (actual time=0.043..172.496 rows=316982 loops=1)
Filter: (golf five NOT NULL)
Rows Removed by Filter: 26426

So Machine A needs 883ms [1] for the sort vs 609ms [2] for Machine B. That's  
~1.4x faster which seems reasonable :).

Sorry for the delayed response, I didn't have access to machine B to confirm 
this right away.

>   regards, tom lane

This is my first post to a PostgreSQL mailing list, but I've been lurking
for a while. Thank you for taking the time to reply to e-mails such
as mine and all the work you've put into PostgreSQL over the years.
I'm deeply grateful.

> On Aug 25, 2017, at 17:43, Peter Geoghegan  wrote:
> 
> On Fri, Aug 25, 2017 at 8:07 AM, Tom Lane  wrote:
>> I doubt this is a hardware issue, it's more likely that you're comparing
>> apples and oranges.  The first theory that springs to mind is that the
>> sort keys are strings and you're using C locale on the faster machine but
>> some non-C locale on the slower.  strcoll() is pretty darn expensive
>> compared to strcmp() :-(
> 
> strcoll() is very noticeably slower on macOS, too.
> 

Thanks. This immediately explains what I saw when testing this query on a Linux 
machine that was also using lc_collate=en_US.UTF-8 but not being slowed down by 
it as much as the macOS machine.

[1] https://explain.depesz.com/s/LOqa
[2] https://explain.depesz.com/s/zVe

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] 10x faster sort performance on Skylake CPU vs Ivy Bridge

2017-08-25 Thread Felix Geisendörfer
Hi,

I recently came across a performance difference between two machines that 
surprised me:

Postgres Version / OS on both machines: v9.6.3 / MacOS 10.12.5

Machine A: MacBook Pro Mid 2012, 2.7 GHz Intel Core i7 (Ivy Bridge), 8 MB L3 
Cache, 16 GB 1600 MHz DDR3 [1]
Machine B: MacBook Pro Late 2016, 2.6 GHz Intel Core i7 (Skylake), 6 MB L3 
Cache, 16 GB 2133 MHz LPDDR3 [2]

Query Performance on Machine A: [3]

CTE Scan on zulu  (cost=40673.620..40742.300 rows=3434 width=56) (actual 
time=6339.404..6339.462 rows=58 loops=1)
  CTE zulu
  ->  HashAggregate  (cost=40639.280..40673.620 rows=3434 width=31) (actual 
time=6339.400..6339.434 rows=58 loops=1)
  Group Key: mike.two, mike.golf
->  Unique  (cost=37656.690..40038.310 rows=34341 width=64) (actual 
time=5937.934..6143.161 rows=298104 loops=1)
  ->  Sort  (cost=37656.690..38450.560 rows=317549 width=64) 
(actual time=5937.933..6031.925 rows=316982 loops=1)
  Sort Key: mike.two, mike.lima, mike.echo DESC, 
mike.quebec
  Sort Method: quicksort  Memory: 56834kB
->  Seq Scan on mike  (cost=0.000..8638.080 rows=317549 
width=64) (actual time=0.019..142.831 rows=316982 loops=1)
Filter: (golf five NOT NULL)
Rows Removed by Filter: 26426

Query Performance on Machine B: [4]

CTE Scan on zulu  (cost=40621.420..40690.100 rows=3434 width=56) (actual 
time=853.436..853.472 rows=58 loops=1)
  CTE zulu
  ->  HashAggregate  (cost=40587.080..40621.420 rows=3434 width=31) (actual 
time=853.433..853.448 rows=58 loops=1)
  Group Key: mike.two, mike.golf
->  Unique  (cost=37608.180..39986.110 rows=34341 width=64) (actual 
time=634.412..761.678 rows=298104 loops=1)
  ->  Sort  (cost=37608.180..38400.830 rows=317057 width=64) 
(actual time=634.411..694.719 rows=316982 loops=1)
  Sort Key: mike.two, mike.lima, mike.echo DESC, 
mike.quebec
  Sort Method: quicksort  Memory: 56834kB
->  Seq Scan on mike  (cost=0.000..8638.080 rows=317057 
width=64) (actual time=0.047..85.534 rows=316982 loops=1)
Filter: (golf five NOT NULL)
Rows Removed by Filter: 26426

As you can see, Machine A spends 5889ms on the Sort Node vs 609ms on Machine B 
when looking at the "Exclusive" time with explain.depesz.com [3][4]. I.e. 
Machine B is ~10x faster at sorting than Machine A (for this particular query).

My question is: Why?

I understand that this is a 3rd gen CPU vs a 6th gen, and that things have 
gotten faster despite stagnant clock speeds, but seeing a 10x difference still 
caught me off guard.

Does anybody have some pointers to understand where those gains are coming 
from? Is it the CPU, memory, or both? And in particular, why does Sort benefit 
so massively from the advancement here (~10x), but Seq Scan, Unique and 
HashAggregate don't benefit as much (~2x)?

As you can probably tell, my hardware knowledge is very superficial, so I 
apologize if this is a stupid question. But I'd genuinely like to improve my 
understanding and intuition about these things.

Cheers
Felix Geisendörfer

[1] 
http://www.everymac.com/systems/apple/macbook_pro/specs/macbook-pro-core-i7-2.7-15-mid-2012-retina-display-specs.html
[2] 
http://www.everymac.com/systems/apple/macbook_pro/specs/macbook-pro-core-i7-2.6-15-late-2016-retina-display-touch-bar-specs.html
[3] https://explain.depesz.com/s/hmn
[4] https://explain.depesz.com/s/zVe

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Slow Performance on a XEON E5504

2012-08-27 Thread Felix Schubert
Hello List,

I've got a system on a customers location which has a XEON E5504 @ 2.00GHz 
Processor (HP Proliant)

It's postgres 8.4 on a Debian Squeeze System running with 8GB of ram:

The Postgres Performance on this system measured with pgbench is very poor:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 40
number of transactions per client: 100
number of transactions actually processed: 4000/4000
tps = 158.283272 (including connections establishing)
tps = 158.788545 (excluding connections establishing)

The same database on a Core i7 CPU 920 @ 2.67GHz, 8 cores with 8GB RAM same 
distro and Postgresql Version is much faster:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 40
number of transactions per client: 100
number of transactions actually processed: 4000/4000
tps = 1040.534002 (including connections establishing)
tps = 1065.215134 (excluding connections establishing)

Even optimizing the postgresql.conf values doesn't change the tps numbers much
(less than 10%).
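
(Editorial aside, not from the original post: the write-relevant settings can
be pulled in one query on both boxes for comparison, which helps rule out a
configuration difference before blaming the hardware.

  SELECT name, setting, unit
  FROM pg_settings
  WHERE name IN ('fsync', 'synchronous_commit', 'shared_buffers',
                 'wal_buffers', 'checkpoint_segments');
)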

Tried Postgresql 9.1 on the Proliant:
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 40
number of threads: 1
number of transactions per client: 100
number of transactions actually processed: 4000/4000
tps = 53.114978 (including connections establishing)
tps = 53.198667 (excluding connections establishing)

Next I compared the disk performance, which was much better on the XEON than
on the Intel i7.

Any idea where to search for the bottleneck?

Kind regards,

Felix Schubert

FEScon
... and work flows!

felix schubert
haspelgasse 5
69117 heidelberg

mobil: +49-151-25337718
mail: in...@fescon.de
skype: fesmac



[PERFORM] Slow Performance on a XEON E5504

2012-08-25 Thread Felix Schubert
Hello List,

I've got a system on a customers location which has a XEON E5504 @ 2.00GHz 
Processor (HP Proliant)

It's postgres 8.4 on a Debian Squeeze System running with 8GB of ram:

The Postgres Performance on this system measured with pgbench is very poor:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 40
number of transactions per client: 100
number of transactions actually processed: 4000/4000
tps = 158.283272 (including connections establishing)
tps = 158.788545 (excluding connections establishing)

The same database on a Core i7 CPU 920 @ 2.67GHz, 8 cores with 8GB RAM same 
distro and Postgresql Version is much faster:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 40
number of transactions per client: 100
number of transactions actually processed: 4000/4000
tps = 1040.534002 (including connections establishing)
tps = 1065.215134 (excluding connections establishing)

Even optimizing the postgresql.conf values doesn't change the tps numbers much
(less than 10%).

Tried Postgresql 9.1 on the Proliant:
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 40
number of threads: 1
number of transactions per client: 100
number of transactions actually processed: 4000/4000
tps = 53.114978 (including connections establishing)
tps = 53.198667 (excluding connections establishing)

Next I compared the disk performance, which was much better on the XEON than
on the Intel i7.

Any idea where to search for the bottleneck?

best regards,

Felix Schubert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow Performance on a XEON E5504

2012-08-25 Thread Felix Schubert
Hi Scott,

the controller is a HP i410 running 3x300GB SAS 15K / Raid 5 

Kind regards,

Felix Schubert

Sent from my iPhone :-)

On 25.08.2012 at 14:42, Scott Marlowe scott.marl...@gmail.com wrote:

 On Sat, Aug 25, 2012 at 6:07 AM, Felix Schubert in...@fescon.de wrote:
 Hello List,
 
 I've got a system on a customers location which has a XEON E5504 @ 2.00GHz 
 Processor (HP Proliant)
 
 It's postgres 8.4 on a Debian Squeeze System running with 8GB of ram:
 
 The Postgres Performance on this system measured with pgbench is very poor:
 
 transaction type: TPC-B (sort of)
 scaling factor: 1
 query mode: simple
 number of clients: 40
 number of transactions per client: 100
 number of transactions actually processed: 4000/4000
 tps = 158.283272 (including connections establishing)
 tps = 158.788545 (excluding connections establishing)
 
 For a single thread on a 10k RPM drive the maximum number of times per
 second you can write and get a proper fsync back is 166.  This is
 quite close to that theoretical max.
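
(For reference, the arithmetic behind that 166 figure, one fsync per platter
rotation, can be checked directly in psql:

  SELECT 10000 / 60.0 AS max_fsyncs_per_sec_10k_rpm,   -- ~166
         15000 / 60.0 AS max_fsyncs_per_sec_15k_rpm;   -- 250
)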
 
 The same database on a Core i7 CPU 920 @ 2.67GHz, 8 cores with 8GB RAM same 
 distro and Postgresql Version is much faster:
 
 transaction type: TPC-B (sort of)
 scaling factor: 1
 query mode: simple
 number of clients: 40
 number of transactions per client: 100
 number of transactions actually processed: 4000/4000
 tps = 1040.534002 (including connections establishing)
 tps = 1065.215134 (excluding connections establishing)
 
 This is much faster than the theoretical limit of a single 10k RPM
 drive obeying fsync.
 
 I'll ignore the rest of your post where you get 53 tps after
 optimization.  The important thing you forgot to mention was your
 drive subsystem here.  I'm gonna take a wild guess that they are both
 on a single drive and that the older machine is using an older SATA or
 PATA interface HD that is lying about fsync, and the new machine is
 using a 10k RPM drive that is not lying about fsync and you are
 getting a proper ~150 tps from it.
 
 So, what kind of IO subsystems you got in those things?
 
 
 -- 
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow Performance on a XEON E5504

2012-08-25 Thread Felix Schubert
Don't know but I forwarded the question to the System Administrator. 

Anyhow thanks for the information up to now!

best regards,

Felix 

On 25.08.2012 at 14:59, Scott Marlowe scott.marl...@gmail.com wrote:

 Well it sounds like it does NOT have a battery-backed caching module on
 it, am I right?



[PERFORM] queries are fast after dump-restore but slow again after some days despite vacuum

2012-07-19 Thread Felix Scheicher
Hi, I am running a 9.1 server on Ubuntu. When I upgraded to the current version 
I did a pg_dump followed by pg_restore and found that the db was much faster, 
but it slowed down again after two days. I did the dump-restore again and could 
now compare the two (actually identical) databases. This is a query on the old one 
directly after a VACUUM ANALYSE:

                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=2231.56..2232.17 rows=22 width=59) (actual time=16748.382..16749.203 rows=340 loops=1)
   ->  Sort  (cost=2231.56..2231.62 rows=22 width=59) (actual time=16748.360..16748.575 rows=340 loops=1)
         Sort Key: ba.bookid, (CASE WHEN (e.languageid = 123) THEN 1 WHEN (e.languageid = 401) THEN 2 WHEN (e.languageid = 150) THEN 3 ELSE 4 END)
         Sort Method: quicksort  Memory: 60kB
         ->  Nested Loop  (cost=0.00..2231.07 rows=22 width=59) (actual time=0.125..16747.395 rows=340 loops=1)
               ->  Index Scan using authorid1 on book_author ba  (cost=0.00..73.94 rows=20 width=8) (actual time=0.034..11.453 rows=99 loops=1)
                     Index Cond: (authorid = 543)
               ->  Index Scan using foreign_key_bookid on editions e  (cost=0.00..107.76 rows=8 width=51) (actual time=90.741..169.031 rows=3 loops=99)
                     Index Cond: (bookid = ba.bookid)
                     Filter: mainname
 Total runtime: 16752.146 ms
(11 rows)

And here after dump-restore:

                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=2325.78..2326.41 rows=23 width=58) (actual time=18.583..19.387 rows=340 loops=1)
   ->  Sort  (cost=2325.78..2325.84 rows=23 width=58) (actual time=18.562..18.823 rows=340 loops=1)
         Sort Key: ba.bookid, (CASE WHEN (e.languageid = 123) THEN 1 WHEN (e.languageid = 401) THEN 2 WHEN (e.languageid = 150) THEN 3 ELSE 4 END)
         Sort Method: quicksort  Memory: 60kB
         ->  Nested Loop  (cost=0.00..2325.26 rows=23 width=58) (actual time=0.385..18.060 rows=340 loops=1)
               ->  Index Scan using authorid1 on book_author ba  (cost=0.00..73.29 rows=20 width=8) (actual time=0.045..0.541 rows=99 loops=1)
                     Index Cond: (authorid = 543)
               ->  Index Scan using foreign_key_bookid on editions e  (cost=0.00..112.49 rows=9 width=50) (actual time=0.056..0.168 rows=3 loops=99)
                     Index Cond: (bookid = ba.bookid)
                     Filter: mainname
 Total runtime: 19.787 ms
(11 rows)

server settings:
shared_buffers = 680MB
work_mem = 10MB
maintenance_work_mem = 64MB
checkpoint_segments = 32
checkpoint_completion_target = 0.9
effective_cache_size = 1500MB

No matter how much I vacuum or analyse the slow db, I don't get it faster.
I also checked for dead tuples - there are none.
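
(For reference, one way such a dead-tuple check can look on 9.1; the table
names are taken from the plans above and are the anonymized ones:

  SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
  FROM pg_stat_user_tables
  WHERE relname IN ('book_author', 'editions');
)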


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] queries are fast after dump-restore but slow again after some days despite vacuum

2012-07-19 Thread Felix Scheicher
Andrew Dunstan andrew at dunslane.net writes:

 Try running CLUSTER on the relevant tables and see if it makes a 
 difference. If it does you might want to look into using pg_reorg 
 periodically.
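
(For readers following along, a sketch of what that advice can look like for
the tables in the plans above; the table and index names come from the
anonymized EXPLAIN output, so treat them as placeholders:

  -- CLUSTER rewrites the table in index order and holds an exclusive lock
  CLUSTER book_author USING authorid1;
  CLUSTER editions USING foreign_key_bookid;
  ANALYZE book_author;
  ANALYZE editions;
)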


That worked like a charm! Many thanks. But how come the queries are also fast
after a restore without the cluster?

regards,
Felix


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] good old VACUUM FULL

2011-03-22 Thread felix
I posted many weeks ago about a severe problem with a table that was
obviously bloated and was stunningly slow. Up to 70 seconds just to get a
row count on 300k rows.

I removed the text column, so it really was just a few columns of fixed
data.
Still very bloated.  Table size was 450M

The advice I was given was to do CLUSTER, but this did not reduce the table
size in the least.
Nor performance.

Also to resize my free space map (which still does need to be done).
Since that involves tweaking the kernel settings, taking the site down and
rebooting postgres and exposing the system to all kinds of risks and
unknowns and expensive experimentations I was unable to do it and have had
to hobble along with a slow table in my backend holding up jobs.

Much swearing that nobody should ever do VACUUM FULL.  Manual advises
against it.  Only crazy people do that.

Finally I decide to stop taking advice.

ns= explain analyze select count(*) from fastadder_fastadderstatus;
 Aggregate  (cost=62602.08..62602.09 rows=1 width=0) (actual time=25320.000..25320.000 rows=1 loops=1)
   ->  Seq Scan on fastadder_fastadderstatus  (cost=0.00..61815.86 rows=314486 width=0) (actual time=180.000..25140.000 rows=314493 loops=1)
 Total runtime: *25320.000* ms

ns= vacuum full fastadder_fastadderstatus;

took about 20 minutes

ns= explain analyze select count(*) from fastadder_fastadderstatus;

 Aggregate  (cost=7478.03..7478.04 rows=1 width=0) (actual time=940.000..940.000 rows=1 loops=1)
   ->  Seq Scan on fastadder_fastadderstatus  (cost=0.00..6691.82 rows=314482 width=0) (actual time=0.000..530.000 rows=314493 loops=1)
 Total runtime: *940.000 ms*

moral of the story:  if your table is really bloated, just do VACUUM FULL

CLUSTER will not reduce table bloat in an identical fashion


Re: [PERFORM] Really really slow select count(*)

2011-02-09 Thread felix
On Tue, Feb 8, 2011 at 3:23 PM, Shaun Thomas stho...@peak6.com wrote:


 With 300k rows, count(*) isn't a good test, really. That's just on the edge
 of big-enough that it could be  1-second to fetch from the disk controller,



1 second you say ?  excellent, sign me up

70 seconds is way out of bounds

I don't want a more efficient query to test with, I want the shitty query
that performs badly that isolates an obvious problem.

The default settings are not going to cut it for a database of your size,
 with the volume you say it's getting.


not to mention the map reduce jobs I'm hammering it with all night :)

but I did pause those until this is solved

But you need to put in those kernel parameters I suggested. And I know this
 sucks, but you also have to raise your shared_buffers and possibly your
 work_mem and then restart the DB. But this time, pg_ctl to invoke a fast
 stop, and then use the init script in /etc/init.d to restart it.


I'm getting another slicehost slice. hopefully I can clone the whole thing
over without doing a full install and go screw around with it there.

its a fairly complicated install, even with buildout doing most of the
configuration.


=felix


Re: [PERFORM] Really really slow select count(*)

2011-02-07 Thread felix
+1

this is exactly what I was looking for at the time:  a -t (configtest)
option to pg_ctl

and I think it should fall back to lower shared buffers and log it.

SHOW ALL; would show the used value



On Mon, Feb 7, 2011 at 11:30 AM, Marti Raudsepp ma...@juffo.org wrote:

 On Mon, Feb 7, 2011 at 05:03, Craig Ringer cr...@postnewspapers.com.au
 wrote:
  What would possibly help would be if Pg could fall back to lower
  shared_buffers automatically, screaming about it in the logs but still
  launching. OTOH, many people don't check the logs, so they'd think their
  new setting had taken effect and it hadn't - you've traded one usability
  problem for another. Even if Pg issued WARNING messages to each client
  that connected, lots of (non-psql) clients don't display them, so many
  users would never know.
 
  Do you have a suggestion about how to do this better? The current
  approach is known to be rather unlovely, but nobody's come up with a
  better one that works reasonably and doesn't trample on other System V
  shared memory users that may exist on the system.

 We could do something similar to what Apache does -- provide distros
 with a binary to check the configuration file in advance. This check
 program is launched before the restart command, and if it fails, the
 server is not restarted.

 Regards,
 Marti



Re: [PERFORM] Really really slow select count(*)

2011-02-07 Thread felix
On Mon, Feb 7, 2011 at 6:05 AM, Shaun Thomas stho...@peak6.com wrote:


 That's one of the things I talked about. To be safe, PG will start to shut
 down but disallow new connections, and *that's all*. Old connections are
 grandfathered in until they disconnect, and when they all go away, it shuts
 down gracefully.



Well it said "Failed to shutdown ..." and then returned
control.
and then proceeded to run for about an hour.

I'm not sure how graceful that is.

I generally take programs at their word.  Failed is clearly past tense.



So far as your Django install, have you activated the memcache contrib.
 module? Your pages should be lazy-caching and rarely depend on the DB, if
 they can.


yes thanks my web app is very finely tuned and is working splendidly.
I've been working on very large sites sites since 1998 and this client has
been with me for 10 years already.  its a fairly high traffic site.

I've only been using postgres since we migrated in May

but it is one particular table on postgres that has shit the sock drawer.




 You should also rarely be doing count(*) on a 300k row table, even if
 everything is cached and speedy.


I'm not

this is a test query that is obviously way out of bounds for acceptable
response.

there is something very very wrong with this table and I need to solve it
ASAP.
other tables that have less updates but similar sizes are not having this
problem.

there are foreign keys pointing to this table so its a bit tricky to just
refill it, but I can think of one way.  I'll have to do that.

its only conjecture that the issue is file space bloat or free map problems.
 those are overall issues that I will get to as soon as I can. but this is
table specific.


 That's an application design issue you need to address before it's too
 late, or you have to rush and implement a hasty fix.


it is not an application design issue, though there are always improvements
being made.

 Being a DBA sucks sometimes. :)


I am not a DBA, I'm just trying to query a 300k row table.

though I am happy to learn more. I know an awful lot about a lot of things.
 but you can't specialize in everything


Re: [PERFORM] Really really slow select count(*)

2011-02-06 Thread felix
BRUTAL


http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html
max_fsm_pages

See Section 17.4.1 (http://www.postgresql.org/docs/8.3/interactive/kernel-resources.html#SYSVIPC)
for information on how to adjust those parameters, if necessary.

I see absolutely nothing in there about how to set those parameters.

several hours later (
where is my data directory ?  8.4 shows it in SHOW ALL; 8.3 does not.
conf files ? in the data directory no, its in /etc/postgres/8.3/main
where is pg_ctl ?
what user do I need to be ? postgres
then why was it installed in the home dir of a user that does not have
permissions to use it ??
)


cd /home/crucial/bin

/home/crucial/bin/pg_ctl -D /var/lib/postgresql/8.3/main reload

reload does not reset max_fsm_pages, I need to actually restart the server.

postgres@nestseekers:/home/crucial/bin$ /home/crucial/bin/pg_ctl -D
/var/lib/postgresql/8.3/main restart
waiting for server to shut
down... failed
pg_ctl: server does not shut down


OK, my mistake.   probably I have to disconnect all clients.  I don't want
to do a planned maintenance right now.

so I go to sleep

the server restarts itself an hour later.

but no, it fails to restart because this memory setting you recommend is not
possible without reconfiguring the kernel.


postgres@nestseekers:/home/crucial/bin$ 2011-02-06 05:18:00 EST LOG:  could
not load root certificate file root.crt: No such file or directory
2011-02-06 05:18:00 EST DETAIL:  Will not verify client certificates.
2011-02-06 05:18:00 EST FATAL:  could not create shared memory segment:
Invalid argument
2011-02-06 05:18:00 EST DETAIL:  Failed system call was shmget(key=5432001,
size=35463168, 03600).
2011-02-06 05:18:00 EST HINT:  This error usually means that PostgreSQL's
request for a shared memory segment exceeded your kernel's SHMMAX parameter.
 You can either reduce the request size or reconfigure the kernel with
larger SHMMAX.  To reduce the request size (currently 35463168 bytes),
reduce PostgreSQL's shared_buffers parameter (currently 3072) and/or its
max_connections parameter (currently 103).
If the request size is already small, it's possible that it is less than
your kernel's SHMMIN parameter, in which case raising the request size or
reconfiguring SHMMIN is called for.
The PostgreSQL documentation contains more information about shared memory
configuration.
^C
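
(For what it's worth, the request size in that HINT decomposes roughly as
follows; the exact per-connection overhead depends on version and settings:

  SELECT 3072 * 8192              AS shared_buffers_bytes,        -- 3072 x 8 kB pages, about 24 MB
         35463168 - (3072 * 8192) AS connection_and_fixed_bytes;  -- the rest is per-backend and fixed overhead
)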

*and the website is down for the next 6 hours while I sleep.*

total disaster

after a few tries I get it to take a max_fsm_pages of 300k

postgres@nestseekers:/home/crucial/bin$ 2011-02-06 05:19:26 EST LOG:  could
not load root certificate file root.crt: No such file or directory
2011-02-06 05:19:26 EST DETAIL:  Will not verify client certificates.
2011-02-06 05:19:26 EST LOG:  database system was shut down at 2011-02-06
00:07:41 EST
2011-02-06 05:19:27 EST LOG:  autovacuum launcher started
2011-02-06 05:19:27 EST LOG:  database system is ready to accept connections
^C



2011-02-06 05:33:45 EST LOG:  checkpoints are occurring too frequently (21
seconds apart)
2011-02-06 05:33:45 EST HINT:  Consider increasing the configuration
parameter checkpoint_segments.


??


From my perspective: the defaults for postgres 8.3 result in a database that
does not scale and fails dramatically after 6 months.  changing that default
is brutally difficult and can only really be done by adjusting something in
the kernel.


I have clustered that table, its still unbelievably slow.
I still don't know if this bloat due to the small free space map has
anything to do with why the table is performing like this.


On Fri, Feb 4, 2011 at 5:35 PM, Shaun Thomas stho...@peak6.com wrote:


 You can stop the bloating by setting the right max_fsm_pages setting,







 but you'll either have to go through and VACUUM FULL every table in your
 database, or dump/restore to regain all the lost space and performance (the
 later would actually be faster). Before I even touch an older PostgreSQL DB,
 I set it to some value over 3-million just as a starting value to be on the
 safe side. A little used memory is a small price to pay for stopping gradual
 expansion.
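
(Aside: on 8.3 and earlier the current values can at least be read from psql;
changing them still means editing postgresql.conf and restarting. From 8.4 on
the free space map is kept on disk and these settings no longer exist.

  SHOW max_fsm_pages;
  SHOW max_fsm_relations;
)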




Re: [PERFORM] Really really slow select count(*)

2011-02-06 Thread felix
On Sun, Feb 6, 2011 at 4:23 PM, Scott Marlowe scott.marl...@gmail.com wrote:

 Let's review:



1: No test or staging system used before production


no, I do not have a full ubuntu machine replicating the exact memory and
application load of the production server.

this was changing one configuration parameter. something I was advised to
do, read about quite a bit, tested on my development server (mac) and then
proceeded to do at 6 am on Sunday morning, our slowest time.


2: DB left in an unknown state (trying to shut down, not able)


what ?

I checked the site, everything was normal.  I went in via psql and tried
some queries for about half an hour and continued to monitor the site.  then
I went to bed at 7am (EU time).

Why did it shutdown so much later ?

I have never restarted postgres before, so this was all new to me.  I
apologize that I wasn't born innately with such knowledge.

So is it normal for postgres to report that it failed to shut down, operate
for an hour and then go ahead and restart itself ?

3: No monitoring software to tell you when the site is down


of course I have monitoring software.  both external and internal.  but it
doesn't come and kick me out of bed.  yes, I need an automated cel phone
call.  that was the first thing I saw to afterwards.


4: I'm gonna just go ahead and guess no backups were taken either, or
 are regularly taken.


WTF ?   of course I have backups.  I just went through a very harsh down
period event.  I fail to see why it is now necessary for you to launch such
an attack on me.

Perhaps the tone of my post sounded like I was blaming you, or at least you
felt that way.  Why do you feel that way ?

Why not respond with:  ouch !  did you check this ... that  say
something nice and helpful.  correct my mistakes




 This website can't be very important, if that's the way you treat it.


just to let you know, that is straight up offensive

This is high traffic real estate site.  Downtime is unacceptable.  I had
less downtime than this when I migrated to the new platform.

I spent rather a large amount of time reading and questioning here.  I asked
many questions for clarification and didn't do ANYTHING until I was sure it
was the correct solution.  I didn't just pull some shit off a blog and start
changing settings at random.

I double checked opinions against different people and I searched for more
docs on that param.  Amazingly none of the ones I found commented on the
shared memory issue and I didn't even understand the docs discussing shared
memory because it didn't seem to apply to what I was doing.  that's my
misunderstanding.  I came here to share my misunderstanding.




 And my phone starts complaining a minute after the site stops
 responding if something does go wrong the rest of the time.  Do not
 lay this at anyone else's feet.


I didn't.  There is not even the slightest hint of that in my post.

I came here and posted the details of where I went wrong and what confused
me about the documentation that I followed.  That's so other people can
follow it and so somebody here can comment on it.



  changing that default
  is brutally difficult and can only really be done by adjusting something
 in
  the kernel.

 Please, that's a gross exaggeration.  The sum total of changing them is:

 run sysctl -a|grep shm
 copy out proper lines to change
 edit sysctl.conf
 put new lines in there with changes
 sudo sysctl -p  # applies changes
 edit the appropriate postgresql.conf, make changes
 sudo /etc/init.d/postgresql-8.3 stop
 sudo /etc/init.d/postgresql-8.3 start


Considering how splendidly the experiment with changing fsm_max_pages went,
I think you can understand that I have no desire to experiment with kernel
settings.

It is easy for you because you ALREADY KNOW everything involved.  I am not a
sysadmin and we don't have one.  My apologies for that.

so does the above mean that I don't have to restart the entire server, just
postgres ?  I assumed that changing kernel settings means rebooting the
server.



 I have clustered that table, its still unbelievably slow.

 Did you actually delete the old entries before clustering it?  if it's
 still got 4G of old sessions or whatever in it, clustering ain't gonna
 help.


its a different table.  the problem one has only 300k rows

the problem is not the size, the problem is the speed is catastrophic



 I still don't know if this bloat due to the small free space map has
  anything to do with why the table is performing like this.

 Since you haven't show us what changes, if any, have happened to the
 table, neither do we :)


sorry, it didn't seem to be the most important topic when I got out of bed


Re: [PERFORM] Really really slow select count(*)

2011-02-06 Thread felix
yeah, it already uses memcached with db save.  nothing important in session
anyway

the session table is not the issue

and I never clustered that one or ever will

thanks for the tip, also the other one about HOT


On Sun, Feb 6, 2011 at 8:19 PM, Pierre C li...@peufeu.com wrote:


  I have clustered that table, its still unbelievably slow.


 Did you actually delete the old entries before clustering it?  if it's
 still got 4G of old sessions or whatever in it, clustering ain't gonna
 help.


 Also, IMHO it is a lot better to store sessions in something like
 memcached, rather than imposing this rather large load on the main
 database...

 PS : if your site has been down for 6 hours, you can TRUNCATE your sessions
 table...



[PERFORM] Really really slow select count(*)

2011-02-04 Thread felix
I am having huge performance problems with a table. Performance deteriorates
every day and I have to run REINDEX and ANALYZE on it every day.  auto
vacuum is on.  yes, I am reading the other thread about count(*) :)

but obviously I'm doing something wrong here


explain analyze select count(*) from fastadder_fastadderstatus;

Aggregate  (cost=62458.73..62458.74 rows=1 width=0) (actual time=77130.000..77130.000 rows=1 loops=1)
   ->  Seq Scan on fastadder_fastadderstatus  (cost=0.00..61701.18 rows=303018 width=0) (actual time=50.000..76930.000 rows=302479 loops=1)
 Total runtime: *77250.000* ms

directly after REINDEX and ANALYZE:

 Aggregate  (cost=62348.70..62348.71 rows=1 width=0) (actual time=15830.000..15830.000 rows=1 loops=1)
   ->  Seq Scan on fastadder_fastadderstatus  (cost=0.00..61613.16 rows=294216 width=0) (actual time=30.000..15570.000 rows=302479 loops=1)
 Total runtime: 15830.000 ms

still very bad for a 300k row table

a similar table:

explain analyze select count(*) from fastadder_fastadderstatuslog;

 Aggregate  (cost=8332.53..8332.54 rows=1 width=0) (actual time=1270.000..1270.000 rows=1 loops=1)
   ->  Seq Scan on fastadder_fastadderstatuslog  (cost=0.00..7389.02 rows=377402 width=0) (actual time=0.000..910.000 rows=377033 loops=1)
 Total runtime: 1270.000 ms


It gets updated quite a bit each day, and this is perhaps the problem.
To me it doesn't seem like that many updates

100-500 rows inserted per day
no deletes

10k-50k updates per day
mostly of this sort:   set priority=1 where id=12345

is it perhaps this that is causing the performance problem ?

I could rework the app to be more efficient and do updates using batches
where id IN (1,2,3,4...)

I assume that means a more efficient index update compared to individual
updates.

There is one routine that updates position_in_queue using a lot (too many)
update statements.
Is that likely to be the culprit ?

*What else can I do to investigate ?*
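
(Two low-impact starting points on 8.3, reading only the catalogs and the
statistics views; a sketch rather than a recipe:

  -- how big is the table on disk versus what the planner thinks is in it?
  SELECT relname, relpages, reltuples,
         pg_size_pretty(pg_relation_size(oid)) AS on_disk
  FROM pg_class
  WHERE relname = 'fastadder_fastadderstatus';

  -- are dead rows piling up, and is (auto)vacuum actually reaching the table?
  SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
  FROM pg_stat_user_tables
  WHERE relname = 'fastadder_fastadderstatus';
)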


Table "public.fastadder_fastadderstatus"
      Column       |           Type           |                       Modifiers
-------------------+--------------------------+--------------------------------------------------------
 id| integer  | not null default
nextval('fastadder_fastadderstatus_id_seq'::regclass)
 apt_id| integer  | not null
 service_id| integer  | not null
 agent_priority| integer  | not null
 priority  | integer  | not null
 last_validated| timestamp with time zone |
 last_sent | timestamp with time zone |
 last_checked  | timestamp with time zone |
 last_modified | timestamp with time zone | not null
 running_status| integer  |
 validation_status | integer  |
 position_in_queue | integer  |
 sent  | boolean  | not null default false
 built | boolean  | not null default false
 webid_suffix  | integer  |
 build_cache   | text |
Indexes:
fastadder_fastadderstatus_pkey PRIMARY KEY, btree (id)
fastadder_fastadderstatus_apt_id_key UNIQUE, btree (apt_id,
service_id)
fastadder_fastadderstatus_agent_priority btree (agent_priority)
fastadder_fastadderstatus_apt_id btree (apt_id)
fastadder_fastadderstatus_built btree (built)
fastadder_fastadderstatus_last_checked btree (last_checked)
fastadder_fastadderstatus_last_validated btree (last_validated)
fastadder_fastadderstatus_position_in_queue btree (position_in_queue)
fastadder_fastadderstatus_priority btree (priority)
fastadder_fastadderstatus_running_status btree (running_status)
fastadder_fastadderstatus_service_id btree (service_id)
Foreign-key constraints:
fastadder_fastadderstatus_apt_id_fkey FOREIGN KEY (apt_id) REFERENCES
nsproperties_apt(id) DEFERRABLE INITIALLY DEFERRED
fastadder_fastadderstatus_service_id_fkey FOREIGN KEY (service_id)
REFERENCES fastadder_fastadderservice(id) DEFERRABLE INITIALLY DEFERRED


thanks !


Fwd: [PERFORM] Really really slow select count(*)

2011-02-04 Thread felix
sorry, reply was meant to go to the list.

-- Forwarded message --
From: felix crucialfe...@gmail.com
Date: Fri, Feb 4, 2011 at 5:17 PM
Subject: Re: [PERFORM] Really really slow select count(*)
To: stho...@peak6.com




On Fri, Feb 4, 2011 at 4:00 PM, Shaun Thomas stho...@peak6.com wrote:

 How big is this table when it's acting all bloated and ugly?

458MB

 Is this the only thing running when you're doing your tests? What does your
 disk IO look like?


this is on a live site.  best not to scare the animals.

I have the same config on the dev environment but not the same table size.


  10k-50k updates per day
 mostly of this sort:   set priority=1 where id=12345


 Well... that's up to 16% turnover per day, but even then, regular vacuuming
 should keep it manageable.


something is definitely amiss with this table.

I'm not sure if its something that happened at one point when killing an
task that was writing to it or if its something about the way the app is
updating.  it SHOULDN'T be that much of a problem, though I can find ways to
improve it.


No. Don't do that. You'd be better off loading everything into a temp table
 and doing this:

 UPDATE fastadder_fastadderstatus s
   SET priority = 1
  FROM temp_statuses t
  WHERE t.id=s.id;
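
(A minimal sketch of how that temp-table variant could be wired up; the id
values here are illustrative, in practice the application would COPY or insert
the batch:

  CREATE TEMP TABLE temp_statuses (id integer PRIMARY KEY);
  INSERT INTO temp_statuses (id) VALUES (12345), (12346), (12347);

  UPDATE fastadder_fastadderstatus s
     SET priority = 1
    FROM temp_statuses t
   WHERE t.id = s.id;
)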


ok, that is one of the solutions I was thinking about.

are updates of the form "where id IN (1,2,3,4)" generally not efficient ?
how about for select queries ?


 fastadder_fastadderstatus_pkey PRIMARY KEY, btree (id)
 fastadder_fastadderstatus_apt_id_key UNIQUE, btree (apt_id, service_id)
 fastadder_fastadderstatus_agent_priority btree (agent_priority)
 fastadder_fastadderstatus_apt_id btree (apt_id)
 fastadder_fastadderstatus_built btree (built)
 fastadder_fastadderstatus_last_checked btree (last_checked)
 fastadder_fastadderstatus_last_validated btree (last_validated)
 fastadder_fastadderstatus_position_in_queue btree (position_in_queue)
 fastadder_fastadderstatus_priority btree (priority)
 fastadder_fastadderstatus_running_status btree (running_status)
 fastadder_fastadderstatus_service_id btree (service_id)


 Whoh! Hold on, here. That looks like *way* too many indexes.


I actually just added most of those yesterday in an attempt to improve
performance. priority and agent_priority were missing indexes and that was a
big mistake.

overall performance went way up on my primary selects


 Definitely will slow down your insert/update performance.


there are a lot more selects happening throughout the day


 The index on 'built' for example, is a boolean. If it's evenly distributed,
 that's 150k matches for true or false,


ok,

built True is in the minority.

here is the test query that caused me to add indices to the booleans.  this
is a 30k table which is doing selects on two booleans constantly.  again:
True is the minority

explain analyze SELECT nsproperties_apt.id,
nsproperties_apt.display_address, nsproperties_apt.apt_num,
nsproperties_apt.bldg_id, nsproperties_apt.is_rental,
nsproperties_apt.is_furnished, nsproperties_apt.listing_type,
nsproperties_apt.list_on_web, nsproperties_apt.is_approved,
nsproperties_apt.status, nsproperties_apt.headline,
nsproperties_apt.slug, nsproperties_apt.cross_street,
nsproperties_apt.show_apt_num, nsproperties_apt.show_building_name,
nsproperties_apt.external_url, nsproperties_apt.listed_on,
nsproperties_bldg.id, nsproperties_bldg.name FROM nsproperties_apt
LEFT OUTER JOIN nsproperties_bldg ON (nsproperties_apt.bldg_id =
nsproperties_bldg.id) WHERE (nsproperties_apt.list_on_web = True AND
nsproperties_apt.is_available = True ) ;
   QUERY PLAN


 Hash Left Join  (cost=408.74..10062.18 rows=3344 width=152) (actual time=12.688..2442.542 rows=2640 loops=1)
   Hash Cond: (nsproperties_apt.bldg_id = nsproperties_bldg.id)
   ->  Seq Scan on nsproperties_apt  (cost=0.00..9602.52 rows=3344 width=139) (actual time=0.025..2411.644 rows=2640 loops=1)
         Filter: (list_on_web AND is_available)
   ->  Hash  (cost=346.66..346.66 rows=4966 width=13) (actual time=12.646..12.646 rows=4966 loops=1)
         ->  Seq Scan on nsproperties_bldg  (cost=0.00..346.66 rows=4966 width=13) (actual time=0.036..8.236 rows=4966 loops=1)
 Total runtime: 2444.067 ms
(7 rows)

=

 Hash Left Join  (cost=1232.45..9784.18 rows=5690 width=173) (actual time=30.000..100.000 rows=5076 loops=1)
   Hash Cond: (nsproperties_apt.bldg_id = nsproperties_bldg.id)
   ->  Bitmap Heap Scan on nsproperties_apt  (cost=618.23..9075.84 rows=5690 width=157) (actual time=10.000..60.000 rows=5076 loops=1)
         Filter: (list_on_web AND is_available)
         ->  BitmapAnd  (cost=618.23..618.23 rows=5690 width=0) (actual time=10.000..10.000 rows=0 loops=1)
               ->  Bitmap Index Scan on nsproperties_apt_is_available  (cost=0.00..131.81 rows=6874 width=0) (actual time=0.000..0.000 rows

[PERFORM] Really really slow select count(*)

2011-02-04 Thread felix
reply was meant for the list

-- Forwarded message --
From: felix crucialfe...@gmail.com
Date: Fri, Feb 4, 2011 at 4:39 PM
Subject: Re: [PERFORM] Really really slow select count(*)
To: Greg Smith g...@2ndquadrant.com




On Fri, Feb 4, 2011 at 3:56 PM, Greg Smith g...@2ndquadrant.com wrote:

 PostgreSQL version?  If you're running on 8.3 or earlier, I would be
 suspicous that your Free Space Map has been overrun.


8.3




 What you are seeing is that the table itself is much larger on disk than
 it's supposed to be.


which part of the explain told you that ?

 shaun thomas

SELECT relpages*8/1024 FROM pg_class
 WHERE relname='fastadder_fastadderstatus';

458MB

way too big. build_cache is text between 500-1k chars




  That can be caused by frequent UPDATEs if you don't have vacuum cleanup
 working effectively, you'll get lots of dead sections left behind from
 UPDATEs in the middle.


ok, I just vacuumed it (did this manually a few times as well). and auto is
on.

still:
32840.000ms
and still 458MB



 The best way to fix all this is to run CLUSTER on the table.


http://www.postgresonline.com/journal/archives/10-How-does-CLUSTER-ON-improve-index-performance.html

now that would order the data on disk by id (primary key)
the usage of the table is either by a query or by position_in_queue which is
rewritten often (I might change this part of the app and pull it out of this
table)

is this definitely the best way to fix this ?

thanks for your help !


That will introduce a bit of downtime while it holds a lock on the table
 (only a few minutes based on what you've shown here), but the copy you'll
 have afterwards won't be spread all over disk anymore.

 --
 Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
 PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
 PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [PERFORM] Really really slow select count(*)

2011-02-04 Thread felix
On Fri, Feb 4, 2011 at 5:35 PM, Shaun Thomas stho...@peak6.com wrote:



 vacuumdb -a -v -z > vacuum.log

 And at the end of the log, it'll tell you how many pages it wants, and how
 many pages were available.


this is the dev, not live. but this is after it gets done with that table:

CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing public.fastadder_fastadderstatus
INFO:  fastadder_fastadderstatus: scanned 2492 of 2492 pages, containing
154378 live rows and 0 dead rows; 3 rows in sample, 154378 estimated
total rows

and there's nothing at the end of the whole vacuum output about pages

actual command:

vacuumdb -U postgres -W -v -z djns4 > vacuum.log

I tried it with all databases too

?

thanks


Re: [PERFORM] Really really slow select count(*)

2011-02-04 Thread felix
vacuumdb -a -v -z -U postgres -W > vacuum.log

that's all, isn't it ?

it did each db

8.3 in case that matters

the very end:

There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing public.seo_partnerlinkcategory
INFO:  seo_partnerlinkcategory: scanned 0 of 0 pages, containing 0 live
rows and 0 dead rows; 0 rows in sample, 0 estimated total rows



On Fri, Feb 4, 2011 at 6:40 PM, Scott Marlowe scott.marl...@gmail.com wrote:


  I tried it with all databases too

 I believe you have to run it on the whole db to get that output.



Re: [PERFORM] Really really slow select count(*)

2011-02-04 Thread felix
vacuumdb -a -v -z -U postgres -W > vacuum.log
Password:
Password:
Password:
Password:
Password:
Password:
Password:
Password:
Password:
Password:
Password:
cruxnu:nsbuildout crucial$

do you think its possible that it just doesn't have anything to complain
about ?
or the password is affecting it ?

In any case I'm not sure I want to run this even at night on production.

what is the downside to estimating max_fsm_pages too high ?

300 should be safe
its certainly not 150k

I have one very large table (10m) that is being analyzed before I warehouse
it.
that could've been the monster that ate the free map.
I think today I've learned that even unused tables affect postgres
performance.


and do you agree that I should turn CLUSTER ON ?
I have no problem to stop all tasks to this table at night and just reload
it



On Fri, Feb 4, 2011 at 6:47 PM, Shaun Thomas stho...@peak6.com wrote:

 On 02/04/2011 11:44 AM, felix wrote:

  the very end:

 There were 0 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  analyzing public.seo_partnerlinkcategory
 INFO: seo_partnerlinkcategory: scanned 0 of 0 pages, containing 0 live
 rows and 0 dead rows; 0 rows in sample, 0 estimated total rows


 That looks to me like it didn't finish. Did you fork it off with '&' or run
 it and wait until it gave control back to you?

 It really should be telling you how many pages it wanted, and are in use.
 If not, something odd is going on.


 --
 Shaun Thomas
 OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
 312-676-8870
 stho...@peak6.com

 __

 See  http://www.peak6.com/email_disclaimer.php
 for terms and conditions related to this email



Re: [PERFORM] Really really slow select count(*)

2011-02-04 Thread felix
On Fri, Feb 4, 2011 at 7:34 PM, Shaun Thomas stho...@peak6.com wrote:

 Why is it asking for the password over and over again? It shouldn't be
 doing that.


because I asked it to: -W
on the production server I need to enter password and I'm testing on dev
first.

I just sudo tried it but still no report


 and do you agree that I should turn CLUSTER ON ?


 Cluster isn't really something you turn on, but something you do.


djns4=# cluster fastadder_fastadderstatus;
ERROR:  there is no previously clustered index for table
fastadder_fastadderstatus

http://www.postgresonline.com/journal/archives/10-How-does-CLUSTER-ON-improve-index-performance.html

djns4=# alter table fastadder_fastadderstatus CLUSTER ON fastadder_fastadderstatus_pkey;
ALTER TABLE
djns4=# CLUSTER fastadder_fastadderstatus;
CLUSTER

ok, that's why I figured I was turning something on. the table has been
altered.

it will be pk ordered, new entries always at the end and no deletes

but this means I have to manually run cluster from time to time, right ? not
that there will be much or any reordering.  or it should be fine going
forward with vacuum and enlarging the free space memory map.



 It's like vacuum full, in that it basically rebuilds the table and all
 indexes from scratch. The major issue you'll run into is that it reorders
 the table by the index you chose, so you'd best select the primary key
 unless you have reasons to use something else. And you have to do it table
 by table, which will really suck since we already know your whole db has
 bloated, not just one or two tables.


do we know that ?  many of the tables are fairly static.

only this one is seriously borked, and yet other related tables seem to be
fine.




 You're going to be doing some scripting, buddy. :) Well, unless you just do
 a dump/restore and start over with sane postgresql.conf settings.


well who knew the defaults were unsane ? :)

scripting this is trivial, I already have the script

I have made the mistake of doing VACUUM FULL in the past. in fact on this
table, and it had to be killed because it took down my entire website !
 that may well be the major borking event. a credit to postgres that the
table still functions if that's the case.

scott marlowe:

begin;
 select * into temporaryholdingtable from oldtable order by somefield;
 truncate oldtable;
 insert into oldtable select * from temporaryholdingtable;
 commit;


that sounds like a good approach.

gentlemen, 300,000 + thanks for your generous time !
(a small number, I know)

-felix


Re: [PERFORM] Really really slow select count(*)

2011-02-04 Thread felix
ah right, duh.
yes, I did it as -U postgres, verified as a superuser

just now did it from inside psql as postgres

\c djns4
vacuum verbose analyze;

still no advice on the pages



On Fri, Feb 4, 2011 at 8:34 PM, Scott Marlowe scott.marl...@gmail.com wrote:

 On Fri, Feb 4, 2011 at 12:26 PM, felix crucialfe...@gmail.com wrote:
  I just sudo tried it but still no report

 It's not about who you are in Unix / Linux, it's about who you are in
 Postgresql.  \du will show you who is a superuser.  psql -U username
 will let you connect as that user.



Re: [PERFORM] Really really slow select count(*)

2011-02-04 Thread felix
it probably has good reason to hate me.



ns=> SELECT n.nspname AS schema_name, c.relname AS table_name,
ns->   c.reltuples AS row_count,
ns->   c.relpages*8/1024 AS mb_used,
ns->   pg_total_relation_size(c.oid)/1024/1024 AS total_mb_used
ns->  FROM pg_class c
ns->  JOIN pg_namespace n ON (n.oid=c.relnamespace)
ns->  WHERE c.relkind = 'r'
ns->  ORDER BY total_mb_used DESC
ns->  LIMIT 20;
 schema_name |table_name|  row_count  | mb_used |
total_mb_used
-+--+-+-+---
 public  | django_session   | 1.47843e+07 |4122 |
  18832
 public  | traffic_tracking2010 | 9.81985e+06 | 811 |
   1653
 public  | mailer_mailingmessagelog | 7.20214e+06 | 441 |
   1082
 public  | auth_user| 3.20077e+06 | 572 |
791
 public  | fastadder_fastadderstatus|  302479 | 458 |
693
 public  | registration_registrationprofile | 3.01345e+06 | 248 |
404
 public  | reporting_dp_6c93734c|  1.1741e+06 |  82 |
224
 public  | peoplez_contact  |   79759 |  18 |
221
 public  | traffic_tracking201101   | 1.49972e+06 | 163 |
204
 public  | reporting_dp_a3439e2a| 1.32739e+06 |  82 |
187
 public  | nsproperties_apthistory  |   44906 |  69 |
126
 public  | nsproperties_apt |   30780 |  71 |
125
 public  | clients_showingrequest   |   85175 |  77 |
103
 public  | reporting_dp_4ffe04ad|  330252 |  26 |
 63
 public  | fastadder_fastadderstatuslog |  377402 |  28 |
 60
 public  | nsmailings_officememotoagent |  268345 |  15 |
 52
 public  | celery_taskmeta  |5041 |  12 |
 32
 public  | mailer_messagelog|  168298 |  24 |
 32
 public  | datapoints_job   |9167 |  12 |
 23
 public  | fastadder_fastadderstatus_errors |  146314 |   7 |
 21

oh and there in the footnotes to django they say don't forget to run the
delete-expired-sessions management command every once in a while. thanks guys.

it won't run now because its too big, I can delete them from psql though
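
(The psql equivalent is roughly the following, assuming Django's default
session table layout with an expire_date column; note that plain VACUUM only
makes the space reusable, it does not shrink the file:

  DELETE FROM django_session WHERE expire_date < now();
  VACUUM ANALYZE django_session;
)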

well just think how sprightly my website will run tomorrow once I fix these.




On Fri, Feb 4, 2011 at 9:00 PM, Shaun Thomas stho...@peak6.com wrote:

 On 02/04/2011 01:59 PM, felix wrote:


  still no advice on the pages


 I think it just hates you.


 --
 Shaun Thomas
 OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
 312-676-8870
 stho...@peak6.com

 __

 See  http://www.peak6.com/email_disclaimer.php
 for terms and conditions related to this email



Re: [PERFORM] Update problem on large table

2010-12-06 Thread felix
thanks for the replies !,

but actually I did figure out how to kill it

but pg_cancel_backend didn't work.  here's some notes:

this has been hung for 5 days:
ns  |   32681 | nssql   | IDLE in transaction | f   | 2010-12-01
15

resulting in:  fastadder_fastadderstatus: scanned 3000 of 58551 pages,
containing 13587 live rows and 254709 dead rows;
and resulting in general pandemonium


you need to become the postgres superuser to use pg_cancel_backend:
 su postgres
 psql

and then:

select pg_cancel_backend(32681);

but this does not kill the IDLE in transaction processes.
it returns true, but its still there

from the linux shell I tried:

pg_ctl kill INT 32681

but it still will not die

the docs for pg_ctl state:
Use pg_ctl --help to see a list of supported signal names.

doing so does indeed tell me the names:

HUP INT QUIT ABRT TERM USR1 USR2

but nothing about them whatsoever :)

throwing caution to the wind:

pg_ctl kill TERM 32681

and that did it
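
(Worth noting for later readers: from 8.4 onward the same signal can be sent
from SQL, without shell access or pg_ctl:

  SELECT pg_terminate_backend(32681);  -- sends SIGTERM, like pg_ctl kill TERM
)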

ran VACUUM and now performance has returned to normal.

lessons learned.

I guess as Josh says, pg_cancel_backend is the same as SIGINT, which also
failed for me using pg_ctl.
not sure why.  the hung transaction was doing something like update table
set field = null where service_id = x



On Mon, Dec 6, 2010 at 9:26 PM, Kenneth Marshall k...@rice.edu wrote:

 On Mon, Dec 06, 2010 at 03:24:31PM -0500, Josh Kupershmidt wrote:
   On Mon, Dec 6, 2010 at 2:48 PM, Jon Nelson jnelson+pg...@jamponi.net wrote:
   On Mon, Dec 6, 2010 at 1:46 PM, bricklen brick...@gmail.com wrote:
   Not sure if anyone replied about killing your query, but you can do it
 like so:
  
    select pg_cancel_backend(5902);  -- assuming 5902 is the pid of the
   query you want canceled.
  
   How does this differ from just killing the pid?
 
  pg_cancel_backend(5902) does the same thing as:
kill -SIGINT 5902
 
  Josh
 

 Yes, but you can use it from within the database. The kill command
 requires shell access to the backend.

 Cheers,
 Ken



Re: [PERFORM] Update problem on large table

2010-12-04 Thread felix
Ok, I caught one : an update that is stuck in waiting.

the first one blocks the second one.

ns  |    5902 | nssql   | UPDATE fastadder_fastadderstatus SET built = false WHERE fastadder_fastadderstatus.service_id = 1
    | f | 2010-12-04 13:44:38.5228-05 | 127.0.0.1

ns  |    7000 | nssql   | UPDATE fastadder_fastadderstatus SET last_sent = E'2010-12-04 13:50:51.452800', sent = true WHERE (fastadder_fastadderstatus.built = true AND fastadder_fastadderstatus.service_id = 1)
    | t | 2010-12-04 13:50:51.4628-05 | 127.0.0.1

is it possible to release the lock and/or cancel the query ?  the process
that initiated the first one has long since ceased.
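
(One way to see who holds or waits for locks on that table, using the 8.3
column names procpid and current_query; a sketch, not from the original
thread:

  SELECT l.pid, l.mode, l.granted, a.current_query, a.query_start
  FROM pg_locks l
  JOIN pg_stat_activity a ON a.procpid = l.pid
  WHERE l.relation = 'fastadder_fastadderstatus'::regclass;
)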






On Fri, Nov 26, 2010 at 6:02 PM, bricklen brick...@gmail.com wrote:

 No problem!

 On Fri, Nov 26, 2010 at 8:34 AM, felix crucialfe...@gmail.com wrote:
  thanks !
  of course now, 2 hours later, the queries run fine.
  the first one was locked up for so long that I interrupted it.
  maybe that caused it to get blocked
  saved your query for future reference, thanks again !
  On Fri, Nov 26, 2010 at 5:00 PM, bricklen brick...@gmail.com wrote:
 
  On Fri, Nov 26, 2010 at 6:22 AM, felix crucialfe...@gmail.com wrote:
  
   Hello,
   I have a very large table that I'm not too fond of.  I'm revising the
   design
   now.
   Up until now its been insert only, storing tracking codes from
 incoming
   webtraffic.
   It has 8m rows
   It appears to insert fine, but simple updates using psql are hanging.
   update ONLY traffic_tracking2010 set src_content_type_id = 90 where id
 =
   90322;
   I am also now trying to remove the constraints, this also hangs.
   alter table traffic_tracking2010 drop constraint
   traffic_tracking2010_src_content_type_id_fkey;
   thanks in advance for any advice.
 
  Try your update or alter and in another session, run the following
  query and look at the waiting column. A true value means that that
  transaction is blocked.
 
  SELECT pg_stat_activity.datname, pg_stat_activity.procpid,
  pg_stat_activity.usename, pg_stat_activity.current_query,
  pg_stat_activity.waiting,
  pg_stat_activity.query_start,pg_stat_activity.client_addr
  FROM pg_stat_activity
   WHERE ((pg_stat_activity.procpid <> pg_backend_pid())
   AND (pg_stat_activity.current_query <> '<IDLE>'))
  ORDER BY pg_stat_activity.query_start;
 
  --
  Sent via pgsql-performance mailing list (
 pgsql-performance@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-performance
 
 



[PERFORM] Update problem on large table

2010-11-26 Thread felix
Hello,
I have a very large table that I'm not too fond of.  I'm revising the design
now.

Up until now it's been insert only, storing tracking codes from incoming
webtraffic.

It has 8m rows
It appears to insert fine, but simple updates using psql are hanging.

update ONLY traffic_tracking2010 set src_content_type_id = 90 where id =
90322;

I am also now trying to remove the constraints, this also hangs.

alter table traffic_tracking2010 drop constraint
traffic_tracking2010_src_content_type_id_fkey;
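
For what it's worth, here is roughly how I have been checking from another
session whether these statements are just waiting on a lock rather than doing
work (a sketch; column names as in 8.x):

-- backends currently blocked on something
SELECT procpid, waiting, current_query
FROM pg_stat_activity
WHERE waiting;

-- lock requests that have not been granted yet
SELECT locktype, relation::regclass, mode, pid
FROM pg_locks
WHERE NOT granted;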

thanks in advance for any advice.


                Table "public.traffic_tracking2010"
       Column        |           Type           | Modifiers
---------------------+--------------------------+-----------
 id  | integer  | not null default
nextval('traffic_tracking2010_id_seq'::regclass)
 action_time | timestamp with time zone | not null
 user_id | integer  |
 content_type_id | integer  |
 object_id   | integer  |
 action_type | smallint | not null
 src_type| smallint |
 src_content_type_id | integer  |
 src_object_id   | integer  |
 http_referrer   | character varying(100)   |
 search_term | character varying(50)|
 remote_addr | inet | not null
Indexes:
traffic_tracking2010_pkey PRIMARY KEY, btree (id)
traffic_tracking2010_content_type_id btree (content_type_id)
traffic_tracking2010_src_content_type_id btree (src_content_type_id)
traffic_tracking2010_user_id btree (user_id)
Foreign-key constraints:
traffic_tracking2010_content_type_id_fkey FOREIGN KEY
(content_type_id) REFERENCES django_content_type(id) DEFERRABLE INITIALLY
DEFERRED
traffic_tracking2010_src_content_type_id_fkey FOREIGN KEY
(src_content_type_id) REFERENCES django_content_type(id) DEFERRABLE
INITIALLY DEFERRED
traffic_tracking2010_user_id_fkey FOREIGN KEY (user_id) REFERENCES
auth_user(id) DEFERRABLE INITIALLY DEFERRED


This is generated by Django's ORM.

The hang may be due to having other clients connected, though I have tried doing
the update when I know all tracking inserts are stopped.
But the other client (the webapp) is still connected.
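
One thing I want to rule out is an idle-in-transaction connection from the
webapp holding a lock on the table. A sketch of that check (in these versions
such backends show up with current_query = '<IDLE> in transaction'):

SELECT procpid, usename, query_start
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY query_start;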

based on this:
http://postgresql.1045698.n5.nabble.com/slow-full-table-update-td2070754.html

ns=> ANALYZE traffic_tracking2010;
ANALYZE
ns=> SELECT relpages, reltuples FROM pg_class WHERE relname =
'traffic_tracking2010';
 relpages |  reltuples
--+-
99037 | 8.38355e+06
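
For scale, 99037 pages at the default 8 kB block size is roughly 774 MB of
heap. A sketch of the equivalent size check (pg_total_relation_size() also
counts indexes and TOAST):

SELECT pg_size_pretty(pg_relation_size('traffic_tracking2010')) AS heap,
       pg_size_pretty(pg_total_relation_size('traffic_tracking2010')) AS total;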

and I did vacuum it

vacuum verbose traffic_tracking2010;
INFO:  vacuuming public.traffic_tracking2010
INFO:  scanned index traffic_tracking2010_pkey to remove 1057 row versions
DETAIL:  CPU 0.09s/0.37u sec elapsed 10.70 sec.
INFO:  scanned index traffic_tracking2010_user_id to remove 1057 row
versions
DETAIL:  CPU 0.12s/0.30u sec elapsed 13.53 sec.
INFO:  scanned index traffic_tracking2010_content_type_id to remove 1057
row versions
DETAIL:  CPU 0.11s/0.28u sec elapsed 13.99 sec.
INFO:  scanned index traffic_tracking2010_src_content_type_id to remove
1057 row versions
DETAIL:  CPU 0.09s/0.26u sec elapsed 15.57 sec.
INFO:  traffic_tracking2010: removed 1057 row versions in 535 pages
DETAIL:  CPU 0.01s/0.02u sec elapsed 2.83 sec.
INFO:  index traffic_tracking2010_pkey now contains 8315147 row versions
in 22787 pages
DETAIL:  1057 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index traffic_tracking2010_user_id now contains 8315147 row
versions in 29006 pages
DETAIL:  1057 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index traffic_tracking2010_content_type_id now contains 8315147 row
versions in 28980 pages
DETAIL:  1057 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index traffic_tracking2010_src_content_type_id now contains 8315147
row versions in 28978 pages
DETAIL:  1057 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  traffic_tracking2010: found 336 removable, 8315147 nonremovable row
versions in 99035 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
25953 pages contain useful free space.
0 pages are entirely empty.
CPU 0.78s/1.49u sec elapsed 100.43 sec.
INFO:  vacuuming pg_toast.pg_toast_165961
INFO:  index pg_toast_165961_index now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  pg_toast_165961: found 0 removable, 0 nonremovable row 

Re: [PERFORM] Unsubscribe

2006-10-03 Thread felix
On Mon, Oct 02, 2006 at 01:36:17PM -0400, uwcssa wrote:
  Please unsubscribe me!  Thank you!
 
 
 Also, it would be better to have a message footer saying how to unsubscribe.

Will this do?  It's too big for a footer.

Here's how to unsubscribe:

First, ask your Internet Provider to mail you an Unsubscribing Kit.
Then follow these directions.

The kit will most likely be the standard no-fault type. Depending on
requirements, System A and/or System B can be used. When operating
System A, depress lever and a plastic dalkron unsubscriber will be
dispensed through the slot immediately underneath. When you have
fastened the adhesive lip, attach connection marked by the large X
outlet hose. Twist the silver-coloured ring one inch below the
connection point until you feel it lock.

The kit is now ready for use. The Cin-Eliminator is activated by the
small switch on the lip.  When securing, twist the ring back to its
initial condition, so that the two orange lines meet.  Disconnect.
Place the dalkron unsubscriber in the vacuum receptacle to the rear.
Activate by pressing the blue button.

The controls for System B are located on the opposite side. The red
release switch places the Cin-Eliminator into position; it can be
adjusted manually up or down by pressing the blue manual release
button. The opening is self-adjusting. To secure after use, press the
green button, which simultaneously activates the evaporator and
returns the Cin-Eliminator to its storage position.

You may log off if the green exit light is on over the evaporator.  If
the red light is illuminated, one of the Cin-Eliminator requirements
has not been properly implemented. Press the List Guy call button on
the right of the evaporator. He will secure all facilities from his
control panel.

To use the Auto-Unsub, first undress and place all your clothes in the
clothes rack. Put on the velcro slippers located in the cabinet
immediately below. Enter the shower, taking the entire kit with
you. On the control panel to your upper right upon entering you will
see a Shower seal button. Press to activate. A green light will then
be illuminated immediately below. On the intensity knob, select the
desired setting. Now depress the Auto-Unsub activation lever. Bathe
normally.

The Auto-Unsub will automatically go off after three minutes unless
you activate the Manual off override switch by flipping it up. When
you are ready to leave, press the blue Shower seal release
button. The door will open and you may leave. Please remove the velcro
slippers and place them in their container.

If you prefer the ultrasonic log-off mode, press the indicated blue
button. When the twin panels open, pull forward by rings A & B. The
knob to the left, just below the blue light, has three settings, low,
medium or high. For normal use, the medium setting is suggested.

After these settings have been made, you can activate the device by
switching to the ON position the clearly marked red switch. If
during the unsubscribing operation you wish to change the settings,
place the manual off override switch in the OFF position. You may
now make the change and repeat the cycle. When the green exit light
goes on, you may log off and have lunch. Please close the door behind
you.


-- 
... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
 Felix Finch: scarecrow repairman  rocket surgeon / [EMAIL PROTECTED]
  GPG = E987 4493 C860 246C 3B1E  6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match