Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Shridhar Daithankar
Dror Matalon wrote: I smell a religious war in the aii:-). Can you go several days in a row without doing select count(*) on any of your tables? I suspect that this is somewhat a domain specific issue. In some areas you don't need to know the total number of rows in your tables, in others you d

Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Gregory S. Williamson
I can tell you that this is one of the first thing applications' programmers and IT managers notice. It can slightly tarnish postgres' image when it takes it many long seconds to do what other databases can do in a snap. The "whys and wherefores" can be hard to get across once they see the comp

Re: [PERFORM] Is This My Speed Limit?

2003-10-02 Thread Tomasz Myrta
> Hi! > > It's just my curiosity. I wonder if there is any way to break my speed > limit on AMD 450Mhz: > Hash Join (cost=189.79..1508.67 rows=11203 width=48) (actual > time=129.20..1780.53 rows=9912 loops=1) > Hash Join (cost=208.74..1751.68 rows=11203 width=58) (actual > time=135.87..111

Re: [PERFORM] TPC-R benchmarks

2003-10-02 Thread Josh Berkus
Oleg, > I have another question. How do I optimize my indexes for the query that > contains a lot of ORed blocks, each of which contains a bunch of ANDed > expressions? The structure of each ORed block is the same except the > right-hand-side values vary. Given the example, I'd do a multicolumn i

Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Greg Stark
Christopher Browne <[EMAIL PROTECTED]> writes: > It would be very hairy to implement it correctly, and all this would > cover is the single case of "SELECT COUNT(*) FROM SOME_TABLE;" > > If you had a single WHERE clause attached, you would have to revert to > walking through the tuples looking f

Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Dror Matalon
I smell a religious war in the aii:-). Can you go several days in a row without doing select count(*) on any of your tables? I suspect that this is somewhat a domain specific issue. In some areas you don't need to know the total number of rows in your tables, in others you do. I also suspect

Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Christopher Browne
The world rejoiced as [EMAIL PROTECTED] (Dror Matalon) wrote: > I don't have an opinion on how hard it would be to implement the > tracking in the indexes, but "select count(*) from some table" is, in my > experience, a query that people tend to run quite often. > One of the databases that I've us

[PERFORM] Is This My Speed Limit?

2003-10-02 Thread CN
Hi! It's just my curiosity. I wonder if there is any way to break my speed limit on AMD 450Mhz: Best Regards, CN --- --This table contains 1036 rows. CREATE TABLE table1 ( c1 VARCHAR(20) PRIMARY KEY, c2 "char" )WITHOUT OIDS; - --This table contains 9429 rows.

Re: [PERFORM] runtime of the same query in function differs on 2 degree!

2003-10-02 Thread Gaetano Mendola
Andriy Tkachuk wrote: Hi folks. What's wrong with planner that executes my query in function?: (i mean no explanation but runtime) tele=# EXPLAIN analyze select calc_total(6916799, 1062363600, 1064955599); QUERY PLAN

Re: [PERFORM] further testing on IDE drives

2003-10-02 Thread scott.marlowe
On Thu, 2 Oct 2003, scott.marlowe wrote: > I was testing to get some idea of how to speed up the speed of pgbench > with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 > /dev/hdx). > > The only parameter that seems to make a noticeable difference was setting > wal_sync_m

Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Dror Matalon
I don't have an opinion on how hard it would be to implement the tracking in the indexes, but "select count(*) from some table" is, in my experience, a query that people tend to run quite often. One of the databases that I've used, I believe it was Informix, had that info cached so that it always

Re: [PERFORM] TPC-R benchmarks

2003-10-02 Thread Oleg Lebedev
Thanks everyone for the help. I have another question. How do I optimize my indexes for the query that contains a lot of ORed blocks, each of which contains a bunch of ANDed expressions? The structure of each ORed block is the same except the right-hand-side values vary. The first expression of e

Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Christopher Browne
[EMAIL PROTECTED] (Jean-Luc Lachance) writes: > That's one of the draw back of MVCC. > I once suggested that the transaction number and other house keeping > info be included in the index, but was told to forget it... > It would solve once and for all the issue of seq_scan vs index_scan. > It wou

Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Jean-Luc Lachance
That's one of the draw back of MVCC. I once suggested that the transaction number and other house keeping info be included in the index, but was told to forget it... It would solve once and for all the issue of seq_scan vs index_scan. It would simplify the aggregate problem. Bruno Wolff III wro

Thanks - Re: [PERFORM] low cardinality column

2003-10-02 Thread Rong Wu
Thanks, Rod, Josh and Bill, That' fantastic. have a nice day, rong :-) > Rod Taylor wrote: >> On Thu, 2003-10-02 at 14:30, Rong Wu wrote: >> >>>Hi, >>> >>>I have a select like this: >>> >>>SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0; >> >> >> For various reasons (prima

Re: [PERFORM] TPC-R benchmarks

2003-10-02 Thread scott.marlowe
On Thu, 2 Oct 2003, Oleg Lebedev wrote: > I was trying to get the pg_stats information to Josh and decided to > recreate the indexes on all my tables. After that I ran vacuum full > analyze, re-enabled nestloop and ran explain analyze on the query. It > ran in about 2 minutes. > I attached the new

Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Bruno Wolff III
On Thu, Oct 02, 2003 at 12:46:45 -0700, Dror Matalon <[EMAIL PROTECTED]> wrote: Please keep replies copied to the list. > When would it happen that a tuple be invisible to the current > transaction? Are we talking about permissions? They could be tuples that were changed by a transaction that

[PERFORM] further testing on IDE drives

2003-10-02 Thread scott.marlowe
I was testing to get some idea of how to speed up the speed of pgbench with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 /dev/hdx). The only parameter that seems to make a noticeable difference was setting wal_sync_method = open_sync. With it set to either fsync, or fd

Re: [PERFORM] TPC-R benchmarks

2003-10-02 Thread Oleg Lebedev
I was trying to get the pg_stats information to Josh and decided to recreate the indexes on all my tables. After that I ran vacuum full analyze, re-enabled nestloop and ran explain analyze on the query. It ran in about 2 minutes. I attached the new query plan. I am not sure what did the trick, but

Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Tomasz Myrta
Hi, I have a somewhat large table, 3 million rows, 1 Gig on disk, and growing. Doing a count(*) takes around 40 seconds. Looks like the count(*) fetches the table from disk and goes through it. Made me wonder, why the optimizer doesn't just choose the smallest index which in my case is around 60

Re: [PERFORM] count(*) slow on large tables

2003-10-02 Thread Bruno Wolff III
On Thu, Oct 02, 2003 at 12:15:47 -0700, Dror Matalon <[EMAIL PROTECTED]> wrote: > Hi, > > I have a somewhat large table, 3 million rows, 1 Gig on disk, and growing. Doing a > count(*) takes around 40 seconds. > > Looks like the count(*) fetches the table from disk and goes through it. > Made m

[PERFORM] count(*) slow on large tables

2003-10-02 Thread Dror Matalon
Hi, I have a somewhat large table, 3 million rows, 1 Gig on disk, and growing. Doing a count(*) takes around 40 seconds. Looks like the count(*) fetches the table from disk and goes through it. Made me wonder, why the optimizer doesn't just choose the smallest index which in my case is around 60

Re: [PERFORM] low cardinality column

2003-10-02 Thread Bill Moran
Rod Taylor wrote: On Thu, 2003-10-02 at 14:30, Rong Wu wrote: Hi, I have a select like this: SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0; For various reasons (primarily MVCC and the ability to make custom aggregates making it difficult) MAX() is not optimized in th

Re: [PERFORM] low cardinality column

2003-10-02 Thread Rod Taylor
On Thu, 2003-10-02 at 14:30, Rong Wu wrote: > Hi, > > I have a select like this: > > SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0; For various reasons (primarily MVCC and the ability to make custom aggregates making it difficult) MAX() is not optimized in this fashion

Re: [PERFORM] low cardinality column

2003-10-02 Thread Josh Berkus
Rong, > I have a select like this: > > SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0; Simple workaround: Create an mulit-column index on transactiontypeid, transactionid. SELECT transactionid FROM cbtransaction WHERE transactiontypeid=0 ORDER BY transactionid DESC LI

[PERFORM] low cardinality column

2003-10-02 Thread Rong Wu
Hi, I have a select like this: SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0; in the query: transactionid is the primary key of cbntransaction table, But transactiontypeid is a low cardinality column, there're over 100,000 records has the same trnsactiontypeid. I was tr

Re: [PERFORM] TPC-R benchmarks

2003-10-02 Thread scott.marlowe
Have you tried increasing the statistics target for those columns that are getting bad estimates yet and then turning back on enable_nestloop and rerunning analyze and seeing how the query does? The idea being to try and get a good enough estimate of your statistics so the planner stops using

Re: [PERFORM] TPC-R benchmarks

2003-10-02 Thread Josh Berkus
Oleg, > I ran VACUUM FULL ANALYZE yesterday and the re-ran the query with > EXPLAIN ANALYZE. > I got the same query plan and execution time. How about my question? Those rows from pg_stats would be really useful in diagnosing the problem. -- Josh Berkus Aglio Database Solutions San Francisco

Re: [PERFORM] TPC-R benchmarks

2003-10-02 Thread Oleg Lebedev
As Scott recommended, I did the following: # set enable_nestloop = false; # vacuum full analyze; After this I re-ran the query and its execution time went down from 2 hours to 2 minutes. I attached the new query plan to this posting. Is there any way to optimize it even further? What should I do t

Re: [PERFORM] runtime of the same query in function differs on 2 degree!

2003-10-02 Thread Tom Lane
Andriy Tkachuk <[EMAIL PROTECTED]> writes: > What's wrong with planner that executes my query in function?: > tele=# EXPLAIN analyze select sum(cost) from bills where (parent(user_id) = 6916799 > or user_id = 6916799) and dat >= 1062363600 and dat < 10649555 > 99; In the function case, the plann

[PERFORM] runtime of the same query in function differs on 2 degree!

2003-10-02 Thread Andriy Tkachuk
Hi folks. What's wrong with planner that executes my query in function?: (i mean no explanation but runtime) tele=# EXPLAIN analyze select calc_total(6916799, 1062363600, 1064955599); QUERY PLAN -

Re: [PERFORM] advice on raid controller

2003-10-02 Thread Vivek Khera
> "RJ" == Richard Jones <[EMAIL PROTECTED]> writes: RJ> Hi, i'm on the verge of buying a "MegaRAID SCSI 320-2" raid controller. RJ> I need it to build a db server using 4x ultra320 scsi disks RJ> i'm thinking raid 1+0 but will try with raid5 too and compare No specific tips on that particular

Re: [PERFORM] TPC-R benchmarks

2003-10-02 Thread Oleg Lebedev
I ran VACUUM FULL ANALYZE yesterday and the re-ran the query with EXPLAIN ANALYZE. I got the same query plan and execution time. -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 01, 2003 4:20 PM To: Oleg Lebedev Cc: Josh Berkus; scott.marlowe; [EMAIL P

Re: [PERFORM] advice on raid controller

2003-10-02 Thread Vivek Khera
> "PG" == Palle Girgensohn <[EMAIL PROTECTED]> writes: PG> Come to think of it, I guess a battery-backed cache will make fsync as PG> fast as no fsync, right? So, the q was kinda stoopid... :-/ In my testing, yes, the battery cache makes fsync=true just about as fast as fsync=false. it was o

[PERFORM] basket, eggs & NAS (was eggs Re: [NOVICE] Ideal Hardware?)

2003-10-02 Thread Ron Johnson
On Wed, 2003-10-01 at 10:13, Jason Hihn wrote: > We have an opportunity to purchase a new, top-notch database server. I am > wondering what kind of hardware is recommended? We're on Linux platforms and > kernels though. I remember a comment from Tom about how he was spending a > lot of time debuggi

Re: [PERFORM] inferior SCSI performance

2003-10-02 Thread Ang Chin Han
Andrew Sullivan wrote: Yes. If and only if you have a battery-backed cache. I know of no IDE drives that have that, but there's nothing about the spec which makes it impossible. http://www.ussg.iu.edu/hypermail/linux/kernel/0103.0/1084.html Relevant section: Maybe that is why there is a vender

Re: [PERFORM] inferior SCSI performance

2003-10-02 Thread Greg Stark
> > Unfortunately, while there are companies hawking SSDs, they are in the > > "you'll have to talk to our salescritter for pricing" category, which > > means that they must be ferociously expensive. :-(. > > the cheapest I found was the one with external backup power was ~1.8k$ > for 2GB PCI de

Re: [PERFORM] inferior SCSI performance

2003-10-02 Thread Hannu Krosing
Christopher Browne kirjutas K, 01.10.2003 kell 19:21: > > The FS-related result appeared surprising, as the "stories" I had > heard suggested that JFS hadn't been particularly heavily tuned on > Linux, whereas XFS was supposed to be the "speed demon." Gentoo linux recommends XFS only for SAN+fib