Re: [PERFORM] SAN vs Internal Disks

2007-09-06 Thread Joel Fradkin
I am not sure I agree with that evaluation.
I only have two Dell database servers and they have been 100% reliable.
Maybe he is referring to support, which does tend to be up to who you get.
When I asked about performance on my new server they were very helpful, but I
did have a bad time on my NAS device (I had the really cheap support plan on
it). They did help me get it fixed, but I had to RMA all the drives on the NAS
as they were all bad, and it was no fun installing the OS as it had no floppy.
I got the better support for both the database servers, which use a JBOD from
Dell for the disk array. The quad-proc Opteron with dual cores and 16 GB of
memory has been extremely fast (roughly 70% faster) compared to my older
4-proc, 32-bit, single-core machine with 8 GB. Both are running Postgres and
perform the needed functionality. I would like to have redundant backups of
these as they are mission critical, but all in good time.

I'd recommend against Dell unless you're at a company that orders
computers by the hundred lot.  My experience with Dell has been that
unless you are a big customer you're just another number (a small one
at that) on a spreadsheet.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] Dell Hardware Recommendations

2007-08-10 Thread Joel Fradkin
I know we bought the 4-proc Opteron unit with the SAS JBOD from Dell, and it
has been excellent in terms of performance.

It was about three times faster than our old Dell 4-proc, which had Xeon
processors.

The newer one has had a few issues (I am running Red Hat AS4 since Dell
supports it). I have had one kernel failure (but it has been up for about a
year). Other than that, no issues; a reboot fixed whatever caused the
failure, I have not seen it happen again, and it has been a few months.

I am definitely going Dell for any other server needs; their pricing is so
competitive now, and the machines I bought, both the 1U 2-proc and the larger
4-proc, have been very good.

Joel Fradkin

 

Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305

 

[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Merlin Moncure
Sent: Friday, August 10, 2007 1:31 PM
To: Arjen van der Meijden
Cc: Joe Uhl; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Dell Hardware Recommendations

On 8/9/07, Arjen van der Meijden <[EMAIL PROTECTED]> wrote:
> On 9-8-2007 23:50 Merlin Moncure wrote:
> > Where the extra controller especially pays off is if you have to
> > expand to a second tray.  It's easy to add trays but installing
> > controllers on a production server is scary.
>
> For connectivity-sake that's not a necessity. You can either connect
> (two?) extra MD1000's to your first MD1000 or you can use the second
> external SAS-port on your controller. Obviously it depends on the
> controller whether its good enough to just add the disks to it, rather
> than adding another controller for the second tray. Whether the perc5/e
> is good enough for that, I don't know, we've only equipped ours with a
> single MD1000 holding 15x 15k rpm drives, but in our benchmarks it
> scaled pretty well going from a few to all 14 disks (+1 hotspare).

As it happens I will have an opportunity to test the dual controller
theory.   In about a week we are picking up another md1000 and will
attach it in an active/active configuration with various
hardware/software RAID configurations, and run a battery of database
centric tests.  Results will follow.

By the way, the recent Dell servers I have seen are well built in my
opinion...better and cheaper than comparable IBM servers.  I've also
tested the IBM exp3000, and the MD1000 is cheaper and comes standard
with a second ESM.  In my opinion, the Dell 1U 1950 is extremely well
organized in terms of layout and cooling...dual power supplies, dual
PCI-E (one low profile), plus a third custom slot for the optional
perc 5/i which drives the backplane.

merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org



Re: [PERFORM] Massive performance issues

2005-09-01 Thread Joel Fradkin
Any chance it's a vacuum thing?
Or configuration (out of the box it needs adjusting)?
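If it helps, a minimal sketch of the kind of check I mean (the table name is
just a placeholder for the table in question):

-- Rebuild planner statistics and reclaim dead rows.
VACUUM ANALYZE address;

-- See how big the planner thinks the table is; badly out-of-date numbers
-- here usually mean the table has not been vacuumed/analyzed recently.
SELECT relname, reltuples, relpages FROM pg_class WHERE relname = 'address';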

Joel Fradkin
 
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Merlin Moncure
Sent: Thursday, September 01, 2005 2:11 PM
To: Matthew Sackman
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Massive performance issues

> I'm having performance issues with a table consisting of 2,043,133
rows.
> The
> schema is:

> locality_1 has 16650 distinct values and locality_2 has 1156 distinct
> values.

Just so you know, I have a 2GHz P4 workstation with a similar-size table (2M
rows) and several keys, and I can find and fetch 2k rows based on a key with
20k unique values in about 60 ms (0.06 seconds).
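For anyone wanting to reproduce that kind of comparison, a minimal sketch
(table, column, and value are made up; the point is that the lookup should be
an index scan, not a sequential scan):

CREATE INDEX idx_bigtable_keycol ON bigtable (keycol);
ANALYZE bigtable;

-- Time the fetch of a couple of thousand rows for one key value.
EXPLAIN ANALYZE SELECT * FROM bigtable WHERE keycol = 12345;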

Merlin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-06-06 Thread Joel Fradkin
I did my own evaluation a few months back, because Postgres was not cutting
it for me.
I found that Postgres 8.0 (what I was using at the time; now on 8.0.2)
outperformed MySQL on an Optiplex with 2 GB of memory. I had Postgres and
MySQL loaded and would run one server at a time while testing.
My tests included using Aqua Studio connections to both databases and an .asp
page using ODBC connections. There was not a huge difference, but I had
significant time invested in Postgres and it was a little faster, so I just
took new approaches (flattened views, eliminated outer joins, etc.) to fixing
the issues.
 
Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jan Wieck
Sent: Monday, June 06, 2005 1:55 PM
To: PFC
Cc: Amit V Shah; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Need help to decide Mysql vs Postgres

On 6/6/2005 2:12 PM, PFC wrote:

> 
>> Please pardon my ignorance, but from whatever I had heard, mysql was
>> supposedly always faster than postgres  Thats why I was so surprised

>> !!
> 
>   I heard a lot of this too, so much it seems common wisdom that postgres
> is slow... well maybe some old version was, but it's getting better at
> every release, and the 8.0 really delivers...

The harder it is to evaluate software, the less often people reevaluate 
it and the more often people just "copy" opinions instead of doing an 
evaluation at all.

Today there are a gazillion people out there who "know" that MySQL is 
faster than PostgreSQL. They don't know under what circumstances it is, 
or what the word "circumstances" means in this context anyway. When you 
ask them when was the last time they actually tested this you get in 
about 99% of the cases an answer anywhere between 3 years and infinity 
(for all those who never did). The remaining 1% can then be reduced to 
an insignificant minority by asking how many concurrent users their test 
simulated.


Jan

-- 

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]




Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Joel Fradkin
I took an unbiased look and did some tests.
Objectively, for me MySQL was not an improvement in speed.
I had read the benchmarks in PC Magazine from a while back as well.

I did some tests using ODBC and .NET connections, and also used Aqua Studio
(it hooks up to both databases), and found Postgres a bit faster.

I did spend more time getting a feel for setup on Postgres, but I was at a
point of desperation as some queries were still too slow on Postgres.

I ended up re-engineering my app to use simpler (flattened) data sets.
I still have a few I am working through, but all in all it is running better
than when I was on MSSQL, and MySQL was just slower on the tests I did.

I loaded both MySQL and Postgres on both my 4-processor Dell running Red Hat
AS3 and an Optiplex running Windows XP.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Amit V Shah
Sent: Tuesday, May 24, 2005 12:22 PM
To: 'Joshua D. Drake'
Cc: 'pgsql-performance@postgresql.org'
Subject: Re: [PERFORM] Need help to decide Mysql vs Postgres

Hi Josh,

Thanks for the prompt reply!! Actually, migration is inevitable. We have a
totally messed up schema, not normalized and stuff like that. So the goal of
the migration is to get a new and better-normalized schema. That part is done
already. Now the decision point is: should we go with Postgres or MySQL?

Thanks,
Amit
 
-Original Message-
From: Joshua D. Drake [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 24, 2005 1:15 PM
To: Amit V Shah
Cc: 'pgsql-performance@postgresql.org'
Subject: Re: [PERFORM] Need help to decide Mysql vs Postgres


> 
> I am not trying to start a mysql vs postgres war so please dont
> misunderstand me  I tried to look around for mysql vs postgres articles,
> but most of them said mysql is better in speed. However those articles were
> very old so I dont know about recent stage. Please comment !!!

It is my experience that MySQL is faster under smaller load scenarios.
Say 5 - 10 connections only doing simple SELECTs, e.g., a dynamic website.

It is also my experience that PostgreSQL is faster and more stable under
consistent and heavy load. I have customers who regularly are using up
to 500 connections.

Note that a lot of this depends on how your database is designed. Foreign
keys slow things down.

I think it would be important for you to look at your overall goal of
migration. MySQL is really not a bad product "IF" you are willing to
work within its limitations.

PostgreSQL is a real RDBMS; it is like Oracle or DB2 and comes with a
comparable feature set. Only you can decide if that is what you need.

Sincerely,

Joshua D. Drake
Command Prompt, Inc.


-- 
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] cant seem to post the explain

2005-05-20 Thread Joel Fradkin
Sorry, I tried a few times to break this email up (I guess there must be a
size limit?).

Anyone interested in seeing the explain for the "speed of many group by's"
question, just email me.

 

Basically the SQL is built by a dynamic cube product from Data Dynamics.

I can edit it prior to it running, but it runs very slowly even off the
flattened file.

I am guessing the product needs the data from the SQL I supplied in the
previous post.

 

I don't have any ideas to speed it up, as I can't store an aggregate flat
table without updating it when updates and inserts are made, and that would
be too time-consuming.

Any ideas for how to approach getting the same data set in a faster manner
are greatly appreciated.
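One pattern that gets suggested for this kind of problem is a summary table
kept current by a trigger, so reports read pre-aggregated rows instead of
grouping the whole view. The sketch below is only an illustration with a
hypothetical detail table (tblauditdetail) and just a couple of the grouping
columns; it assumes the plpgsql language is installed, and updates/deletes
would need similar handling:

CREATE TABLE audit_summary (
    clientnum text,
    audit     integer,
    store     integer,
    points_available numeric DEFAULT 0,
    points_scored    numeric DEFAULT 0,
    PRIMARY KEY (clientnum, audit, store)
);

-- Fold each new detail row into the summary (insert-only case shown).
CREATE OR REPLACE FUNCTION audit_summary_add() RETURNS trigger AS $$
BEGIN
    UPDATE audit_summary
       SET points_available = points_available + NEW.questionpointsavailable,
           points_scored    = points_scored    + NEW.pointsscored
     WHERE clientnum = NEW.clientnum AND audit = NEW.audit AND store = NEW.store;
    IF NOT FOUND THEN
        INSERT INTO audit_summary (clientnum, audit, store, points_available, points_scored)
        VALUES (NEW.clientnum, NEW.audit, NEW.store,
                NEW.questionpointsavailable, NEW.pointsscored);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_audit_summary AFTER INSERT ON tblauditdetail
    FOR EACH ROW EXECUTE PROCEDURE audit_summary_add();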

 

Joel Fradkin

[PERFORM] performance on a querry with many group by's any way to speed it up?

2005-05-20 Thread Joel Fradkin
explain analyze
SELECT audit, store, question, month, year, week, weekday, myaudittotalscore,
       active, auditnum, answer, quarter, y_n, region, district, audittype,
       status, keyedby, questiondisplay, qtext, qdescr, answerdisplay,
       answertext, customauditnum, dateaudittaken, datecompleted,
       dateauditkeyed, datekeyingcomplete, section, createdby, division,
       auditscoredesc, locationnum, text_response,
       Sum(questionpointsavailable), Sum(pointsscored)
FROM viwAuditCube
WHERE clientnum = 'RSI'
GROUP BY audit, store, question, month, year, week, weekday, myaudittotalscore,
       active, auditnum, answer, quarter, y_n, region, district, audittype,
       status, keyedby, questiondisplay, qtext, qdescr, answerdisplay,
       answertext, customauditnum, dateaudittaken, datecompleted,
       dateauditkeyed, datekeyingcomplete, section, createdby, division,
       auditscoredesc, locationnum, text_response
ORDER BY audit, store, question, month, year, week, weekday, myaudittotalscore,
       active, auditnum, answer, quarter, y_n, region, district, audittype,
       status, keyedby, questiondisplay, qtext, qdescr, answerdisplay,
       answertext, customauditnum, dateaudittaken, datecompleted,
       dateauditkeyed, datekeyingcomplete, section, createdby, division,
       auditscoredesc, locationnum, text_response

 

 

Joel Fradkin
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the
intended recipient, please contact the sender by reply email and delete and
destroy all copies of the original message, including attachments.
Re: [PERFORM] Prefetch

2005-05-16 Thread Joel Fradkin
My only comment is: what is the layout of your data (just one table with
indexes?)?
I found that on my data with dozens of joins my view speed was not good
enough for me to use, so I made a flat file with no joins and it flies.
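For anyone curious what the flat file looks like in practice, it is just the
join materialized once into a real table and indexed on the common filter
column. A minimal sketch (the exact definition here is a guess; the names
mirror the tblviwauditcube table from my flattened-file explain output):

CREATE TABLE tblviwauditcube AS SELECT * FROM viwauditcube;
CREATE INDEX ix_tblviwauditcube_clientnum ON tblviwauditcube (clientnum);
ANALYZE tblviwauditcube;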

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Matt Olson
Sent: Monday, May 09, 2005 9:10 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Prefetch

I wanted to get some opinions about row prefetching.  AFAIK, there is no
prefetching done by PostgreSQL; all prefetching is delegated to the operating
system.

The hardware (can't say enough good things about it):

Athlon 64, dual channel
4GB ram
240GB usable 4 disk raid5 (ATA133)
Fedora Core 3
PostgreSQL 7.4.7

I have what is essentially a data warehouse of stock data.  Each day has
around 30,000 records (tickers).  A typical operation is to get the 200 day
simple moving average (of price) for each ticker and write the result to a
summary table.  In running this process (Perl/DBI), it is typical to see
70-80% I/O wait time with postgres running at about 8-9%.  If I run the next
day's date, the postgres cache and file cache is now populated with 199 days
of the needed data, postgres runs 80-90% of CPU and total run time is greatly
reduced.  My conclusion is that this is a high cache hit rate in action.

I've done other things that make sense, like using indexes, playing with the
planner constants and turning up the postgres cache buffers.

Even playing with extreme hdparm read-ahead numbers (i.e. 64738), there is no
apparent difference in database performance.  The random nature of the I/O
drops disk reads down to about 1MB/sec for the array.  A linear table scan
can easily yield 70-80MB/sec on this system.  Total table size is usually
around 1GB and with indexes should be able to fit completely in main memory.

Other databases like Oracle and DB2 implement some sort of row prefetch.  Has
there been serious consideration of implementing something like a prefetch
subsystem?  Does anyone have any opinions as to why this would be a bad idea
for postgres?

Postgres is great for a multiuser environment and OLTP applications.  However,
in this set up, a data warehouse, the observed performance is not what I
would hope for.

Regards,

Matt Olson
Ocean Consulting
http://www.oceanconsulting.com/

---(end of broadcast)---
TIP 8: explain analyze is your friend


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] ok you all win what is best opteron (I dont want a hosed system again)

2005-05-14 Thread Joel Fradkin
Thank you much for the info.
I will take a look. I think the prices I have been seeing may exclude us
getting another 4-proc box this soon. My boss asked me to get something in
the 15K range (I spent 30K on the Dell).
The HP seemed to run around 30K, but it had a lot more drives than the Dell
(specced it with 14 10K drives).

I can and will most likely build it myself to try to get a bit more bang for
the buck, and it is a second server, so if it dies it should not be a
catastrophe.

FYI, everyone using our system (after a week of dealing with many bugs) has
been saying how much they like the speed.
I did have to come up with a lot of creative ideas to get it working in a way
that appears faster to the client.
Stuff like: the queries default to LIMIT 50, and as users hit Next I raise
the limit (there is also a flag to just show all records and a count; it used
to default to that). For the two worst queries (our case and audit
applications) I created denormalized files and maintain them through code.
All reporting comes off those, and it is lightning fast.
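For what it's worth, the limit trick is nothing fancier than this (table,
key, and client value are placeholders):

-- First page: 50 rows.
SELECT * FROM tblcase WHERE clientnum = 'XYZ' ORDER BY caseid LIMIT 50;

-- Hitting "next" either raises the limit or asks for the next slice.
SELECT * FROM tblcase WHERE clientnum = 'XYZ' ORDER BY caseid LIMIT 100;
SELECT * FROM tblcase WHERE clientnum = 'XYZ' ORDER BY caseid LIMIT 50 OFFSET 50;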

I just want to say again thanks to everyone who has helped me in the past
few months.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: David Brown [mailto:[EMAIL PROTECTED] 
Sent: Friday, May 13, 2005 7:03 PM
To: Joel Fradkin
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] ok you all win what is best opteron (I dont want a
hosed system again)

Joel Fradkin wrote:

> Is the battery backed cache good or bad for Postgres?
>
Battery-backed avoids corruption if you have an unexpected power loss. 
It's considered mandatory with large-cache write-back controllers if you 
can't afford to lose any data.

> They are telling me I can only get a duel channel card if I want 
> hardware raid 10 on the 14 drives.
>
> I can get two cards but it has to be 7 and 7 (software raid?) which 
> does not sound like it fixes my single point of failure (one of the 
> listers mentioned my current system has 3 such single points).
>
Sounds like you need to try another vendor. Are you aiming for two RAID 
10 arrays or one RAID 10 and one RAID 5?

> Any of you hardware gurus spell out the optimal machine (I am hoping 
> to be around 15K, might be able to go more if it's a huge difference, 
> I spent 30k on the Dell).
>
> I do not have to go HP, and after seeing the fail ratio from Monarch 
> from one lister I am bit scared shopping there.
>
There's unlikely to be many common components between their workstation 
and server offerings. You would expect case, power, graphics, 
motherboard, storage controller and drives to all be different. But I'd 
challenge that 50% failure stat anyway. Which components exactly? Hard 
drives? Power supplies?

> Was there a conclusion on where is best to get one (I really want two 
> one for development too).
>
Almost anyone can build a workstation or tower server, and almost anyone 
else can service it for you. It gets trickier when you're talking 2U and 
especially 1U. But really, these too can be maintained by anyone 
competent. So I wonder about some people's obsession with 
vendor-provided service.

Realistically, most Opteron solutions will use a Tyan motherboard (no 
idea if this includes HP). For 4-way systems, there's currently only the 
S4882, which includes an LSI dual channel SCSI controller. Different 
vendors get to use different cases and cooling solutions and pick a 
different brand/model of hard drive, but that's about it.

Tyan now also sells complete servers - hardly a stretch seeing they 
already make the most important bit (after the CPU). Given the level of 
interest in this forum, here's their list of US resellers:

http://www.tyan.com/products/html/us_alwa.html

If it's a tower server, build it yourself or pay someone to do it. It 
really isn't challenging for anyone knowledgeable about hardware.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] ok you all win what is best opteron (I dont want a hosed system again)

2005-05-13 Thread Joel Fradkin
We are up and somewhat happy.

 

I have been following threads (in case you don’t know
I bought a 4 proc Dell recently) and the Opteron seems the way to go.

I just called HP for a quote, but don’t want to make
any mistakes.

 

Is the battery backed cache good or bad for Postgres?

 

They are telling me I can only get a dual channel card if I want hardware
RAID 10 on the 14 drives.

I can get two cards but it has to be 7 and 7 (software RAID?), which does not
sound like it fixes my single point of failure (one of the listers mentioned
my current system has 3 such single points).

 

Can any of you hardware gurus spell out the optimal machine? (I am hoping to
be around 15K, might be able to go more if it's a huge difference; I spent
30K on the Dell.)

I do not have to go HP, and after seeing the fail ratio from Monarch from one
lister I am a bit scared shopping there.

Was there a conclusion on where is best to get one? (I really want two, one
for development too.)

 

 

Joel Fradkin
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the
intended recipient, please contact the sender by reply email and delete and
destroy all copies of the original message, including attachments.
[PERFORM] Configing 8 gig box.

2005-05-09 Thread Joel Fradkin
Seems to be only using like 360 MB out of the 7 GB free (odd thing is I did
see some swap used, 4 KB out of 1.9 GB) with a bunch of users (this may be
normal, but it is not going overly fast, so I thought I would ask).

Items I modified per commandprompt.com and from watching this list, etc.:

 

shared_buffers = 24576
work_mem = 32768
max_fsm_pages = 10
max_fsm_relations = 1500
fsync = true
wal_sync_method = open_sync
wal_buffers = 2048
checkpoint_segments = 100
effective_cache_size = 524288
default_statistics_target = 250
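(For context on the units, assuming the default 8 KB block size:
shared_buffers = 24576 works out to only about 192 MB, and
effective_cache_size = 524288 to about 4 GB, which the planner treats as the
OS file cache rather than memory Postgres allocates itself. A quick way to
check the settings on the server, sketched with plain SQL:

-- Each of these is in 8 KB pages on 8.0; multiply by 8192 for bytes.
SHOW shared_buffers;        -- 24576 pages  -> ~192 MB
SHOW effective_cache_size;  -- 524288 pages -> ~4 GB
SHOW wal_buffers;           -- 2048 pages   -> ~16 MB
)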

 

Any help is appreciated.
Joel Fradkin
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the
intended recipient, please contact the sender by reply email and delete and
destroy all copies of the original message, including attachments.
Re: [PERFORM] Final decision

2005-04-27 Thread Joel Fradkin
Just realize, you probably *don't* want to set that in postgresql.conf.
You just want to issue an "SET enable_seqscan TO off" before issuing one
of the queries that are mis-planned.

I believe all the tested queries (90-some-odd views) saw an improvement.
I will, however, take the time to verify this and take your suggestion, as I
can certainly put the appropriate settings in each query as opposed to using
the config option. Thanks for the good advice (I believe Josh from
Commandprompt.com also suggested this approach and I, in my lazy self,
somehow blurred the concept).
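For reference, a minimal sketch of the per-query scoping being suggested (the
view name is just a placeholder); SET LOCAL keeps the change inside one
transaction instead of the whole session or postgresql.conf:

BEGIN;
SET LOCAL enable_seqscan = off;   -- affects only this transaction
SELECT * FROM viwassoclist WHERE clientnum = 'XYZ';
COMMIT;                           -- planner settings revert afterwards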


Also, I second the notion of getting a confidentiality contract. There
have been several times where someone had a pathological case, and by
sending the data to someone (Tom Lane), they were able to track down and
fix the problem.

Excellent point. Our data is confidential, but I should write something that
lets me ship the concept without the confidential content, so in the future I
can just send a backup without breaking our agreements, while still allowing
minds greater than my own to see and feel my issues.


What do you mean by "blew up"?
IIS testing was being done with an old 2300 and an Optiplex; both machines
reached 100% CPU utilization, and the test suite (ASP code written in house
by one of our programmers) was not releasing memory correctly, so it ran out
of memory and died. Prior to that I did see CPU utilization on the 4-proc
Linux box running Postgres fluctuate and at times hit the 100% level, but the
server seemed very stable. I did fix the memory usage of the suite and was
able to see 50 concurrent users with fairly high RPS, especially on select
testing; the insert and update tests seemed to fall apart (many 404 errors,
etc.).


I assume you have IIS on a different
machine than the database. Are you saying that the database slowed down
dramatically, or that the machine crashed, or just that the web
interface became unresponsive? Just the web interface.

It probably depends on what queries are being done, and what kind of
times you need. Usually the update machine needs the stronger hardware,
so that it can do the writing.

But it depends if you can wait longer to update data than to query data,
obviously the opposite is true. It all depends on load, and that is
pretty much application defined.

I am guessing our app is like 75% data entry and 25% reporting, but the
reporting is taking the toll SQL wise.

This was from my insert test with 15 users.
Test type: Dynamic 
 Simultaneous browser connections: 15 
 Warm up time (secs): 0 
 Test duration: 00:00:03:13 
 Test iterations: 200 
 Detailed test results generated: Yes
Response Codes 

 Response Code: 403 - The server understood the request, but is refusing to
fulfill it. 
  Count: 15 
  Percent (%): 0.29 
 
 
 Response Code: 302 - The requested resource resides temporarily under a
different URI (Uniform Resource Identifier). 
  Count: 200 
  Percent (%): 3.85 
 
 
 Response Code: 200 - The request completed successfully. 
  Count: 4,980 
  Percent (%): 95.86 
 
My select test with 25 users had this
Properties 

 Test type: Dynamic 
 Simultaneous browser connections: 25 
 Warm up time (secs): 0 
 Test duration: 00:00:06:05 
 Test iterations: 200 
 Detailed test results generated: Yes 
  
Summary 

 Total number of requests: 187 
 Total number of connections: 200 
  
 Average requests per second: 0.51 
 Average time to first byte (msecs): 30,707.42 
 Average time to last byte (msecs): 30,707.42 
 Average time to last byte per iteration (msecs): 28,711.44 
  
 Number of unique requests made in test: 1 
 Number of unique response codes: 1 
  
Errors Counts 

 HTTP: 0 
 DNS: 0 
 Socket: 26 
  
Additional Network Statistics 

 Average bandwidth (bytes/sec): 392.08 
  
 Number of bytes sent (bytes): 64,328 
 Number of bytes received (bytes): 78,780 
  
 Average rate of sent bytes (bytes/sec): 176.24 
 Average rate of received bytes (bytes/sec): 215.84 
  
 Number of connection errors: 0 
 Number of send errors: 13 
 Number of receive errors: 13 
 Number of timeout errors: 0 
  
Response Codes 

 Response Code: 200 - The request completed successfully. 
  Count: 187 
  Percent (%): 100.00 
 



Joel


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Final decision

2005-04-27 Thread Joel Fradkin
BTW, your performance troubleshooting will continue to be hampered if you
can't share actual queries and data structure.  I strongly suggest that you
make a confidentiality contract with a support provider so that you can
give them detailed (rather than general) problem reports.

I am glad to hear your perspective; maybe my rollout is not as off base as I
thought.

FYI, it is not that I cannot share specifics (I have posted a few table
structures and views here and on pgsql); I just cannot back up the entire
database and ship it off to a consultant.

What I had suggested with Commandprompt was to use remote connectivity for
him to have access to our server directly. That way I can learn by watching
what types of tests he does, and it allows him to do tests with our data set.

Once I am in production that will not be something I want tests done on, so
it may have to wait until we get a development box with a similar deployment
(at the moment development is on an XP machine and production will be on
Linux; the 4-proc is Linux and will be our production box).

Thank you for letting me know what I can hope to see in the way of disk
access on the next hardware procurement. I may email you off list to get the
specific brands, etc., that you found gave that kind of throughput.




---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Final decision

2005-04-27 Thread Joel Fradkin
Sorry, I am using Red Hat AS4 and Postgres 8.0.2.

Joel



 

You didn't tell us what OS you are using, Windows?

If you want good performance you must install Unix on that machine.

 

---

[PERFORM] Final decision

2005-04-27 Thread Joel Fradkin
I spent a great deal of time over the past week looking seriously at Postgres
and MySQL.

Objectively I am not seeing that much of an improvement in speed with MySQL,
and we have a huge investment in Postgres.

So I am planning on sticking with Postgres for our production database (going
live this weekend).

 

Many people have offered a great deal of help and I
appreciate all that time and energy.

I did not find any resolutions to my issues with
Commandprompt.com (we only worked together 2.5 hours).

 

Most of my application is working at about the same speed as MSSQL Server
(unfortunately it's on a box twice as powerful, but as many have pointed out
it could be an issue with the 4-proc Dell). I spent considerable time with
Dell and could see my drives are delivering 40 MB per second.

 

Things I still have to make better are my settings in the config; I have it
set to no merge joins and no seq scans.

I am going to have to use flattened history files for reporting (I saw a huge
difference here: the view for the audit cube took 10 minutes in explain
analyze and the flattened file took under one second).

 

I understand both of these practices are not desirable, but I am at a place
where I have to get it live, and these are items I could not resolve.

I may try some more time with Commandprompt.com, or seek other professional
help.

 

In stress testing I found Postgres was holding up very well (but my IIS
servers could not handle enough of a load to really push the server).

I have a few desktops acting as IIS servers at the moment, and if I pushed
past 50 concurrent users it pretty much blew the IIS server up.

On inserts that number was like 7 concurrent users, and updates were also
like 7 users.

I believe that was totally IIS, not Postgres, but I am curious as to whether
using the Postgres ODBC driver will put more stress on the IIS side than
MSSQL did.

I did have a question: if any folks are using two servers, one for reporting
and one for data entry, which system should be the beefier?

I have a 2-proc machine I will be using, and I can either put Sears off by
themselves on this machine or split up functionality and have one server for
reporting and one for inserts and updates; I am not sure which machine would
be best for which spot (reminder: the more robust box is a 4-proc with 8 GB
and the 2-proc has 4 GB, both Dells).

 

Thank you for any ideas in this arena.

 

Joel Fradkin

Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-25 Thread Joel Fradkin
Tried changing the settings and saw no change in a test using ASP.
The test does several selects on views and tables.
It actually seemed to take a bit longer.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: Marko Ristola [mailto:[EMAIL PROTECTED] 
Sent: Sunday, April 24, 2005 2:15 AM
To: Joel Fradkin
Cc: 'Mohan, Ross'; [EMAIL PROTECTED];
pgsql-performance@postgresql.org
Subject: Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs
Xeon


Here is how you can receive all one billion rows in pieces of 2048 rows.
This changes PostgreSQL and ODBC behaviour:

Change ODBC data source configuration in the following way:

Fetch = 2048
UseDeclareFetch = 1

It does not create core dumps on 32-bit computers with billions of rows!
This is a bit slower than fetching all rows at once. Scalability sometimes
means a bit less speed :(

With UseDeclareFetch=1 you might get even 150 thousand rows per second.
With UseDeclareFetch=0 the backend might be able to send about 200 thousand
rows per second.

So, these high numbers come if all the results are already in memory and no
disc accesses are needed. These are about the peak speeds with VARCHAR,
without Unicode, with an Athlon64 home computer.

With a sequential disc scan, a more typical fetching speed is about 50-100
thousand rows per second.

PostgreSQL ODBC row fetching speed is very good.
Perhaps with better discs, with RAID10, the current upper limit of about 200
thousand rows per second could be achieved??

So the in-memory examples show that the hard disc is normally the bottleneck,
and it is on the server side.
My experiments were done in Linux. In Windows, the speed might be different
by a constant factor (algorithmically).

These speeds depend on very many factors, even on a sequential scan.
ODBC speed is affected by the number of columns fetched and the types of the
columns. Integers are processed faster than textual or date columns.

The network latency is decreased with UseDeclareFetch=1 by increasing the
Fetch=2048 parameter: with Fetch=1 you get bad performance with lots of
rows, but if you fetch more data from the server once per 2048 rows, the
network latency affects you only once per 2048-row block.
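For the curious, what UseDeclareFetch=1 does is essentially the server-side
cursor pattern below (a plain-SQL sketch, table name made up), which is why
memory stays bounded at the cost of one round trip per block:

BEGIN;
DECLARE big_cur CURSOR FOR SELECT * FROM bigtable;
FETCH 2048 FROM big_cur;   -- repeat until fewer than 2048 rows come back
FETCH 2048 FROM big_cur;
CLOSE big_cur;
COMMIT;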

Regards,
Marko Ristola

Joel Fradkin wrote:

>Hate to be dumb, but unfortunately I am.
>
>Could you give me an idea what I should be using, or is there a good
>resource for me to check out.
>I have been spending so much time with config and moving data, converting
>etc, I never looked at the odbc settings (didn't even think about it until
>Josh brought it up). I did ask him for his advice, but would love a second
>opinion.
>
>Our data is a bit of a mixture, some records have text items most are
>varchars and integers with a bit of Booleans mixed in.
>
>I am running 8.0.2 so not sure if the protocol is ODBC or Postgres?
>
>Thanks for responding I appreciate any help 
>
>Joel Fradkin
> 
>-Original Message-
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Mohan, Ross
>Sent: Thursday, April 21, 2005 10:01 AM
>To: [EMAIL PROTECTED]
>Subject: Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs
>Xeon
>
>Joel, thanks. A couple of things jump out there for
>me, not a problem for a routine ODBC connection, but
>perhaps in the "lotsa stuff" context of your current
>explorations, it might be relevant?
>
>I am completely shooting from the hip, here, but...if
>it were my goose to cook, I'd be investigating
>
>Session("StringConn") =
>"DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=
;
>PWD=;ReadOnly=0;Protocol=6.4;
>
>|| Protocol? Is this related to version? is the driver wy old?
>
>
>FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;
>ShowSystemTables=0;ConnSettings=;Fetch=100;
>
>||  Fetch great for OLTP, lousy for batch?
>
>
>Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190;
>
>||  what ARE the datatypes and sizes in your particular case? 
>
>Debug=0;
>
>||  a run with debug=1 probably would spit up something interesting
>
>CommLog=0;Optimizer=1;
>
>||  Optimizer? that's a new one on me
>
>Ksqo=1;UseDeclare

Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-25 Thread Joel Fradkin
Thanks, we will try that. We are working on a test suite for the way our app
gets data (ODBC).
We plan to include updates, inserts, and selects, and all three at once, with
a log of the results.
Then we should use a stress test tool to see how it works with multiple
instances (I used Microsoft's tool last time I did stress testing).

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: Marko Ristola [mailto:[EMAIL PROTECTED] 
Sent: Sunday, April 24, 2005 2:15 AM
To: Joel Fradkin
Cc: 'Mohan, Ross'; [EMAIL PROTECTED];
pgsql-performance@postgresql.org
Subject: Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs
Xeon


Here is how you can receive all one billion rows in pieces of 2048 rows.
This changes PostgreSQL and ODBC behaviour:

Change ODBC data source configuration in the following way:

Fetch = 2048
UseDeclareFetch = 1

It does not create core dumps on 32-bit computers with billions of rows!
This is a bit slower than fetching all rows at once. Scalability sometimes
means a bit less speed :(

With UseDeclareFetch=1 you might get even 150 thousand rows per second.
With UseDeclareFetch=0 the backend might be able to send about 200 thousand
rows per second.

So, these high numbers come if all the results are already in memory and no
disc accesses are needed. These are about the peak speeds with VARCHAR,
without Unicode, with an Athlon64 home computer.

With a sequential disc scan, a more typical fetching speed is about 50-100
thousand rows per second.

PostgreSQL ODBC row fetching speed is very good.
Perhaps with better discs, with RAID10, the current upper limit of about 200
thousand rows per second could be achieved??

So the in-memory examples show that the hard disc is normally the bottleneck,
and it is on the server side.
My experiments were done in Linux. In Windows, the speed might be different
by a constant factor (algorithmically).

These speeds depend on very many factors, even on a sequential scan.
ODBC speed is affected by the number of columns fetched and the types of the
columns. Integers are processed faster than textual or date columns.

The network latency is decreased with UseDeclareFetch=1 by increasing the
Fetch=2048 parameter: with Fetch=1 you get bad performance with lots of
rows, but if you fetch more data from the server once per 2048 rows, the
network latency affects you only once per 2048-row block.

Regards,
Marko Ristola

Joel Fradkin wrote:

>Hate to be dumb, but unfortunately I am.
>
>Could you give me an idea what I should be using, or is there a good
>resource for me to check out.
>I have been spending so much time with config and moving data, converting
>etc, I never looked at the odbc settings (didn't even think about it until
>Josh brought it up). I did ask him for his advice, but would love a second
>opinion.
>
>Our data is a bit of a mixture, some records have text items most are
>varchars and integers with a bit of Booleans mixed in.
>
>I am running 8.0.2 so not sure if the protocol is ODBC or Postgres?
>
>Thanks for responding I appreciate any help 
>
>Joel Fradkin
> 
>-Original Message-
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Mohan, Ross
>Sent: Thursday, April 21, 2005 10:01 AM
>To: [EMAIL PROTECTED]
>Subject: Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs
>Xeon
>
>Joel, thanks. A couple of things jump out there for
>me, not a problem for a routine ODBC connection, but
>perhaps in the "lotsa stuff" context of your current
>explorations, it might be relevant?
>
>I am completely shooting from the hip, here, but...if
>it were my goose to cook, I'd be investigating
>
>Session("StringConn") =
>"DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=
;
>PWD=;ReadOnly=0;Protocol=6.4;
>
>|| Protocol? Is this related to version? is the driver wy old?
>
>
>FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;
>ShowSystemTables=0;ConnSettings=;Fetch=100;
>
>||  Fetch great for OLTP, lousy for batch?
>
>
>Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190;
>
>||  what ARE the datatypes and sizes in your particular case? 
>
>Debug=0;
>
>||  a

[PERFORM] flattening the file might work for me here is the analyze.

2005-04-23 Thread Joel Fradkin
"Index Scan using ix_tblviwauditcube_clientnum on tblviwauditcube 
(cost=0.00..35895.75 rows=303982 width=708) (actual time=0.145..1320.432
rows=316490 loops=1)"

"  Index Cond: ((clientnum)::text = 'MSI'::text)"

"Total runtime: 1501.028 ms"

 

Joel Fradkin
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the
intended recipient, please contact the sender by reply email and delete and
destroy all copies of the original message, including attachments.

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-23 Thread Joel Fradkin
I would very, very strongly encourage you to run multi-user tests before
deciding on mysql. Mysql is nowhere near as capable when it comes to
concurrent operations as PostgreSQL is. From what others have said, it
doesn't take many concurrent operations for it to just fall over. I
can't speak from experience because I avoid mysql like the plague,
though. :)

I am just testing the water, so to speak; if it can't handle single-user
tests then multiple-user tests are kind of a waste of time.

I am trying to de-normalize my view into a table to see if I can get my app
to work. It is a good idea anyway, but it raises a ton of questions about
dealing with the data after a case is closed, etc., and also about multiple
child relationships like merchandise and payments.

I did do a test of all three (MSSQL, MySQL, and Postgres) in Aqua Studio, all
on the same machine running the servers, and found Postgres beat out MySQL,
but like any other test it may have been an issue with Aqua Studio and MySQL.
In any case I have not made a decision to use MySQL; I am still researching
fixes for Postgres.

I am waiting to hear back from Josh on using cursors and trying to flatten
long-running views.

I am a little disappointed I have not understood enough to get my analyzer to
use the proper plan; we had to set seqscan off to get the select from
response_line to work fast, and I had to turn off merge joins to get the
assoc list to work fast. Once I am up I can try to learn more about it. I am
so glad there are so many folks here willing to take time to educate us
newbies.



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-22 Thread Joel Fradkin
I just finished testing Postgres, MySQL, and MSSQL on my machine (2 gigs,
internal, XP).

I have adjusted the postgres config to what I think is an ok place and have
mysql default and mssql default.

Using Aqua Studio, a program that hooks to all three, I have found:

            Initial exec   Second exec   Returning 331,640 records on all 3 databases
  MSSQL     468 ms         16 ms         2 mins  3 secs
  MySQL     14531 ms       6625 ms       2 mins 42 secs
  Postgres  52120 ms       11702 ms      2 mins 15 secs

Not sure if this proves your point on PGadmin versus MYSQL query tool versus
MSSQL Query tool, but it certainly seems encouraging.

I am going to visit Josh's tests he wanted me to run on the LINUX server.
 
Joel Fradkin
 



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-22 Thread Joel Fradkin
One further question is: is this really a meaningful test?  I mean, in
production are you going to query 30 rows regularly? 

It is a query snippet, if you will, as the views I posted for audit and case,
where tables are joined, are more likely to be run.

Josh and I worked over this until we got explain analyze on the Linux box to
1 sec. I was just using this as a test, as I don't have my views set up on
MySQL.

So many of my reports pull huge data sets (comprised of normalized joins).
I am thinking I probably have to switch to using a non-normalized table, and
Josh is sending me information on using cursors instead of selects.

And is the system always going to be used by only one user?  
No we have 400+ concurrent users

I guess the question is if this big select is representative of the load you
expect in production.
Yes, we often see large result sets on the two-processor box running MSSQL
using 100% CPU for 5-30 seconds.

What happens if you execute the query more times?  Do the times stay the
same as the second run?
I will definitely have to do stress testing prior to going live in
production. I have not done concurrent tests because, honestly, single-user
tests are failing, so multiple-user testing is not something I need yet.

Joel

-- 
Alvaro Herrera (<[EMAIL PROTECTED]>)
"Use it up, wear it out, make it do, or do without"


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Joel Fradkin
Here is the connect string I am using.
It could be horrid as I cut it from ODBC program.

Session("StringConn") =
"DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=;
PWD=;ReadOnly=0;Protocol=6.4;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;
ShowSystemTables=0;ConnSettings=;Fetch=100;Socket=4096;UnknownSizes=0;MaxVar
charSize=254;MaxLongVarcharSize=8190;Debug=0;CommLog=0;Optimizer=1;Ksqo=1;Us
eDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Pa
rse=0;CancelAsFreeStmt=0;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableC
ursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseS
erverSidePrepare=0"

Joel Fradkin
 

-Original Message-
From: Mohan, Ross [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 21, 2005 9:42 AM
To: [EMAIL PROTECTED]
Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

FWIW, ODBC has variables to tweak, as well. fetch/buffer sizes, and the
like. 

Maybe one of the ODBC cognoscenti here can chime in more concretely






---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Joel Fradkin

I suspect he's using pgadmin.  
Yup I was, but I did try running on the linux box in psql, but it was
running to the screen and took forever because of that.

The real issue is that returning data to my app using ODBC is very slow. (I
have not tested the ODBC driver for MySQL; MSSQL is OK, though the two-proc
Dell is running out of steam; it had been good until this year, when we about
doubled our demand by adding Sears as a client.)

Using ODBC to Postgres on some of the views (Josh from Command Prompt is
having me do some very specific testing) is timing out with a 10 minute time
limit. These are pages that still respond using MSSQL (production is using
the dual proc and the test is using the 4 proc).

I have a tool that hooks to all three databases so I can try it with that
and see if I get different responses.

Joel


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Joel Fradkin
Why is MYSQL returning 360,000 rows, while Postgres is only returning
330,000? This may not be important at all, though.
I also assume you are selecting from a plain table, not a view.

Yes, a plain table. The difference in rows is that one of the datasets had
Sears data in it. The speed difference is much worse on some of my views,
which is what forced me to start looking at other options.

I suppose knowing your work_mem and shared_buffers settings would be useful.
I have posted my configs, but will add the Tampa one to the bottom again. My
desktop has:
# - Memory -

shared_buffers = 8000   # min 16, at least max_connections*2, 8KB
each
work_mem = 8000#1024# min 64, size in KB
maintenance_work_mem = 16384# min 1024, size in KB
#max_stack_depth = 2048 # min 100, size in KB

# - Free Space Map -

max_fsm_pages = 3#2 # min max_fsm_relations*16, 6 bytes
each
max_fsm_relations = 1000# min 100, ~50 bytes each
# - Planner Cost Constants -

effective_cache_size = 8#1000   # typically 8KB each
random_page_cost = 2# units are one sequential page fetch cost

How were you measuring "data retrieval time"? And how does this compare
to what you were measuring on the other machines? It might be possible
that what you are really measuring is just the time it takes psql to
load up all the data into memory, and then print it out. And since psql
defaults to measuring entry lengths for each column, this may not be
truly comparable.
It *looks* like it only takes 18s for postgres to get the data, but then
it is taking 72s to transfer the data to you. That would be network
latency, or something like that, not database latency.
And I would say that 18s is very close to 16 or 17 seconds.
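One way to separate those two numbers with stock psql (query and table here
are placeholders): \timing shows total elapsed time at the client, EXPLAIN
ANALYZE shows server-side time only, and redirecting output with \o removes
the cost of rendering rows to the screen:

\timing
-- Server-side planning/execution time only; no rows are shipped back.
EXPLAIN ANALYZE SELECT * FROM tblresponseline WHERE clientnum = 'MSI';
-- Total time including transfer, without the cost of printing the rows.
\o /dev/null
SELECT * FROM tblresponseline WHERE clientnum = 'MSI';
\o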
This was run on the machine with the database (as were MySQL and MSSQL).
The PG timing was from pgAdmin, and the 18 secs was the second run; the first
run took the same time to return the data plus 70 secs for the first part,
about 147 secs all told, compared to the 40 seconds first run of MySQL and 56
seconds for MSSQL. MySQL was done in their query tool (it returns the rows as
well) and MSSQL was done in their Query Analyzer. All three tools appear to
use a similar approach. Just an FYI, doing an explain analyze of my problem
view took much longer than actually returning the data in MSSQL and MySQL. I
have done extensive testing with MySQL (just this table and two of my problem
views). I am not using the transactional engine, because I need the best
speed.


I don't know what commands you were issuing, or how you measured,
though. You might be using some other interface (like ODBC), which I
can't say too much about.

John
=:->

This is the Linux box config.
# -
# PostgreSQL configuration file
# -
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have 
# to SIGHUP the postmaster for the changes to take effect, or use 
# "pg_ctl reload". Some settings, such as listen_address, require
# a postmaster shutdown and restart to take effect.


#---
# FILE LOCATIONS
#---

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.
# data_directory = 'ConfigDir'  # use data in another directory
#data_directory = '/pgdata/data'
# hba_file = 'ConfigDir/pg_hba.conf'# the host-based authentication file
# ident_file = 'ConfigDir/pg_ident.conf'  # the IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
# external_pid_file = '(none)'  # write an extra pid file


#---
# CONNECTIONS AND AUTHENTICATION
#---

# - Connection Settings -

#listen_addresses = 'localhost' # what IP interface(s) to listen on; 
# defaults to localhost, '*' = any

listen_addresses = '*'
port = 5432
max_connections = 100
# note: increasing max_connections costs about 500 bytes of shared
# memory per connection slot, in addition to costs from shared_buffers
# and max_locks_per_transaction.

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-20 Thread Joel Fradkin
I did think of something similar just loading the data tables with junk
records and I may visit that idea with Josh.

I did just do some comparisons on timing of a plain select * from tbl where
indexed column = x and it was considerably slower than both MSSQL and MYSQL,
so I am still a bit confused. This still might be a configuration issue (I
ran on my 2 gig desktop and the 8 gig Linux box; comparisons were all run on
the same machines as far as MSSQL, MYSQL, and Postgres are concerned).
I turned off Postgres when running MYSQL and turned off MYSQL when running
Postgres; MSSQL had one of the two running while I tested it.

For the 360,000 records returned MYSQL did it in 40 seconds first run and 17
seconds second run.

MSSQL did it in 56 seconds first run and 16 seconds second run.

Postgres, on the second run, was:
Total query runtime: 17109 ms.
Data retrieval runtime: 72188 ms.
331640 rows retrieved.

So about 89 secs total on the second run.
The first run was 147 secs all told.

These are all on my 2 gig desktop running XP.
I can post the config. I noticed Postgres was using 70% of the CPU while
MSSQL was at 100%.

Joel Fradkin
 

>I would have spent more $ with Command, but he does need my database to help
>me and I am not able to do that.
>
>
...

What if someone were to write an anonymization script. Something that
changes any of the "data" of the database, but leaves all of the
relational information. It could turn all strings into some sort of
hashed version, so you don't give out any identifiable information.
It could even modify relational entries, as long as it updated both
ends, and this didn't affect the actual performance at all.

I don't think this would be very hard to write. Especially if you can
give a list of the tables, and what columns need to be modified.

Probably this would generally be a useful script to have for cases like
this where databases are confidential, but need to be tuned by someone else.

Would that be reasonable?
I would think that by renaming columns, hashing the data in the columns,
and renaming tables, most of the proprietary information is removed,
without removing the database information.
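
A minimal sketch of that idea in plain SQL, using the tblassociate columns
posted elsewhere in these threads purely as an illustration (md5() leaves
nulls as nulls and gives a stable 32-character hash, so short columns such as
ssn have to be truncated to fit):

-- hash identifying text columns in place; run this against a *copy* of the db
UPDATE tblassociate
   SET firstname = md5(firstname),
       lastname  = md5(lastname),
       address   = md5(address),
       ssn       = substr(md5(ssn), 1, 18),        -- ssn is varchar(18)
       homephone = substr(md5(homephone), 1, 14);  -- homephone is varchar(14)

Joins on the id columns and the row counts and distributions stay intact, so
query plans should come out the same as on the real data.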

John
=:->



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-20 Thread Joel Fradkin
Sorry if you feel I am lashing out at a community.
Just to say it again, I am very appreciative of all the help everyone has
supplied.

I am running on more than just the 4 proc Dell (in fact my tests have been
mostly on desktops).

I have MSSQL running on a 2 proc Dell which, until my load increased (over
approx 2 years), was just fine. I totally agree that there are better
solutions based on this list's comments, but I have all Dell hardware now and
resist trying different vendors just to suit Postgres. I was under the
impression there were still issues with 64-bit Postgres and Linux (or at
least there were when I purchased). I believed I could make my next
acquisition Opteron-based hardware.

Again, I am not at all trying to criticize anyone, so please accept my
apology if I somehow came across with that attitude. I am very disappointed
at this point. My views may not be that great (although I am not saying that
either), but they run ok on MSSQL and appear to run ok on MYSQL.

I wish I did understand what I am doing wrong because I do not wish to
revisit engineering our application for MYSQL.

I would have spent more $ with Command, but he does need my database to help
me and I am not able to do that.

I agree testing the whole app is the only way to see, and unfortunately it is
a time-consuming bit. I do not have to spend 4k on MYSQL; that is only if I
want their premium support. I can spend $250.00 a server for the commercial
license if I find the whole app does run well. I just loaded the data last
night and only had time to convert one view this morning. I am sure it is
something I do not understand and not a problem with Postgres. I am also
willing to take time to get more knowledgeable, but my time is running out
and I honestly feel stupid.

I have been in the process of converting for over two months and have said
several times these lists are a godsend. 

It was never my intention to make you feel like I was flaming anyone
involved. On the contrary, I feel many have taken time to look at my
questions and given excellent advice. I know I check the archives so
hopefully that time will help others after me. 

I may yet find that MYSQL is not a good fit as well. I have my whole app
converted at this point and find pg works well for a lot of my usage.  

There are some key reporting views that need to retrieve many rows with many
joins, and they just take too long to pull the data. I told my boss just now
that if I try to de-normalize many of these data sets (like 6 main groups of
data) the reporting may work, but as is many of my web pages are timing out
(these are pages that still work on MSSQL and the 2 proc machine).

Thanks again for all the help and know I truly appreciate what time every
one has spent on my issues.

I may find that revisiting the datasets is a way to make PG work, or as you
mentioned maybe I can get someone with more knowledge to step in locally. I
did ask Tom if he knew of anyone; maybe someone else on the list is aware of
a professional in the Tampa, FL area.

Realistically, I don't think a $30k Dell is something that needs to be
junked. I am pretty sure if I got MSSQL running on it, it would outperform my
two proc box. I can agree it may not have been the optimal platform. My
decision is not based solely on the performance of the 4 proc box.

Joel Fradkin
 

-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, April 20, 2005 1:54 PM
To: Joel Fradkin
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

Joel,

> I did not see any marked improvement, but I don't think my issues are
> related to the hardware.

If you won't believe it, then we certainly can't convince you.  AFAIK your
bad 
view is a bad query plan made worse by the Dell's hardware problems.

> I am giving up on postgres and three developers two months of work and
> trying MYSQL.

I'd suggest testing your *whole* application and not just this one query.
And 
remember that you need to test InnoDB tables if you want transactions.

>
> I have posted several items and not got a response (not that I expect
folks
> to drop everything). I want to thank everyone who has been of help and
> there are several.

Hmmm ... I see about 25 responses to some of your posts on this list.   
Including ones by some of our head developers.   That's more than you'd get 
out of a paid MSSQL support contract, I know from experience.

If you want anything more, then you'll need a "do-or-die" contract with a 
support company. If your frustration is because you can't find this kind of 
help than I completely understand ... I have a waiting list for performance 
contracts myself.  (and, if you hired me the first thing I'd tell you is to 
junk the Dell)

> I really like the environment and feel I have learned a lot in the past few
> months, but bottom line for me is speed.

Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)

2005-04-20 Thread Joel Fradkin
There have been some discussions on this list and others in general about
Dell's version of RAID cards, and server support, mainly linux support.

I was pretty impressed with the Dell guy. He spent the day with me remotely
and went through my system (a 6650 with PowerVault), changed my drives from
ext3 to ext2 with no journaling, and checked all the drivers and such.

I did not see any marked improvement, but I don't think my issues are
related to the hardware.

I am giving up on Postgres and three developers' two months of work and
trying MYSQL.

I have posted several items and not got a response (not that I expect folks
to drop everything). I want to thank everyone who has been of help and there
are several.

It is just running way slow on several of my views. I tried them today in
MYSQL and found that MYSQL was beating out my MSSQL.

On certain items I could get PG to work ok, but it was never faster than
MSSQL. On certain items it is taking several minutes compared to a few
seconds on MYSQL.

I really like the environment and feel I have learned a lot in the past few
months, but the bottom line for me is speed. We bought a 30K Dell 6650 to get
better performance. I chose PG because MSSQL was 70K to license. I believe
MYSQL will be 250.00 to license for us, but I may choose the 4k platinum
support just to feel safe about having someone to touch base with in the
event of an issue.

Again, thanks to everyone who has answered my newb questions and helped me
get it on the 3 spindles and tweak the install. Commandprompt.com was a big
help, and if I wanted to budget a bunch more $ and, mostly, if I were at
liberty to share my database with them, they might have helped me get through
all the issues. I am not sure I am walking away feeling real good about
Postgres, because it just should not take a rocket scientist to get it to
work, and I used to think I was fairly smart and could figure stuff out, and
I hate admitting defeat (especially since we have everything working with
Postgres now).


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] speed of querry?

2005-04-18 Thread Joel Fradkin
pgAdmin III uses libpq, not the ODBC driver.

Sorry I am not too aware of all the semantics.
I guess the question is if it is normal to take 2 mins to get 160K of
records, or is there something else I can do (I plan on limiting the query
screens using limit and offset; I realize this will only be effective for
the early part of the returned record set, but I believe they don't page
through a bunch of records, they probably add search criteria). But for
reporting I will need to return all the records and this seems slow to me
(but it might be in line with what I get now; I will have to do some
benchmarking).
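
For what it is worth, a rough sketch of the limit/offset paging mentioned
above, against the view used elsewhere in these threads; an ORDER BY is
needed for the pages to be stable, and large OFFSETs still pay the cost of
the rows they skip:

-- page 1 of the search screen; page 2 would use OFFSET 50, and so on
SELECT clientnum, associatenum, lastname, firstname, jobtitle
  FROM viwassoclist
 WHERE clientnum = 'SAKS'
 ORDER BY lastname, firstname
 LIMIT 50 OFFSET 0;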

The application is a mixture of .net and asp and will soon have java.
So I am using the .net library for the .net pages and the ODBC driver for
the asp pages.

I did find that using a view for the location join sped up the query a great
deal. I will have to see if there are other places I can use that thinking
(instead of joining on the associate table and its dependents, I can just
join on a view of that data, etc.).

Basically, I have a view that joins location to the district, region, and
division tables. The old viwassoclist had those joined to the assoc table
directly; I changed it to use the view I created, where those tables are
joined to the location table, and in assoclist I just join to the location
view. This really made a huge difference in speed.
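
A rough sketch of the kind of helper view being described; tbldistrict,
tblregion, and tbldivision (and their name columns) are guesses for
illustration only, since only tbllocation's definition appears in these
threads:

CREATE OR REPLACE VIEW viwlocation AS
 SELECT l.clientnum, l.locationid, l.name AS locationname,
        l.districtid, l.regionid, l.divisionid,
        d.name AS districtname, r.name AS regionname, v.name AS divisionname
   FROM tbllocation l
   JOIN tbldistrict d ON d.clientnum = l.clientnum AND d.districtid = l.districtid
   JOIN tblregion   r ON r.clientnum = l.clientnum AND r.regionid   = l.regionid
   JOIN tbldivision v ON v.clientnum = l.clientnum AND v.divisionid = l.divisionid;

-- viwassoclist can then join the associate table to this one view:
-- ... FROM tblassociate a JOIN viwlocation lv
--        ON a.locationid = lv.locationid AND a.clientnum = lv.clientnum ...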

Regards, Dave


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


FW: [PERFORM] speed of querry?

2005-04-18 Thread Joel Fradkin
Sorry if this posts twice; I posted it and did not see it hit the list.

> What are the statistics for tbljobtitle.id and tbljobtitle.clientnum?
I added default_statistics_target = 250 to the config and re-loaded the
database, if that is what you mean.

> --- how many distinct values of each,

tbljobtitle.id: 6764 for all clients, 1018 for SAKS
tbljobtitle.clientnum: 237 distinct clientnums, just 1 for SAKS

> and are the distributions skewed to a few popular values?
There are 3903 distinct values for jobtitle.

Not sure if I answered the questions; let me know if you need more info.
It appears there are 1018 job titles in the table for SAKS and 6764 for all
the clients. There can be more values, as the presentation layer can have
more than one value for an id. SAKS is not using the presentation layer yet,
as there are only 1018 distinct values, 1 for each id.

Joel




---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


FW: [PERFORM] speed of querry?

2005-04-18 Thread Joel Fradkin
Another odd thing: when I tried turning off merge joins on the XP desktop,
it took 32 secs to run compared to the 6 secs it was taking.
On the Linux (4 proc) box it is now running in 3 secs with the merge joins
turned off.

Unfortunately it takes over 2 minutes to actually return the 160,000+ rows.
I am guessing that is either network (I have gig cards on a LAN) or perhaps
the ODBC driver (using PGADMIN III to do the select).

I tried to run it in psql on the server, but it was paging the output through
more. If I redirect with > test.txt will it run it all out so I can get a
time? Does it display the time anywhere like pgAdmin III does?
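
(For reference, a hedged sketch of doing this inside psql itself: \pset pager
off keeps the output from going through more, \o sends the rows to a file,
and \timing prints an elapsed time for each query much like pgAdmin does. The
figure still includes formatting and writing the file, not just backend time.)

\pset pager off
\timing
\o test.txt
SELECT * FROM viwassoclist WHERE clientnum = 'SAKS';
-- stop redirecting and toggle timing back off
\o
\timing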


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] speed of querry?

2005-04-15 Thread Joel Fradkin


Joel Fradkin
 
Turning off merge joins seems to have done it, but what do I need to do so I
am not explicitly telling the system not to use them? I must be missing some
setting.
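
(For reference, the session-level switch being described is just a GUC,
sketched below; the usual longer-term fix is adjusting random_page_cost and
statistics so the planner stops preferring the merge join on its own, rather
than leaving the node type disabled.)

SET enable_mergejoin = off;   -- experiment for the current session only
-- and to put it back:
SET enable_mergejoin = on;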

On linux box.

explain analyze select * from viwassoclist where clientnum ='SAKS'

"Hash Join  (cost=988.25..292835.36 rows=15773 width=113) (actual
time=23.514..3024.064 rows=160593 loops=1)"
"  Hash Cond: ("outer".locationid = "inner".locationid)"
"  ->  Hash Left Join  (cost=185.57..226218.77 rows=177236 width=75) (actual
time=21.147..2221.098 rows=177041 loops=1)"
"Hash Cond: (("outer".jobtitleid = "inner".id) AND
(("outer".clientnum)::text = ("inner".clientnum)::text))"
"->  Seq Scan on tblassociate a  (cost=0.00..30851.25 rows=177236
width=53) (actual time=0.390..1095.385 rows=177041 loops=1)"
"  Filter: ((clientnum)::text = 'SAKS'::text)"
"->  Hash  (cost=152.55..152.55 rows=6604 width=37) (actual
time=20.609..20.609 rows=0 loops=1)"
"  ->  Seq Scan on tbljobtitle jt  (cost=0.00..152.55 rows=6604
width=37) (actual time=0.033..12.319 rows=6603 loops=1)"
"Filter: (1 = presentationid)"
"  ->  Hash  (cost=801.54..801.54 rows=454 width=49) (actual
time=2.196..2.196 rows=0 loops=1)"
"->  Index Scan using ix_location on tbllocation l
(cost=0.00..801.54 rows=454 width=49) (actual time=0.111..1.755 rows=441
loops=1)"
"  Index Cond: ('SAKS'::text = (clientnum)::text)"
"Total runtime: 3120.366 ms"

here are the table defs and view if that helps. I posted the config a while
back, but can do it again if you need to see it.

CREATE OR REPLACE VIEW viwassoclist AS 
 SELECT a.clientnum, a.associateid, a.associatenum, a.lastname, a.firstname,
jt.value AS jobtitle, l.name AS "location", l.locationid AS mainlocationid,
l.divisionid, l.regionid, l.districtid, (a.lastname::text || ', '::text) ||
a.firstname::text AS assocname, a.isactive, a.isdeleted
   FROM tblassociate a
   LEFT JOIN tbljobtitle jt ON a.jobtitleid = jt.id AND jt.clientnum::text =
a.clientnum::text AND 1 = jt.presentationid
   JOIN tbllocation l ON a.locationid = l.locationid AND l.clientnum::text =
a.clientnum::text;

CREATE TABLE tblassociate
(
  clientnum varchar(16) NOT NULL,
  associateid int4 NOT NULL,
  associatenum varchar(10),
  firstname varchar(50),
  middleinit varchar(5),
  lastname varchar(50),
  ssn varchar(18),
  dob timestamp,
  address varchar(100),
  city varchar(50),
  state varchar(50),
  country varchar(50),
  zip varchar(10),
  homephone varchar(14),
  cellphone varchar(14),
  pager varchar(14),
  associateaccount varchar(50),
  doh timestamp,
  dot timestamp,
  rehiredate timestamp,
  lastdayworked timestamp,
  staffexecid int4,
  jobtitleid int4,
  locationid int4,
  deptid int4,
  positionnum int4,
  worktypeid int4,
  sexid int4,
  maritalstatusid int4,
  ethnicityid int4,
  weight float8,
  heightfeet int4,
  heightinches int4,
  haircolorid int4,
  eyecolorid int4,
  isonalarmlist bool NOT NULL DEFAULT false,
  isactive bool NOT NULL DEFAULT true,
  ismanager bool NOT NULL DEFAULT false,
  issecurity bool NOT NULL DEFAULT false,
  createdbyid int4,
  isdeleted bool NOT NULL DEFAULT false,
  militarybranchid int4,
  militarystatusid int4,
  patrontypeid int4,
  identificationtypeid int4,
  workaddress varchar(200),
  testtypeid int4,
  testscore int4,
  pin int4,
  county varchar(50),
  CONSTRAINT pk_tblassociate PRIMARY KEY (clientnum, associateid),
  CONSTRAINT ix_tblassociate UNIQUE (clientnum, associatenum)
)
CREATE TABLE tbljobtitle
(
  clientnum varchar(16) NOT NULL,
  id int4 NOT NULL,
  value varchar(50),
  code varchar(16),
  isdeleted bool DEFAULT false,
  presentationid int4 NOT NULL DEFAULT 1,
  CONSTRAINT pk_tbljobtitle PRIMARY KEY (clientnum, id, presentationid)
)
CREATE TABLE tbllocation
(
  clientnum varchar(16) NOT NULL,
  locationid int4 NOT NULL,
  districtid int4 NOT NULL,
  regionid int4 NOT NULL,
  divisionid int4 NOT NULL,
  locationnum varchar(8),
  name varchar(50),
  clientlocnum varchar(50),
  address varchar(100),
  address2 varchar(100),
  city varchar(50),
  state varchar(2) NOT NULL DEFAULT 'zz'::character varying,
  zip varchar(10),
  countryid int4,
  phone varchar(15),
  fax varchar(15),
  payname varchar(40),
  contact char(36),
  active bool NOT NULL DEFAULT true,
  coiprogram text,
  coilimit text,
  coiuser varchar(255),
  coidatetime varchar(32),
  ec_note_field varchar(1050),
  locationtypeid int4,
  open_time timestamp,
  close_time timestamp,
  insurance_loc_id varchar(50),
  lpregionid int4,
  sic int4,
  CONSTRAINT pk_tbllocation PRIMARY KEY (clientnum, locationid),
  CONSTRAINT ix_tbllocation_1 UNIQUE (clientnum, locationnum, name),
  CONSTRAINT ix_tbllocation_unique_number UNIQUE (clientnum, divisionid,
regionid, districtid, locationnum)
)

Re: [PERFORM] speed of querry?

2005-04-15 Thread Joel Fradkin
It is still slower on the Linux box. (Included below is the explain with SET
enable_seqscan = off;
explain analyze select * from viwassoclist where clientnum ='SAKS')

I did a few other tests (changing drive arrays helped by 1 second; it was
slower on my raid 10 on the powervault).

Pulling just raw data is much faster on the Linux box.
"Seq Scan on tblresponse_line  (cost=1.00..100089717.78 rows=4032078
width=67) (actual time=0.028..4600.431 rows=4032078 loops=1)"
"Total runtime: 6809.399 ms"
Windows box
"Seq Scan on tblresponse_line  (cost=0.00..93203.68 rows=4031968 width=67)
(actual time=16.000..11316.000 rows=4031968 loops=1)"
"Total runtime: 16672.000 ms"

I am going to reload the databases, just to see what I get.
I am thinking I may have to flatten the files for Postgres (eliminate joins
of any kind for reporting, etc.). It might make a good deal more data, but I
think from the app's point of view it is a good idea anyway; I'm just not
sure how to handle editing.
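
A minimal sketch of the flattening idea, with made-up object names; it simply
materializes the existing view into a real table and indexes it, which is why
the editing question matters: the copy has to be rebuilt or kept in sync
whenever the base tables change.

-- materialize the reporting view into a plain table (names are illustrative)
CREATE TABLE tblassoclist_flat AS
  SELECT * FROM viwassoclist;

CREATE INDEX ix_assoclist_flat_clientnum ON tblassoclist_flat (clientnum);
ANALYZE tblassoclist_flat;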

Joel Fradkin
 
"Merge Join  (cost=49697.60..50744.71 rows=14987 width=113) (actual
time=11301.160..12171.072 rows=160593 loops=1)"
"  Merge Cond: ("outer".locationid = "inner".locationid)"
"  ->  Sort  (cost=788.81..789.89 rows=432 width=49) (actual
time=3.318..3.603 rows=441 loops=1)"
"Sort Key: l.locationid"
"->  Index Scan using ix_location on tbllocation l
(cost=0.00..769.90 rows=432 width=49) (actual time=0.145..2.283 rows=441
loops=1)"
"  Index Cond: ('SAKS'::text = (clientnum)::text)"
"  ->  Sort  (cost=48908.79..49352.17 rows=177352 width=75) (actual
time=11297.774..11463.780 rows=160594 loops=1)"
"Sort Key: a.locationid"
"->  Merge Right Join  (cost=26247.95..28942.93 rows=177352
width=75) (actual time=8357.010..9335.362 rows=177041 loops=1)"
"  Merge Cond: ((("outer".clientnum)::text =
"inner"."?column10?") AND ("outer".id = "inner".jobtitleid))"
"  ->  Index Scan using ix_tbljobtitle_id on tbljobtitle jt
(cost=0.00..243.76 rows=6604 width=37) (actual time=0.122..12.049 rows=5690
loops=1)"
"Filter: (1 = presentationid)"
"  ->  Sort  (cost=26247.95..26691.33 rows=177352 width=53)
(actual time=8342.271..8554.943 rows=177041 loops=1)"
"Sort Key: (a.clientnum)::text, a.jobtitleid"
"->  Index Scan using ix_associate_clientnum on
tblassociate a  (cost=0.00..10786.17 rows=177352 width=53) (actual
time=0.166..1126.052 rows=177041 loops=1)"
"  Index Cond: ((clientnum)::text = 'SAKS'::text)"
"Total runtime: 12287.502 ms"


This is above and beyond toying with the column statistics.  You
are basically telling the planner to use an index.  Try this,
and post the EXPLAIN ANALYZE for the seqscan = off case on the
slow box if it doesn't speed things up for you.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] speed of querry?

2005-04-14 Thread Joel Fradkin
Well, so far I have 1.5 hours with commandprompt.com and 8+ hours with Dell
and have not seen any changes in the speed of my query.

I did move the database to the raid 5 drives and did see a 1 second
improvement, from 13 secs to 12 secs (keep in mind it runs in 6 on the
Optiplex).

The Dell guy ran Bonnie and found 40 MB per sec read/write speed for the
arrays.

He also installed version 8.0.2 (it went fine on AS4; he had to uninstall
8.0.1 first).

He is going to get a 6650 in his test lab to see what he can figure out.
I will say both commandprompt.com and Dell have been very professional and I
am impressed at the level of support available for Redhat from Dell and
postgres. As always I still feel this list has been my most useful asset,
but I am glad there are folks to call on. I am trying to go live soon and
need to get this resolved.

I told the guy from Dell it makes no sense that a Windows 2.4 GHz single proc
with 750 MB of RAM can go faster than a 4 proc (3 GHz) 8 gig machine.
Both databases were restored from the same file. Same view, etc.

Config files are set the same except for the amount of cached RAM, although
Commandprompt.com had me adjust a few items that should help going into
production, but the planning stuff is basically the same.

This view returns in 3 secs on MSSQL server on the Optiplex (750 MB, 2.4 GHz
box), 6 secs using Postgres on Windows, and 12-13 secs on the 4 processor
box. Needless to say I am very frustrated. Maybe Dell will turn up something
testing in their lab. It took a bit of perseverance to get to the right guy
at Dell (the first guy actually told me to load it all on a like machine, and
if it was very much slower on my original they would pursue it, otherwise it
was not an issue; I was like, the machine cost 30K, are you going to send me
one to test that?). But seriously, I am open to trying anything (loading AS3,
using Postgres 7.4)? The fellow at Dell does not think it is a hardware
problem, so if it is Linux (could very well be, but he seemed very sharp and
did not come up with anything yet) or the Postgres config (again, Josh at
commandprompt.com was very sharp), then what do I do now to isolate the
issue? At least they are loading one in the lab (in theory; I can't send them
my database, so who knows what they will test). Dell changed the file system
to ext2; is that going to bite me in the butt? It did not seem to change the
speed of my explain analyze.

Joel Fradkin
 

Dawid Kuroczko <[EMAIL PROTECTED]> writes:
> Basically it tells postgres how many values should it keep for
> statistics per column.  The config default_statistics_target
> is the default (= used when creating table) and ALTER... is
> a way to change it later.

Not quite.  default_statistics_target is the value used by ANALYZE for
any column that hasn't had an explicit ALTER SET STATISTICS done on it.
So you can change default_statistics_target and that will affect
existing tables.

(It used to work the way you are saying, but that was a few releases
back...)

regards, tom lane


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] speed of querry?

2005-04-14 Thread Joel Fradkin
I did as described to alter table and did not see any difference in speed.
I am trying to undo the symbolic link to the data array and set it up on
raid 5 disks in the machine just to test if there is an issue with the
config of the raid 10 array or a problem with the controller.

I am kinda lame at Linux, so I'm not sure I have got it yet; still testing.
Still kind of puzzled why it chose two different options, but one is running
the Windows version of Postgres, so maybe that has something to do with it.

The databases and configs (as far as page cost) are the same.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 14, 2005 11:21 AM
To: Dawid Kuroczko
Cc: Joel Fradkin; PERFORM
Subject: Re: [PERFORM] speed of querry? 

Dawid Kuroczko <[EMAIL PROTECTED]> writes:
> Basically it tells postgres how many values should it keep for
> statistics per column.  The config default_statistics_target
> is the default (= used when creating table) and ALTER... is
> a way to change it later.

Not quite.  default_statistics_target is the value used by ANALYZE for
any column that hasn't had an explicit ALTER SET STATISTICS done on it.
So you can change default_statistics_target and that will affect
existing tables.

(It used to work the way you are saying, but that was a few releases
back...)

regards, tom lane


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] speed of querry?

2005-04-14 Thread Joel Fradkin
Josh from commandprompt.com had me alter the config to have
default_statistics_target = 250

Is this somehow related to what you're asking me to do?
I did do an analyze, but have only run the view a few times.

Joel Fradkin
 
-Original Message-
From: Dawid Kuroczko [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 14, 2005 9:21 AM
To: Joel Fradkin
Cc: PostgreSQL Perform
Subject: Re: [PERFORM] speed of querry?

On 4/14/05, Joel Fradkin <[EMAIL PROTECTED]> wrote:
> I have done a vacuum and a vacuum analyze.
> I can try again for kicks, but it is not in production so no new records
are
> added and vacuum analyze is ran after any mods to the indexes.
> 
> I am still pursuing Dell on why the monster box is so much slower then the
> desktop as well.

First thing:  Do something like:
ALTER TABLE tbljobtitle ALTER COLUMN clientnum SET STATISTICS 50;
make it for each column used, make it even higher than 50 for
many-values columns.
THEN make VACUUM ANALYZE;

Then do a query couple of times (EXPLAIN ANALYZE also :)), then do:
SET enable_seqscan = off;
and rerun the query -- if it was significantly faster, you will want to do:
SET enable_seqscan = on;
and tweak:
SET random_page_cost = 2.1;
...and play with values.  When you reach the random_page_cost which
suits your data, you will want to put it into postgresql.conf

I am sorry if it is already known to you. :)  Also, it is a rather
simplistic
approach to tuning PostgreSQL but it is worth doing.  Especially the
statistics part. :)

   Regards,
   Dawid


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] speed of querry?

2005-04-14 Thread Joel Fradkin
I have done a vacuum and a vacuum analyze.
I can try again for kicks, but it is not in production, so no new records are
added and vacuum analyze is run after any mods to the indexes.

I am still pursuing Dell on why the monster box is so much slower than the
desktop as well.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 14, 2005 1:47 AM
To: Richard Huxton
Cc: Joel Fradkin; PostgreSQL Perform
Subject: Re: [PERFORM] speed of querry? 

Richard Huxton  writes:
> In the first, we match outer.clientnum to inner.clientnum, in the second 
> it's "?column10?" - are you sure the query was identical in each case. 
> I'm guessing the unidentified column in query 2 is the reason for the 
> sort a couple of lines below it, which seems to take up a large chunk of 
> time.

The "?column10?" is because EXPLAIN isn't excessively bright about
reporting references to outputs of lower plan nodes.  (Gotta fix that
sometime.)  The real point here is that the planner thought that a scan
plus sort would be faster than scanning an index that exactly matched
the sort order the Merge Join needed ... and it was wrong :-(

So this is just the usual sort of question of "are your stats up to
date, maybe you need to increase stats targets, or else play with
random_page_cost, etc" ...

regards, tom lane


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] speed of querry?

2005-04-13 Thread Joel Fradkin
> are you sure the query was identical in each case.

I just ran it a second time with the same results, ensuring that the query is
the same. Not sure why it is doing a column10 thing. Any ideas what to look
for? Both databases are a restore from the same backup file.

One is running Redhat, the other XP; I believe both are the same version of
Postgres except for the different platform (8.0.1, I am pretty sure).

I just spent the morning with Dell hoping for some explanation from them.
They said I had to have the database on the same type of OS and hardware for
them to think the issue was hardware. They are escalating to the software
group.

I did a default Redhat install, so it very well may be an issue with my lack
of knowledge on Linux.

He did mention that by default the Perc4 does cache, so I may need to visit
the data center to set the Percs to not cache.

--
   Richard Huxton
   Archonet Ltd


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] speed of querry?

2005-04-13 Thread Joel Fradkin
I must be missing something important, because I am just not seeing why this
query is slower on a 4 processor 8 gig machine running redhat AS4.

The SQL:
explain analyze SELECT a.clientnum, a.associateid, a.associatenum,
a.lastname, a.firstname, jt.value AS jobtitle, l.name AS "location",
l.locationid AS mainlocationid, l.divisionid, l.regionid, l.districtid,
(a.lastname::text || ', '::text) || a.firstname::text AS assocname,
a.isactive, a.isdeleted
   FROM tblassociate a
   left JOIN tbljobtitle jt ON a.jobtitleid = jt.id AND jt.clientnum::text =
a.clientnum::text AND 1 = jt.presentationid
   JOIN tbllocation l ON a.locationid = l.locationid AND l.clientnum::text =
a.clientnum::text
where a.clientnum = 'SAKS'; 

Machine 1 my desktop:
"Merge Join  (cost=74970.51..75975.46 rows=8244 width=113) (actual
time=5141.000..6363.000 rows=160593 loops=1)"
"  Merge Cond: ("outer".locationid = "inner".locationid)"
"  ->  Sort  (cost=656.22..657.11 rows=354 width=49) (actual
time=16.000..16.000 rows=441 loops=1)"
"Sort Key: l.locationid"
"->  Index Scan using ix_location on tbllocation l
(cost=0.00..641.23 rows=354 width=49) (actual time=0.000..0.000 rows=441
loops=1)"
"  Index Cond: ('SAKS'::text = (clientnum)::text)"
"  ->  Sort  (cost=74314.29..74791.06 rows=190710 width=75) (actual
time=5125.000..5316.000 rows=160594 loops=1)"
"Sort Key: a.locationid"
"->  Merge Right Join  (cost=0.00..52366.50 rows=190710 width=75)
(actual time=16.000..1973.000 rows=177041 loops=1)"
"  Merge Cond: ((("outer".clientnum)::text =
("inner".clientnum)::text) AND ("outer".id = "inner".jobtitleid))"
"  ->  Index Scan using ix_tbljobtitle_id on tbljobtitle jt
(cost=0.00..244.75 rows=6622 width=37) (actual time=0.000..16.000 rows=5690
loops=1)"
"Filter: (1 = presentationid)"
"  ->  Index Scan using ix_tblassoc_jobtitleid on tblassociate a
(cost=0.00..50523.83 rows=190710 width=53) (actual time=0.000..643.000
rows=177041 loops=1)"
"Index Cond: ((clientnum)::text = 'SAKS'::text)"
"Total runtime: 6719.000 ms"

Test Linux machine:
"Merge Join  (cost=48126.04..49173.57 rows=15409 width=113) (actual
time=11832.165..12678.025 rows=160593 loops=1)"
"  Merge Cond: ("outer".locationid = "inner".locationid)"
"  ->  Sort  (cost=807.64..808.75 rows=443 width=49) (actual
time=2.418..2.692 rows=441 loops=1)"
"Sort Key: l.locationid"
"->  Index Scan using ix_location on tbllocation l
(cost=0.00..788.17 rows=443 width=49) (actual time=0.036..1.677 rows=441
loops=1)"
"  Index Cond: ('SAKS'::text = (clientnum)::text)"
"  ->  Sort  (cost=47318.40..47758.44 rows=176015 width=75) (actual
time=11829.660..12002.746 rows=160594 loops=1)"
"Sort Key: a.locationid"
"->  Merge Right Join  (cost=24825.80..27512.71 rows=176015
width=75) (actual time=8743.848..9750.775 rows=177041 loops=1)"
"  Merge Cond: ((("outer".clientnum)::text =
"inner"."?column10?") AND ("outer".id = "inner".jobtitleid))"
"  ->  Index Scan using ix_tbljobtitle_id on tbljobtitle jt
(cost=0.00..239.76 rows=6604 width=37) (actual time=0.016..11.323 rows=5690
loops=1)"
"Filter: (1 = presentationid)"
"  ->  Sort  (cost=24825.80..25265.84 rows=176015 width=53)
(actual time=8729.320..8945.292 rows=177041 loops=1)"
"Sort Key: (a.clientnum)::text, a.jobtitleid"
"->  Index Scan using ix_associate_clientnum on
tblassociate a  (cost=0.00..9490.20 rows=176015 width=53) (actual
time=0.036..1071.867 rows=177041 loops=1)"
"  Index Cond: ((clientnum)::text = 'SAKS'::text)"
"Total runtime: 12802.019 ms"

I tried removing the left outer join thinking it would speed things up, but
it used a seq scan on tblassoc and ran 2 times slower.


Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Is there somthing I need to do on my production server?

2005-04-11 Thread Joel Fradkin


Here is the config for the AS4 server.
# -
# PostgreSQL configuration file
# -
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload". Some settings, such as listen_address, require
# a postmaster shutdown and restart to take effect.


#---
# FILE LOCATIONS
#---

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.
# data_directory = 'ConfigDir'  # use data in another directory
#data_directory = '/pgdata/data'
# hba_file = 'ConfigDir/pg_hba.conf'# the host-based authentication file
# ident_file = 'ConfigDir/pg_ident.conf'  # the IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
# external_pid_file = '(none)'  # write an extra pid file


#---
# CONNECTIONS AND AUTHENTICATION
#---

# - Connection Settings -

#listen_addresses = 'localhost' # what IP interface(s) to listen on;
# defaults to localhost, '*' = any

listen_addresses = '*'
port = 5432
max_connections = 100
# note: increasing max_connections costs about 500 bytes of shared
# memory per connection slot, in addition to costs from shared_buffers
# and max_locks_per_transaction.
#superuser_reserved_connections = 2
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal
#rendezvous_name = ''   # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60# 1-600, in seconds
#ssl = false
#password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false

#---
# RESOURCE USAGE (except WAL)
#---

# - Memory -

shared_buffers = 12288 #5000   # min 16, at least max_connections*2, 8KB each
#work_mem = 1024# min 64, size in KB
work_mem = 16384 # 8192
#maintenance_work_mem = 16384   # min 1024, size in KB
#max_stack_depth = 2048 # min 100, size in KB

# - Free Space Map -

max_fsm_pages = 10  #3  # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1500 #1000  # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000   # min 25
#preload_libraries = ''

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0  # 0-1000 milliseconds
#vacuum_cost_page_hit = 1   # 0-10000 credits
#vacuum_cost_page_miss = 10 # 0-10000 credits
#vacuum_cost_page_dirty = 20# 0-10000 credits
#vacuum_cost_limit = 200# 0-10000 credits

# - Background writer -

#bgwriter_delay = 200   # 10-10000 milliseconds between rounds
#bgwriter_percent = 1   # 0-100% of dirty buffers in each round
#bgwriter_maxpages = 100# 0-1000 buffers max per round


#---
# WRITE AHEAD LOG
#---

# - Settings -
fsync = true# turns forced synchronization on or off
wal_sync_method = open_sync #fsync  # the default varies across platforms:
# fsync, fdatasync, open_sync, or open_datasync
wal_buffers = 2048 #8   # min 4, 8KB each
#commit_delay = 0   # range 0-100000, in microseconds
#commit_siblings = 5# range 1-1000

# - Checkpoints -

checkpoint_segments = 100 #3# in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300   # range 30-3600, in seconds
#checkpoint_warning = 30# 0 is off, in seconds

# - Archiving -

#archive_command = ''   # command to use to archive a logfile segment


#---

[PERFORM] Is there somthing I need to do on my production server?

2005-04-11 Thread Joel Fradkin
I am running 8.0.1 on a desktop XP system and an AS4 Redhat system.
The Redhat box will be my production server in a week or so, and it is
returning results slower than my desktop.
I understand about the Perc cards on the Dell (Redhat), but my Dell 2 proc
box runs much faster (MSSQL) than my desktop, so I am wondering if I messed
up Linux or have a Postgres config issue.

On my desktop (1 proc, 2 gigs of memory) I get:
"Merge Join  (cost=7135.56..7296.25 rows=7906 width=228) (actual
time=5281.000..6266.000 rows=160593 loops=1)"
"  Merge Cond: ("outer".locationid = "inner".locationid)"
"  ->  Sort  (cost=955.78..957.07 rows=514 width=79) (actual
time=0.000..0.000 rows=441 loops=1)"
"    Sort Key: l.locationid"
"    ->  Index Scan using ix_location on tbllocation l 
(cost=0.00..932.64 rows=514 width=79) (actual time=0.000..0.000 rows=441
loops=1)"
"  Index Cond: ('SAKS'::text = (clientnum)::text)"
"  ->  Sort  (cost=6179.77..6187.46 rows=3076 width=173) (actual
time=5281.000..5424.000 rows=160594 loops=1)"
"    Sort Key: a.locationid"
"    ->  Merge Left Join  (cost=154.41..6001.57 rows=3076 width=173)
(actual time=94.000..2875.000 rows=177041 loops=1)"
"  Merge Cond: ((("outer".clientnum)::text =
"inner"."?column4?") AND ("outer".jobtitleid = "inner".id))"
"  ->  Index Scan using ix_tblassoc_jobtitleid on tblassociate
a  (cost=0.00..5831.49 rows=3076 width=134) (actual time=0.000..676.000
rows=177041 loops=1)"
"    Index Cond: ((clientnum)::text = 'SAKS'::text)"
"  ->  Sort  (cost=154.41..154.50 rows=34 width=67) (actual
time=78.000..204.000 rows=158255 loops=1)"
"    Sort Key: (jt.clientnum)::text, jt.id"
"    ->  Seq Scan on tbljobtitle jt  (cost=0.00..153.55
rows=34 width=67) (actual time=0.000..31.000 rows=6603 loops=1)"
"  Filter: (1 = presentationid)"
"Total runtime: 6563.000 ms"
On my production (4 proc, 8 gigs of memory)
"Merge Join  (cost=69667.87..70713.46 rows=15002 width=113) (actual
time=12140.091..12977.841 rows=160593 loops=1)"
"  Merge Cond: ("outer".locationid = "inner".locationid)"
"  ->  Sort  (cost=790.03..791.11 rows=433 width=49) (actual
time=2.936..3.219 rows=441 loops=1)"
"    Sort Key: l.locationid"
"    ->  Index Scan using ix_location on tbllocation l 
(cost=0.00..771.06 rows=433 width=49) (actual time=0.062..1.981 rows=441
loops=1)"
"  Index Cond: ('SAKS'::text = (clientnum)::text)"
"  ->  Sort  (cost=68877.84..69320.17 rows=176933 width=75) (actual
time=12137.081..12305.125 rows=160594 loops=1)"
"    Sort Key: a.locationid"
"    ->  Merge Right Join  (cost=46271.48..48961.53 rows=176933
width=75) (actual time=9096.623..10092.311 rows=177041 loops=1)"
"  Merge Cond: ((("outer".clientnum)::text =
"inner"."?column10?") AND ("outer".id = "inner".jobtitleid))"
"  ->  Index Scan using ix_tbljobtitle_id on tbljobtitle jt 
(cost=0.00..239.76 rows=6604 width=37) (actual time=0.068..12.157 rows=5690
loops=1)"
"    Filter: (1 = presentationid)"
"  ->  Sort  (cost=46271.48..46713.81 rows=176933 width=53)
(actual time=9081.546..9295.495 rows=177041 loops=1)"
"    Sort Key: (a.clientnum)::text, a.jobtitleid"
"    ->  Seq Scan on tblassociate a  (cost=0.00..30849.25
rows=176933 width=53) (actual time=543.931..1674.518 rows=177041 loops=1)"
"  Filter: ((clientnum)::text = 'SAKS'::text)"
"Total runtime: 13101.402 ms"
 
I am at a bit of a loss, as I would have thought my soon-to-be production box
would be blowing away my desktop.
 
Also, a stupid newb question:
I am a bit confused looking at the results of explain analyze.
I would have thought the explain analyze select * from viwassoclist where
clientnum ='SAKS'
would first limit the result set by clientnum = 'SAKS'; is this the bottom
line?
"    ->  Seq Scan on tblassociate a  (cost=0.00..30849.25
rows=176933 width=53) (actual time=543.931..1674.518 rows=177041 loops=1)"
"  Filter: ((clientnum)::text = 'SAKS'::text)"
which, if I understand this (not saying I do), is taking actual
time=543.931..1674.518 rows=177041 loops=1;
does this mean 1 loop takes between 543 and 1674 millisecs to return 177041
rows? And the analyzer thought it would take cost=0.00..30849.25?
 

Re: [PERFORM] Any way to speed this up?

2005-04-07 Thread Joel Fradkin
Here is the result after putting it back to 4, the original value (I had done
that prior to your suggestion of using 2 or 3), to see what might change.
I also vacuumed, and thought I saw records deleted in associate, which I
found odd as this is a test site and no new records were added or deleted.

"Merge Join  (cost=86788.09..87945.00 rows=10387 width=112) (actual
time=19703.000..21154.000 rows=159959 loops=1)"
"  Merge Cond: ("outer".locationid = "inner".locationid)"
"  ->  Sort  (cost=1245.50..1246.33 rows=332 width=48) (actual
time=62.000..62.000 rows=441 loops=1)"
"Sort Key: l.locationid"
"->  Index Scan using ix_location on tbllocation l
(cost=0.00..1231.60 rows=332 width=48) (actual time=15.000..62.000 rows=441
loops=1)"
"  Index Cond: ('SAKS'::text = (clientnum)::text)"
"  ->  Sort  (cost=85542.59..86042.39 rows=199922 width=75) (actual
time=19641.000..19955.000 rows=159960 loops=1)"
"Sort Key: a.locationid"
"->  Merge Right Join  (cost=60850.40..62453.22 rows=199922
width=75) (actual time=13500.000..14734.000 rows=176431 loops=1)"
"  Merge Cond: (("outer".id = "inner".jobtitleid) AND
("outer"."?column4?" = "inner"."?column10?"))"
"  ->  Sort  (cost=554.11..570.13 rows=6409 width=37) (actual
time=94.000..94.000 rows=6391 loops=1)"
"Sort Key: jt.id, (jt.clientnum)::text"
"->  Seq Scan on tbljobtitle jt  (cost=0.00..148.88
rows=6409 width=37) (actual time=0.000..63.000 rows=6391 loops=1)"
"  Filter: (1 = presentationid)"
"  ->  Sort  (cost=60296.29..60796.09 rows=199922 width=53)
(actual time=13406.000..13859.000 rows=176431 loops=1)"
"Sort Key: a.jobtitleid, (a.clientnum)::text"
"->  Seq Scan on tblassociate a  (cost=0.00..38388.79
rows=199922 width=53) (actual time=62.000..10589.000 rows=176431 loops=1)"
"  Filter: ((clientnum)::text = 'SAKS'::text)"
"Total runtime: 22843.000 ms"

Joel Fradkin
 
-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 07, 2005 11:43 AM
To: Joel Fradkin
Cc: 'PostgreSQL Perform'
Subject: Re: [PERFORM] Any way to speed this up? 

"Joel Fradkin" <[EMAIL PROTECTED]> writes:
> random_page_cost = 1.2#4  # units are one sequential page
> fetch cost

That is almost certainly overoptimistic; it's causing the planner to
use indexscans when it shouldn't.  Try 2 or 3 or thereabouts.

regards, tom lane


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Any way to speed this up?

2005-04-07 Thread Joel Fradkin
shared_buffers = 8000   # min 16, at least max_connections*2, 8KB each
work_mem = 8192 #1024   # min 64, size in KB
max_fsm_pages = 3   # min max_fsm_relations*16, 6 bytes each
effective_cache_size = 4 #1000  # typically 8KB each
random_page_cost = 1.2 #4   # units are one sequential page fetch cost

These are the items I changed.
On the development box I turned random_page_cost down to .2 because I figured
it would all be faster using an index, as all my data is at a minimum being
selected by clientnum.

But the analyze I sent in is from these settings above on a Windows box.
If I run the analyze (pgAdmin) on a Windows box but connect to a Linux box,
will the times be accurate, or do I have to run the analyze on the Linux box
for that to happen?

I am a little unclear why I would need an index on associate by location, as
I thought it would be using the indexes on location and jobtitle for their
joins. I did not say where locationid = x in my query on the view.
I have so much to learn about SQL.
Joel
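
For reference, a hedged sketch of the kind of index being discussed (the name
is made up); the merge join above is on locationid after filtering by
clientnum, so a composite index matching that access pattern is the usual
suggestion:

-- illustrative only: lets the planner fetch associates by client and location
CREATE INDEX ix_tblassociate_client_loc ON tblassociate (clientnum, locationid);
ANALYZE tblassociate;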




---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] Any way to speed this up?

2005-04-07 Thread Joel Fradkin
  coiuser varchar(255),
  coidatetime varchar(32),
  ec_note_field varchar(1050),
  locationtypeid int4,
  open_time timestamp,
  close_time timestamp,
  insurance_loc_id varchar(50),
  lpregionid int4,
  sic int4,
  CONSTRAINT pk_tbllocation PRIMARY KEY (clientnum, locationid),
  CONSTRAINT ix_tbllocation_1 UNIQUE (clientnum, locationnum, name),
  CONSTRAINT ix_tbllocation_unique_number UNIQUE (clientnum,
divisionid, regionid, districtid, locationnum)
)

 

Joel Fradkin



 



 








Re: [PERFORM] Peformance Tuning Opterons/ Hard Disk Layout

2005-02-24 Thread Joel Fradkin
I am no expert, but I have been asking them a bunch and I think you're
missing a key concept.

The data is best spread over several drives.
I could be completely off, but if I understood (I just finished doing the
same kind of thing, minus several databases), you want your WAL on fast
drives in raid 1 and your data (as many drives as you can use) on raid 10
(these can be slower drives, but I saw you already have a bunch of 15k
drives). So you may get the best performance just using one database rather
than several smaller ones on mirrored data drives. Keep in mind that if you
go with ES4 (I am using AS4) and Postgres 8 you can add spindles and move
hard-hit tables to their own spindle.
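
A minimal sketch of the 8.0 tablespace feature mentioned above, with
illustrative paths and names; the directory must already exist and be owned
by the postgres user:

-- carve out a tablespace on the raid 10 array and move a hot table onto it
CREATE TABLESPACE raid10_data LOCATION '/raid10/pgdata';
ALTER TABLE tblbigtable SET TABLESPACE raid10_data;   -- table name is hypothetical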

Again I am no expert; just thought I would echo what I was informed.
I ended up using 2 15k drives in raid 1 for my WAL and 4 10k drives for my
data in raid 10. I ended up using links to these from the original install
of postgres on the raid 5, 4 15k drives inside the server itself. I believe
this gives me three separate raid arrays for my install with logs and such
on the raid 5, data on the raid 10 and wal on the raid 1. I am in the
testing and conversion phase and have found it very fast. I used a 4
processor Dell 6550, but I think from what I have been told your computer
would have been a better choice (CPU-wise). I am not using fibre, but I do
have a 14 drive powervault which I split to have the 15k's on one side and
the 10k's on the other, so I am using both channels of the controller. I have
been told that to get the best performance I should add as many 10k drives to
my data array as I can (but this was all I had in my budget). I have room
for 3 more drives on that side of the powervault.

Best of luck on your project.

Joel


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org