from:"Mohan, Ross"

Re: [PERFORM] Projecting currentdb to more users

2005-07-12 Thread Mohan, Ross

>From AMD's suit against Intel. Perhaps relevant to some PG/AMD issues. 

"...125. Intel has designed its compiler purposely to degrade performance when 
a program
is run on an AMD platform. To achieve this, Intel designed the compiler to 
compile code
along several alternate code paths. Some paths are executed when the program 
runs on an Intel
platform and others are executed when the program is operated on a computer 
with an AMD
microprocessor. (The choice of code path is determined when the program is 
started, using a
feature known as "CPUID" which identifies the computer's microprocessor.) By 
design, the
code paths were not created equally. If the program detects a "Genuine Intel" 
microprocessor,
it executes a fully optimized code path and operates with the maximum 
efficiency. However,
if the program detects an "Authentic AMD" microprocessor, it executes a 
different code path
that will degrade the program's performance or cause it to crash..."


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [PERFORM] Trying to figure out pgbench

2005-06-21 Thread Mohan, Ross

I had a similar experience. 

regardless of scaling, etc, I got same results. almost like flags
are not active. 

did

pgbench -I template1
and
pgbench -c 10 -t 50 -v -d 1 

and played around from there

This is on IBM pSeries, AIX5.3, PG8.0.2

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of William Yu
Sent: Tuesday, June 21, 2005 12:05 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Trying to figure out pgbench


My Dual Core Opteron server came in last week. I tried to do some 
benchmarks with pgbench to get some numbers on the difference between 
1x1 -> 2x1 -> 2x2 but no matter what I did, I kept getting the same TPS 
on all systems. Any hints on what the pgbench parameters I should be using?

In terms of production use, it definitely can handle more load. 
Previously, Apache/Perl had to run on a separate server to avoid a ~50% 
penalty. Now, the numbers are +15% performance even with Apache/Perl 
running on the same box as PostgreSQL. How much more load of course is 
what I'd like to quantify.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] Whence the Opterons?

2005-06-10 Thread Mohan, Ross

Richard, 

thanks for info. 

"...the RH supplied Postgres binary has issues..."

Would you have the time to provide a bit more info?
Version of PG? Nature of issues? Methods that resolved?

Thanks again, 

-- Ross

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard Rowell
Sent: Friday, June 10, 2005 8:34 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Whence the Opterons?

I will second the nod to Penguin computing.  We have a bit of Penguin hardware 
here (though the majority is Dell).  We did have issues with one machine a 
couple of years ago, but Penguin was very pro-active in addressing that.

We recently picked up a Dual Opteron system from them and have been very 
pleased with it so far.  

I would be careful of the RHES that it ships with though.  We had machine 
lockups immediately after the suggested  kernel update (had to down grade 
manually).  Also, the RH supplied Postgres binary has issues, so you would need 
to compile Postgres yourself until the next RH update.

On Fri, 2005-05-06 at 14:39 -0700, Mischa Sandberg wrote:
> After reading the comparisons between Opteron and Xeon processors for 
> Linux, I'd like to add an Opteron box to our stable of Dells and 
> Sparcs, for comparison.
> 
> IBM, Sun and HP have their fairly pricey Opteron systems.
> The IT people are not swell about unsupported purchases off ebay. 
> Anyone care to suggest any other vendors/distributors? Looking for 
> names with national support, so that we can recommend as much to our 
> customers.
> 
> Many thanks in advance.
-- 
--
Richard Rowell
[EMAIL PROTECTED]
Bowman Systems
(318) 213-8780

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

[PERFORM] test - pls delete and ignore

2005-05-25 Thread Mohan, Ross

You just couldn't help yourself, could you? :-)


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [PERFORM] [PORTS] Which library has these symbols? -- Eureka

2005-05-25 Thread Mohan, Ross

Title: RE: [PORTS] Which library has these symbols? -- Eureka






64-bit PG 8.0.2. is up and running on AIX5.3/power5


YES! ! !


The major thing:  setting some quirky LDFLAGS. 


Anyone interested in details, please ping. 


Thanks to Nick Addington, Vincent Vanwynsberghe, my SA, Sergey, 

and Tom Lane (for good-natured nudging)



My Next Task: Finding a Stress Test Harness to Load, and Query Data. 


Anyone have ideas? 


I am eagerly awaiting the DESTRUCTION of Oracle around here, and

"yes" I am an oracle DBA and think it's very good technology. 


Smiling, 


Ross Mohan


-Original Message-

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mohan, Ross

Sent: Wednesday, May 25, 2005 1:11 PM

To: [EMAIL PROTECTED]

Subject: Re: [PORTS] Which library has these symbols? 



Tom, 


they're all over the place, repeated in different

libraries, kind of a pain. Didn't realize that. I'll

just give linker a bunch of LIBPATH and LIBNAME directives

and have it run around. 



# ar -t ./postgresql-8.0.2/src/interfaces/ecpg/ecpglib/libecpg.a | egrep 'dirmod|path|pgstr|pgsleep' 

path.o


# ar -t ./postgresql-8.0.2/src/interfaces/ecpg/pgtypeslib/libpgtypes.a | egrep 'dirmod|path|pgstr|pgsleep' pgstrcasecmp.o

# ar -t ./postgresql-8.0.2/src/interfaces/libpq/libpq.a | egrep 'dirmod|path|pgstr|pgsleep'   

pgstrcasecmp.o 


# ar -t ./postgresql-8.0.2/src/port/libpgport.a | egrep 'dirmod|path|pgstr|pgsleep'   

dirmod.o

path.o  

pgsleep.o

pgstrcasecmp.o


# ar -t ./postgresql-8.0.2/src/port/libpgport_srv.a | egrep 'dirmod|path|pgstr|pgsleep' dirmod_srv.o path.o pgsleep.o pgstrcasecmp.o



I **really** want this in 64bit..funny this problem only shows up in 64, not 32 mode.  



Thanks for commenting --- That's ALWAYS welcome!


-- Ross


-Original Message-

From: Tom Lane [mailto:[EMAIL PROTECTED]] 

Sent: Tuesday, May 24, 2005 10:53 PM

To: Mohan, Ross

Cc: [EMAIL PROTECTED]

Subject: Re: [PORTS] Which library has these symbols? 



"Mohan, Ross" <[EMAIL PROTECTED]> writes:

> So Close, Yet So Far!


The specific symbols being complained of should be in libpgport_srv (see src/port).  Dunno why your platform is ignoring that library.  When you find out, let us know ;-)

            regards, tom lane


---(end of broadcast)---

TIP 5: Have you checked our extensive FAQ?


   http://www.postgresql.org/docs/faq

Re: [PERFORM] Prefetch - OffTopic

2005-05-10 Thread Mohan, Ross

Yes, that would be a sufficient (although not necessary) condition for being 
well and fine with kdB. 
Last time I used APL was.pre-Gregorian, so yea, that's scary to me, too. 

( Of course, one can use C/ODBC or Java/JDBC to reach kdB; once there, one uses 
SQL92, or
proprietary kSQL. )



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Browne
Sent: Tuesday, May 10, 2005 4:14 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Prefetch - OffTopic


[EMAIL PROTECTED] ("Mohan, Ross") writes:
> for time-series and "insane fast", nothing beats kdB, I believe
>
> www.kx.com

... Which is well and fine if you're prepared to require that all of the staff 
that interact with data are skilled APL hackers.  Skilled enough that they're 
all ready to leap into Whitney's ASCII-based variant, K.
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org") 
http://www.ntlug.org/~cbbrowne/functional.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror "And he 
must  be taken alive!" The command will be:  ``And try to take him alive if it 
is reasonably practical.''" <http://www.eviloverlord.com/>

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

[PERFORM] Prefetch - OffTopic

2005-05-10 Thread Mohan, Ross

for time-series and "insane fast", nothing beats kdB, I believe

www.kx.com

Not trying to Quisling-out PG here, just hoping to respond to Mr. Olson

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
Sent: Tuesday, May 10, 2005 2:54 PM
To: Greg Stark
Cc: [EMAIL PROTECTED]; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Prefetch 

Greg Stark <[EMAIL PROTECTED]> writes:
> Actually forcing things to use indexes is the wrong direction to go if 
> you're trying to process lots of data and want to stream it off disk 
> as rapidly as possible. I would think about whether you can structure 
> your data such that you can use sequential scans.

Agreed.

> In your application that might be hard. It sounds like you would need 
> more or less one table per stock ticker which would really be hard to 
> manage.

Actually, in a previous lifetime I used to do pretty much the same stuff Matt 
is working on.  The reason I suggested parallelizing is that what you want is 
usually not so much the 200day moving average of FOO, as the 200day moving 
averages of a whole bunch of things.  If your input table contains time-ordered 
data for all those things, then a seqscan works out pretty well.

> One thing you might look into is using the CLUSTER command. But 
> postgres doesn't maintain the cluster ordering so it would require 
> periodically rerunning it.

If the desired sort order is time-based, it falls out pretty much for free in 
this application, because historical data doesn't change -- you are only 
interested in appending at the right.

In said previous lifetime, we used Postgres for tracking our actual 
transactions, but we built a custom file format for storing the individual tick 
data.  That's not stuff you need transactional semantics for; the historical 
data is what it is.  Besides, you need to compress it as much as you can 
because there's always too much of it.  Machines are faster and disk space 
cheaper than they were at the time, but I'd still question the wisdom of using 
a Postgres row for each daily bar, let alone finer-grain data.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [PERFORM] Table Partitioning: Will it be supported in Future?

2005-04-26 Thread Mohan, Ross

Maybe he needs to spend $7K on performance improvements? 

;-)



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Josh Berkus
Sent: Tuesday, April 26, 2005 8:00 PM
To: Richard Huxton
Cc: [EMAIL PROTECTED]; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Table Partitioning: Will it be supported in Future?


Richard,

> I believe these are being worked on at the moment. You might want to 
> search the archives of the hackers mailing list to see if the plans 
> will suit your needs.

Actually, this is being discussed through the Bizgres project: 
www.bizgres.org.

However, I agree that a 1GB table is not in need of partitioning.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Mohan, Ross

Joel, thanks. A couple of things jump out there for
me, not a problem for a routine ODBC connection, but
perhaps in the "lotsa stuff" context of your current
explorations, it might be relevant?

I am completely shooting from the hip, here, but...if
it were my goose to cook, I'd be investigating

Session("StringConn") = 
"DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=;
PWD=;ReadOnly=0;Protocol=6.4;

|| Protocol? Is this related to version? is the driver wy old?


FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;
ShowSystemTables=0;ConnSettings=;Fetch=100;

||  Fetch great for OLTP, lousy for batch?


Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190;

||  what ARE the datatypes and sizes in your particular case? 

Debug=0;

||  a run with debug=1 probably would spit up something interesting

CommLog=0;Optimizer=1;

||  Optimizer? that's a new one on me

Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseServerSidePrepare=0"

||  that's about all I can see, prima facie.  I'll be very curious to know if 
ODBC is
   any part of your performance equation. 


HTH, 

Ross

-Original Message-
From: Joel Fradkin [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 21, 2005 10:54 AM
To: Mohan, Ross
Cc: [EMAIL PROTECTED]; PostgreSQL Perform
Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon


Here is the connect string I am using.
It could be horrid as I cut it from ODBC program.

Session("StringConn") = 
"DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=;
PWD=;ReadOnly=0;Protocol=6.4;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;
ShowSystemTables=0;ConnSettings=;Fetch=100;Socket=4096;UnknownSizes=0;MaxVar
charSize=254;MaxLongVarcharSize=8190;Debug=0;CommLog=0;Optimizer=1;Ksqo=1;Us
eDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Pa
rse=0;CancelAsFreeStmt=0;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableC
ursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseS
erverSidePrepare=0"

Joel Fradkin
 

-Original Message-
From: Mohan, Ross [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 21, 2005 9:42 AM
To: [EMAIL PROTECTED]
Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

FWIW, ODBC has variables to tweak, as well. fetch/buffer sizes, and the like. 

Maybe one of the ODBC cognoscenti here can chime in more concretely






---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Mohan, Ross

FWIW, ODBC has variables to tweak, as well. fetch/buffer sizes, and the like. 

Maybe one of the ODBC cognoscenti here can chime in more concretely

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joel Fradkin
Sent: Thursday, April 21, 2005 10:36 AM
To: 'Tom Lane'; 'John A Meinel'
Cc: 'Postgresql Performance'
Subject: Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

I suspect he's using pgadmin.  
Yup I was, but I did try running on the linux box in psql, but it was running 
to the screen and took forever because of that.

The real issue is returning to my app using ODBC is very slow (Have not tested 
the ODBC for MYSQL, MSSQL is ok (the two proc dell is running out of steam but 
been good until this year when we about doubled our demand by adding sears as a 
client).

Using odbc to postgres on some of the views (Josh from Command is having me do 
some very specific testing) is timing out with a 10 minute time limit. These 
are pages that still respond using MSSQL (this is wehere production is using 
the duel proc and the test is using the 4 proc).

I have a tool that hooks to all three databases so I can try it with that and 
see if I get different responses.

Joel

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [PERFORM] How to improve db performance with $7K?

2005-04-20 Thread Mohan, Ross

Alex et al., 

I wonder if thats something to think about adding to Postgresql? A setting for 
multiblock read count like Oracle (Although 

||  I would think so, yea. GMTA: I was just having this micro-chat with Mr. Jim 
Nasby. 

having said that I believe that Oracle natively caches pages much more 
aggressively that postgresql, which allows the OS to do the file caching).

||  Yea...and it can rely on what is likely a lot more robust and nuanced 
caching algorithm, but...i don't
know enough (read: anything) about PG's to back that comment up. 

Alex Turner
netEconomist

P.S. Oracle changed this with 9i, you can change the Database block size on a 
tablespace by tablespace bassis making it smaller for OLTP tablespaces and 
larger for Warehousing tablespaces (at least I think it's on a tablespace, 
might be on a whole DB).

||Yes, it's tspace level. 

On 4/19/05, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> On Mon, Apr 18, 2005 at 06:41:37PM -, Mohan, Ross wrote:
> > Don't you think "optimal stripe width" would be
> > a good question to research the binaries for? I'd
> > think that drives the answer, largely.  (uh oh, pun alert)
> >
> > EG, oracle issues IO requests (this may have changed _just_
> > recently) in 64KB chunks, regardless of what you ask for. So when I 
> > did my striping (many moons ago, when the Earth was young...) I did 
> > it in 128KB widths, and set the oracle "multiblock read count" 
> > according. For oracle, any stripe size under 64KB=stupid, anything 
> > much over 128K/258K=wasteful.
> >
> > I am eager to find out how PG handles all this.
> 
> AFAIK PostgreSQL requests data one database page at a time (normally 
> 8k). Of course the OS might do something different.
> --
> Jim C. Nasby, Database Consultant   [EMAIL PROTECTED]
> Give your computer some brain candy! www.distributed.net Team #1828
> 
> Windows: "Where do you want to go today?"
> Linux: "Where do you want to go tomorrow?"
> FreeBSD: "Are you guys coming, or what?"
> 
> ---(end of 
> broadcast)---
> TIP 4: Don't 'kill -9' the postmaster
>

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] How to improve db performance with $7K?

2005-04-20 Thread Mohan, Ross

right, the oracle system uses a second "low latency" bus to 
manage locking information (at the block level) via a
distributed lock manager.  (but this is slightly different
albeit related to a clustered file system and OS-managed
locking, eg) 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dawid Kuroczko
Sent: Wednesday, April 20, 2005 4:56 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

On 4/19/05, Mohan, Ross <[EMAIL PROTECTED]> wrote:
> Clustered file systems is the first/best example that
> comes to mind. Host A and Host B can both request from diskfarm, eg.

Something like a Global File System?

http://www.redhat.com/software/rha/gfs/

(I believe some other company did develop it some time in the past; hmm, 
probably the guys doing LVM stuff?).

Anyway the idea is that two machines have same filesystem mounted and they 
share it. The locking I believe is handled by communication between computers 
using "host to host" SCSI commands.

I never used it, I've only heard about it from a friend who used to work with 
it in CERN.

   Regards,
   Dawid

---(end of broadcast)---
TIP 8: explain analyze is your friend

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [PERFORM] How to improve db performance with $7K?

2005-04-20 Thread Mohan, Ross

kewl. 

Well, 8k request out of PG kernel might turn into an "X"Kb request at
disk/OS level, but duly noted. 

Did you scan the code for this, or are you pulling this recollection from
the cognitive archives? :-)



-Original Message-
From: Jim C. Nasby [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 19, 2005 8:12 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


On Mon, Apr 18, 2005 at 06:41:37PM -0000, Mohan, Ross wrote:
> Don't you think "optimal stripe width" would be
> a good question to research the binaries for? I'd
> think that drives the answer, largely.  (uh oh, pun alert)
> 
> EG, oracle issues IO requests (this may have changed _just_
> recently) in 64KB chunks, regardless of what you ask for. 
> So when I did my striping (many moons ago, when the Earth 
> was young...) I did it in 128KB widths, and set the oracle 
> "multiblock read count" according. For oracle, any stripe size
> under 64KB=stupid, anything much over 128K/258K=wasteful. 
> 
> I am eager to find out how PG handles all this.

AFAIK PostgreSQL requests data one database page at a time (normally 8k). Of 
course the OS might do something different.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross

Well, more like they both are allowed to issue disk
requests and the magical "clustered file system" manages
locking, etc. 

In reality, any disk is only reading/writing to one part of
the disk at any given time, of course, but that in the multiple
initiator deal, multiple streams of requests from multiple hosts
can be queued. 



-Original Message-
From: Bruce Momjian [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 19, 2005 12:16 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Mohan, Ross wrote:
> Clustered file systems is the first/best example that
> comes to mind. Host A and Host B can both request from diskfarm, eg.

So one host writes to part of the disk and another host writes to a different 
part?

---

> -Original Message-
> From: Bruce Momjian [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, April 19, 2005 12:10 PM
> To: Mohan, Ross
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] How to improve db performance with $7K?
> 
> 
> Mohan, Ross wrote:
> > The only part I am pretty sure about is that real-world experience
> > shows SCSI is better for a mixed I/O environment.  Not sure why, 
> > exactly, but the command queueing obviously helps, and I am not sure 
> > what else does.
> > 
> > ||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version
> > || of per se drive request reordering)
> >should go a looong way (but not all the way) toward making SATA 
> > 'enterprise acceptable'. Multiple 
> >initiators (e.g. more than one host being able to talk to a drive) is a 
> > biggie, too. AFAIK only SCSI
> >drives/controllers do that for now.
> 
> What is 'multiple initiators' used for in the real world?
> 
> -- 
>   Bruce Momjian|  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us   |  (610) 359-1001
>   +  If your life is a hard drive, |  13 Roberts Road
>   +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
> 
> ---(end of 
> broadcast)---
> TIP 6: Have you searched our list archives?
> 
>http://archives.postgresql.org
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross

Clustered file systems is the first/best example that
comes to mind. Host A and Host B can both request from diskfarm, eg. 



-Original Message-
From: Bruce Momjian [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 19, 2005 12:10 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Mohan, Ross wrote:
> The only part I am pretty sure about is that real-world experience 
> shows SCSI is better for a mixed I/O environment.  Not sure why, 
> exactly, but the command queueing obviously helps, and I am not sure 
> what else does.
> 
> ||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version 
> || of per se drive request reordering)
>should go a looong way (but not all the way) toward making SATA 
> 'enterprise acceptable'. Multiple 
>initiators (e.g. more than one host being able to talk to a drive) is a 
> biggie, too. AFAIK only SCSI
>drives/controllers do that for now.

What is 'multiple initiators' used for in the real world?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross

Good question.  If the SCSI system was moving the head from track 1 to 10, and 
a request then came in for track 5, could the system make the head stop at 
track 5 on its way to track 10?  That is something that only the controller 
could do.  However, I have no idea if SCSI does that.

||  SCSI, AFAIK, does NOT do this. What SCSI can do is allow "next" request 
insertion into head
of request queue (queue-jumping), and/or defer request ordering to done by 
drive per se (queue
re-ordering).   I  have looked, in vain, for evidence that SCSI somehow 
magically "stops in the
middle of request to pick up data" (my words, not yours) 

The only part I am pretty sure about is that real-world experience shows SCSI 
is better for a mixed I/O environment.  Not sure why, exactly, but the command 
queueing obviously helps, and I am not sure what else does.

||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version of per se 
drive request reordering) 
   should go a looong way (but not all the way) toward making SATA 'enterprise 
acceptable'. Multiple 
   initiators (e.g. more than one host being able to talk to a drive) is a 
biggie, too. AFAIK only SCSI
   drives/controllers do that for now. 




-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Mohan, Ross

Don't you think "optimal stripe width" would be
a good question to research the binaries for? I'd
think that drives the answer, largely.  (uh oh, pun alert)

EG, oracle issues IO requests (this may have changed _just_ 
recently) in 64KB chunks, regardless of what you ask for. 
So when I did my striping (many moons ago, when the Earth 
was young...) I did it in 128KB widths, and set the oracle 
"multiblock read count" according. For oracle, any stripe size
under 64KB=stupid, anything much over 128K/258K=wasteful. 

I am eager to find out how PG handles all this. 

- Ross

p.s.  'You want a database record? I 
  gotcher record right here' http://en.wikipedia.org/wiki/Akashic_Records

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex Turner
Sent: Monday, April 18, 2005 2:21 PM
To: Jacques Caron
Cc: Greg Stark; William Yu; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

So I wonder if one could take this stripe size thing further and say that a 
larger stripe size is more likely to result in requests getting served 
parallized across disks which would lead to increased performance?

Again, thanks to all people on this list, I know that I have learnt a _hell_ of 
alot since subscribing.

Alex Turner
netEconomist

On 4/18/05, Alex Turner <[EMAIL PROTECTED]> wrote:
> Ok - well - I am partially wrong...
> 
> If you're stripe size is 64Kb, and you are reading 256k worth of data, 
> it will be spread across four drives, so you will need to read from 
> four devices to get your 256k of data (RAID 0 or 5 or 10), but if you 
> are only reading 64kb of data, I guess you would only need to read 
> from one disk.
> 
> So my assertion that adding more drives doesn't help is pretty 
> wrong... particularly with OLTP because it's always dealing with 
> blocks that are smaller that the stripe size.
> 
> Alex Turner
> netEconomist
> 
> On 4/18/05, Jacques Caron <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > At 18:56 18/04/2005, Alex Turner wrote:
> > >All drives are required to fill every request in all RAID levels
> >
> > No, this is definitely wrong. In many cases, most drives don't 
> > actually have the data requested, how could they handle the request?
> >
> > When reading one random sector, only *one* drive out of N is ever 
> > used to service any given request, be it RAID 0, 1, 0+1, 1+0 or 5.
> >
> > When writing:
> > - in RAID 0, 1 drive
> > - in RAID 1, RAID 0+1 or 1+0, 2 drives
> > - in RAID 5, you need to read on all drives and write on 2.
> >
> > Otherwise, what would be the point of RAID 0, 0+1 or 1+0?
> >
> > Jacques.
> >
> >
>

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Mohan, Ross

Greg, et al. 

I never found any evidence of a "stop and get an intermediate request"
functionality in the TCQ protocol. 

IIRC, what is there is

1) Ordered
2) Head First
3) Simple

implemented as choices. *VERY* roughly, that'd be like
(1) disk subsystem satisfies requests as submitted, (2) let's
the "this" request be put at the very head of the per se disk
queue after the currently-running disk request is complete, and
(3) is "let the per se disk and it's software reorder the requests
on-hand as per it's onboard software".  (N.B. in the last, it's
the DISK not the controller making those decisions). (N.B. too, that
this last is essentially what NCQ (cf. TCQ) is doing )

I know we've been batting around a hypothetical case of SCSI
where it "stops and gets smth. on the way", but I can find
no proof (yet) that this is done, pro forma, by SCSI drives. 

In other words, SCSI is a necessary, but not sufficient cause
for intermediate reading. 

FWIW

- Ross

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Friday, April 15, 2005 2:02 PM
To: Tom Lane
Cc: Kevin Brown; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

Tom Lane <[EMAIL PROTECTED]> writes:

> Yes, you can probably assume that blocks with far-apart numbers are 
> going to require a big seek, and you might even be right in supposing 
> that a block with an intermediate number should be read on the way. 
> But you have no hope at all of making the right decisions at a more 
> local level --- say, reading various sectors within the same cylinder 
> in an optimal fashion.  You don't know where the track boundaries are, 
> so you can't schedule in a way that minimizes rotational latency. 
> You're best off to throw all the requests at the drive together and 
> let the drive sort it out.

Consider for example three reads, one at the beginning of the disk, one at the 
very end, and one in the middle. If the three are performed in the logical 
order (assuming the head starts at the beginning), then the drive has to seek, 
say, 4ms to get to the middle and 4ms to get to the end.

But if the middle block requires a full rotation to reach it from when the head 
arrives that adds another 8ms of rotational delay (assuming a 7200RPM drive).

Whereas the drive could have seeked over to the last block, then seeked back in 
8ms and gotten there just in time to perform the read for free.

I'm not entirely convinced this explains all of the SCSI drives' superior 
performance though. The above is about a worst-case scenario. should really 
only have a small effect, and it's not like the drive firmware can really 
schedule things perfectly either.

I think most of the difference is that the drive manufacturers just don't 
package their high end drives with ATA interfaces. So there are no 10k RPM ATA 
drives and no 15k RPM ATA drives. I think WD is making fast SATA drives but 
most of the manufacturers aren't even doing that.

-- 
greg

---(end of broadcast)---
TIP 8: explain analyze is your friend

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

[PERFORM] Spend 7K WHERE? WAS Intel SRCS16 SATA raid? and How to Improve w/7K$?

2005-04-15 Thread Mohan, Ross


Sorry to blend threads, but in my kinda longish, somewhat thankless, 
essentially anonymous, and quite average career as a dba, I have 
found that the 7K would be best spent on a definitive end-to-end
"application critical path" test (pretty easy to instrument apps
and lash on test harnesses these days). 


If it's "the disk subsystem", then by all means, spend the 7K there. 

If the "7K$" is for "hardware only", then disk is always a good choice. For
a really small shop, maybe it's an upgrade to a dual CPU opteron MOBO, eg. 
dunno.

If, however, in the far-more-likely case that the application code
or system/business process is the throttle point, it'd be a great
use of money to have a test report showing that to the "higher ups". 
That's where the best scalability bang-for-buck can be made. 


- Ross

p.s. having said this, and as already been noted "7K" ain't
 going to buy that muchmaybe the ability to go RAID 10?
 
p.p.s  Why don't we start a PGSQL-7K listserv, to handle this EPIC thread? :-)


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Friday, April 15, 2005 11:20 AM
To: Alex Turner
Cc: Dave Held; pgsql-performance@postgresql.org; [EMAIL PROTECTED]
Subject: Re: [PERFORM] Intel SRCS16 SATA raid?


This is a different thread that the $7k server thread.
Greg Stark started it and wrote:

 "I'm also wondering about whether I'm better off with one of these 
 SATA raid  
 controllers or just going with SCSI drives."   



Rick

[EMAIL PROTECTED] wrote on 04/15/2005 10:01:56 AM:

> The original thread was how much can I get for $7k
>
> You can't fit a 15k RPM SCSI solution into $7K ;)  Some of us are ona
budget!
>
> 10k RPM SATA drives give acceptable performance at a good price, thats 
> really the point here.
>
> I have never really argued that SATA is going to match SCSI 
> performance on multidrive arrays for IO/sec.  But it's all about the 
> benjamins baby.  If I told my boss we need $25k for a database 
> machine, he'd tell me that was impossible, and I have $5k to do it. If 
> I tell him $7k - he will swallow that.  We don't _need_ the amazing 
> performance of a 15k RPM drive config.  Our biggest hit is reads, so 
> we can buy 3xSATA machines and load balance.  It's all about the 
> application, and buying what is appropriate.  I don't buy a Corvette 
> if all I need is a malibu.
>
> Alex Turner
> netEconomist
>
> On 4/15/05, Dave Held <[EMAIL PROTECTED]> wrote:
> > > -Original Message-
> > > From: Alex Turner [mailto:[EMAIL PROTECTED]
> > > Sent: Thursday, April 14, 2005 6:15 PM
> > > To: Dave Held
> > > Cc: pgsql-performance@postgresql.org
> > > Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> > >
> > > Looking at the numbers, the raptor with TCQ enabled was close or 
> > > beat the Atlas III 10k drive on most benchmarks.
> >
> > And I would be willing to bet that the Atlas 10k is not using the 
> > same generation of technology as the Raptors.
> >
> > > Naturaly a 15k drive is going to be faster in many areas, but it 
> > > is also much more expensive.  It was only 44% better on the server 
> > > tests than the raptor with TCQ, but it costs nearly 300% more 
> > > ($538 cdw.com, $180 newegg.com).
> >
> > State that in terms of cars.  Would you be willing to pay 300% more 
> > for a car that is 44% faster than your competitor's?  Of course you 
> > would, because we all recognize that the cost of speed/performance 
> > does not scale linearly.  Naturally, you buy the best speed that you 
> > can afford, but when it comes to hard drives, the only major feature 
> > whose price tends to scale anywhere close to linearly is capacity.
> >
> > > Note also that the 15k drive was the only drive that kept up with 
> > > the raptor on raw transfer speed, which is going to matter for 
> > > WAL.
> >
> > So get a Raptor for your WAL partition. ;)
> >
> > > [...]
> > > The Raptor drives can be had for as little as $180/ea, which is 
> > > quite a good price point considering they can keep up with their 
> > > SCSI 10k RPM counterparts on almost all tests with NCQ enabled 
> > > (Note that 3ware controllers _don't_ support NCQ, although they 
> > > claim their HBA based queueing is 95% as good as NCQ on the 
> > > drive).
> >
> > Just keep in mind the points made by the Seagate article.  You're 
> > buying much more than just performance for that $500+.  You're also 
> > buying vibrational tolerance, high MTBF, better internal 
> > environmental controls, and a pretty significant margin on seek 
> > time, which is probably your most important feature for disks 
> > storing tables. An interesting test would be to stick several drives 
> > in a cabi

Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Mohan, Ross

I've been doing some reading up on this, trying to keep up here, 
and have found out that (experts, just yawn and cover your ears)

1) some SATA drives (just type II, I think?) have a "Phase Zero"
implementation of Tagged Command Queueing (the special sauce
for SCSI).
2) This SATA "TCQ" is called NCQ and I believe it basically
allows the disk software itself to do the reordering
(this is called "simple" in TCQ terminology) It does not
yet allow the TCQ "head of queue" command, allowing the
current tagged request to go to head of queue, which is
a simple way of manifesting a "high priority" request.

3) SATA drives are not yet multi-initiator?

Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives
are likely to whomp SATA II drives for a while yet (read: a
year or two) in multiuser PostGres applications. 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 2:04 PM
To: Kevin Brown
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

Kevin Brown <[EMAIL PROTECTED]> writes:

> Greg Stark wrote:
> 
> 
> > I think you're being misled by analyzing the write case.
> > 
> > Consider the read case. When a user process requests a block and 
> > that read makes its way down to the driver level, the driver can't 
> > just put it aside and wait until it's convenient. It has to go ahead 
> > and issue the read right away.
> 
> Well, strictly speaking it doesn't *have* to.  It could delay for a 
> couple of milliseconds to see if other requests come in, and then 
> issue the read if none do.  If there are already other requests being 
> fulfilled, then it'll schedule the request in question just like the 
> rest.

But then the cure is worse than the disease. You're basically describing 
exactly what does happen anyways, only you're delaying more requests than 
necessary. That intervening time isn't really idle, it's filled with all the 
requests that were delayed during the previous large seek...

> Once the first request has been fulfilled, the driver can now schedule 
> the rest of the queued-up requests in disk-layout order.
> 
> I really don't see how this is any different between a system that has 
> tagged queueing to the disks and one that doesn't.  The only 
> difference is where the queueing happens.

And *when* it happens. Instead of being able to issue requests while a large 
seek is happening and having some of them satisfied they have to wait until 
that seek is finished and get acted on during the next large seek.

If my theory is correct then I would expect bandwidth to be essentially 
equivalent but the latency on SATA drives to be increased by about 50% of the 
average seek time. Ie, while a busy SCSI drive can satisfy most requests in 
about 10ms a busy SATA drive would satisfy most requests in 15ms. (add to that 
that 10k RPM and 15kRPM SCSI drives have even lower seek times and no such 
IDE/SATA drives exist...)

In reality higher latency feeds into a system feedback loop causing your 
application to run slower causing bandwidth demands to be lower as well. It's 
often hard to distinguish root causes from symptoms when optimizing complex 
systems.

-- 
greg

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Mohan, Ross

sorry, don't remember whether it's SCSI or SATA II, but IIRC
the Areca controllers are just stellar for things. 

If you do get SATA for db stuff..especially multiuser...i still
haven't seen anything to indicate an across-the-board primacy
for SATA over SCSI. I'd go w/SCSI, or if SATA for $$$ reasons, I'd
be sure to have many spindles and RAID 10. 

my 0.02. I'm surely not an expert of any kind. 





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 10:55 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Intel SRCS16 SATA raid?



Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller instead 
of the 3ware one.

Poking around it seems this does come with Linux drivers and there is a battery 
backup option. So it doesn't seem to be completely insane.

Anyone have any experience with these controllers?

I'm also wondering about whether I'm better off with one of these SATA raid 
controllers or just going with SCSI drives.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Mohan, Ross

Imagine a system in "furious activity" with two (2) process regularly occuring

Process One:  Long read (or write). Takes 20ms to do seek, latency, and 
stream off. Runs over and over. 
Process Two:  Single block read ( or write ). Typical database row access. 
Optimally, could be submillisecond. happens more or less 
randomly. 

Let's say process one starts, and then process two. Assume, for sake of this 
discussion, 
that P2's block lies w/in P1's swath. (But doesn't have to...)

Now, everytime process two has to wait at LEAST 20ms to complete. In a 
queue-reordering
system, it could be a lot faster. And me, looking for disk service times on P2, 
keep
wondering "why does a single diskblock read keep taking >20ms?"

Sit doesn't need to be "a read" or "a write". It doesn't need to be 
"furious activity"
(two processes is not furious, even for a single user desktop.)  This is not a 
"corner case", 
and while it doesn't take into account kernel/drivecache/UBC buffering issues, 
I think it
shines a light on why command re-ordering might be useful.  

YMMV. 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kevin Brown
Sent: Thursday, April 14, 2005 4:36 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

Greg Stark wrote:

> I think you're being misled by analyzing the write case.
> 
> Consider the read case. When a user process requests a block and that 
> read makes its way down to the driver level, the driver can't just put 
> it aside and wait until it's convenient. It has to go ahead and issue 
> the read right away.

Well, strictly speaking it doesn't *have* to.  It could delay for a couple of 
milliseconds to see if other requests come in, and then issue the read if none 
do.  If there are already other requests being fulfilled, then it'll schedule 
the request in question just like the rest.

> In the 10ms or so that it takes to seek to perform that read
> *nothing* gets done. If the driver receives more read or write 
> requests it just has to sit on them and wait. 10ms is a lifetime for a 
> computer. In that time dozens of other processes could have been 
> scheduled and issued reads of their own.

This is true, but now you're talking about a situation where the system goes 
from an essentially idle state to one of furious activity.  In other words, 
it's a corner case that I strongly suspect isn't typical in situations where 
SCSI has historically made a big difference.

Once the first request has been fulfilled, the driver can now schedule the rest 
of the queued-up requests in disk-layout order.

I really don't see how this is any different between a system that has tagged 
queueing to the disks and one that doesn't.  The only difference is where the 
queueing happens.  In the case of SCSI, the queueing happens on the disks (or 
at least on the controller).  In the case of SATA, the queueing happens in the 
kernel.

I suppose the tagged queueing setup could begin the head movement and, if 
another request comes in that requests a block on a cylinder between where the 
head currently is and where it's going, go ahead and read the block in 
question.  But is that *really* what happens in a tagged queueing system?  It's 
the only major advantage I can see it having.

> The same thing would happen if you had lots of processes issuing lots 
> of small fsynced writes all over the place. Postgres doesn't really do 
> that though. It sort of does with the WAL logs, but that shouldn't 
> cause a lot of seeking.  Perhaps it would mean that having your WAL 
> share a spindle with other parts of the OS would have a bigger penalty 
> on IDE drives than on SCSI drives though?

Perhaps.

But I rather doubt that has to be a huge penalty, if any.  When a process 
issues an fsync (or even a sync), the kernel doesn't *have* to drop everything 
it's doing and get to work on it immediately.  It could easily gather a few 
more requests, bundle them up, and then issue them.  If there's a lot of disk 
activity, it's probably smart to do just that.  All fsync and sync require is 
that the caller block until the data hits the disk (from the point of view of 
the kernel). The specification doesn't require that the kernel act on the calls 
immediately or write only the blocks referred to by the call in question.

-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

[PERFORM] Building postmaster with Profiling Support WAS "Tweaking a C Function I wrote"

2005-04-07 Thread Mohan, Ross

Adam - 

Is compiling postmaster with profiling support just a flag
in the build/make? Or is there something more involved? 

I'd like to be able to do this in the future and so am
curious about means/methods. 

If this is a RTFM, just let me know that (am currently 
Reading The F Manual), but if you have any "special sauce"
here, that'd be of great interest. 

Thanks

-Ross

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adam Palmblad
Sent: Wednesday, April 06, 2005 7:23 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Tweaking a C Function I wrote


I wanted to see if I could squeeze any more performance out of a C set 
returning function I wrote.  As such, I looked to a profiler.  Is it 
possible to get profile information on the function I wrote?  I've got 
postmaster and my function compiled with profiling support, and can find 
the gmon.out files... can I actually look at the call tree that occurs 
when my function is being executed or will I be limited to viewing calls 
to functions in the postmaster binary?

-Adam


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: RE : RE: [PERFORM] Postgresql vs SQLserver for thisapplication ?

2005-04-06 Thread Mohan, Ross



31Million tuples were loaded in approx 279 seconds, or approx 112k rows per 
second.

> I'd love to see PG get into this range..i am a big fan of PG (just a 
> rank newbie) but I gotta think the underlying code to do this has to 
> be not-too-complex.

I'd say we're there.


||Yes! PG is there, assuredly!   So VERY cool!  I made a newbie
error of conflating COPY with INSERT. I don't know if I could get
oracle to do much more than about 500-1500 rows/sec...PG is quite 
impressive.

Makes one wonder why corporations positively insist on giving oracle
 yearly. 

-Original Message-
From: Rod Taylor [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, April 06, 2005 12:41 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: RE : RE: [PERFORM] Postgresql vs SQLserver for thisapplication ?


On Wed, 2005-04-06 at 16:12 +, Mohan, Ross wrote:
> I wish I had a Dell system and run case to show you Alex, but I 
> don't... however...using Oracle's "direct path" feature, it's pretty 
> straightforward.
> 
> We've done 110,000 rows per second into index-less tables on a big 
> system (IBM Power5 chips, Hitachi SAN). ( Yes, I am sure: over 100K a 
> second. Sustained for almost 9 minutes. )

Just for kicks I did a local test on a desktop machine (single CPU, single IDE 
drive) using COPY from STDIN for a set of integers in via a single transaction, 
no indexes.

1572864 tuples were loaded in 13715.613ms, which is approx 115k rows per second.

Okay, no checkpoints and I didn't cross an index boundary, but I also haven't 
tuned the config file beyond bumping up the buffers.

Lets try again with more data this time.

31Million tuples were loaded in approx 279 seconds, or approx 112k rows per 
second.

> I'd love to see PG get into this range..i am a big fan of PG (just a 
> rank newbie) but I gotta think the underlying code to do this has to 
> be not-too-complex.

I'd say we're there.

> -Original Message-
> From: Alex Turner [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, April 06, 2005 11:38 AM
> To: [EMAIL PROTECTED]
> Cc: pgsql-performance@postgresql.org; Mohan, Ross
> Subject: Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?
> 
> 
> I think everyone was scared off by the 5000 inserts per second number.
> 
> I've never seen even Oracle do this on a top end Dell system with 
> copious SCSI attached storage.
> 
> Alex Turner
> netEconomist
> 
> On Apr 6, 2005 3:17 AM, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> >  
> > Unfortunately.
> >  
> > But we are in the the process to choose Postgresql with pgcluster. 
> > I'm
> > currently running some tests (performance, stability...) Save the 
> > money on the license fees, you get it for your hardware ;-)
> >  
> > I still welcome any advices or comments and I'll let you know how 
> > the
> > project is going on.
> >  
> > Benjamin.
> >  
> >  
> >  
> >  "Mohan, Ross" <[EMAIL PROTECTED]>
> > 
> > 05/04/2005 20:48
> >  
> > Pour :<[EMAIL PROTECTED]> 
> > cc : 
> > Objet :RE: [PERFORM] Postgresql vs SQLserver for this
> > application ?
> >  
> >  
> > You never got answers on this? Apologies, I don't have one, but'd be
> > curious to hear about any you did get
> >   
> > thx
> >   
> > Ross
> > 
> > -Original Message-
> >  From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of 
> > [EMAIL PROTECTED]
> >  Sent: Monday, April 04, 2005 4:02 AM
> >  To: pgsql-performance@postgresql.org
> >  Subject: [PERFORM] Postgresql vs SQLserver for this application ?
> >  
> > 
> >  hi all.
> >  
> >  We are designing a quite big application that requires a
> > high-performance database backend.  The rates we need to obtain are at 
> > least  5000 inserts per second and 15 selects per second for one 
> > connection. There should only be 3 or 4 simultaneous connections.
> >  I think our main concern is to deal with the constant flow of data coming
> > from the inserts that must be available for selection as fast as possible.
> > (kind of real time access ...) 
> >  
> >  As a consequence, the database should rapidly increase up to more
> > than one hundred gigs. We still have to determine how and when we 
> > shoud backup old data to prevent the application from a performance 
> > drop. We intend to develop some kind of real-time partionning on our 
> > main table keep the flows up.
&g

Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?

2005-04-06 Thread Mohan, Ross

How close to this is PG's COPY?  I get surprisingly good results using COPY 
with jdbc on smallish systems (now if that patch would make into the mainstream 
PG jdbc support!)  I think COPY has a bit more overhead than what a Bulkload 
feature may have, but I suspect it's not that much more.

||  Steve, I do not know. But am reading the docs now, and should figure it 
out. Ask
me later if you remember. Oracle's "direct path" is a way of just slamming 
blocks
filled with rows into the table, above the high water mark. It sidesteps 
freelist
management and all manner of intrablock issues. There is a "payback", but 
the benefits
far far outweigh the costs. 

> Now...if you ask me "can this work without Power5 and Hitachi SAN?" my 
> answer is..you give me a top end Dell and SCSI III on 15K disks and 
> I'll likely easily match it, yea.
> 
> I'd love to see PG get into this range..i am a big fan of PG (just a 
> rank newbie) but I gotta think the underlying code to do this has to 
> be not-too-complex.

It may not be that far off if you can use COPY instead of INSERT. But comparing 
Bulkload to INSERT is a bit apples<->orangish.

||  Oh! I see! I had no idea I was doing that!  Thanks for pointing it out 
clearly to me. Yea, I would
say a full transactional INSERT of 5K rows/sec into an indexed-table is a 
near-mythology without significant
caveats (parallelized, deferred buffering, etc.) 



-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?

2005-04-06 Thread Mohan, Ross

I wish I had a Dell system and run case to show you Alex, but I don't...
however...using Oracle's "direct path" feature, it's pretty straightforward. 

We've done 110,000 rows per second into index-less tables on a big system
(IBM Power5 chips, Hitachi SAN). ( Yes, I am sure: over 100K a second. Sustained
for almost 9 minutes. )

Yes, this is an exception, but oracle directpath/InsertAppend/BulkLoad
feature enabled us to migrate a 4 TB database...really quickly. 

Now...if you ask me "can this work without Power5 and Hitachi SAN?"
my answer is..you give me a top end Dell and SCSI III on 15K disks
and I'll likely easily match it, yea.

I'd love to see PG get into this range..i am a big fan of PG (just a
rank newbie) but I gotta think the underlying code to do this has
to be not-too-complex.

Best, 

Ross



-Original Message-
From: Alex Turner [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, April 06, 2005 11:38 AM
To: [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org; Mohan, Ross
Subject: Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?


I think everyone was scared off by the 5000 inserts per second number.

I've never seen even Oracle do this on a top end Dell system with copious SCSI 
attached storage.

Alex Turner
netEconomist

On Apr 6, 2005 3:17 AM, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>  
> Unfortunately.
>  
> But we are in the the process to choose Postgresql with pgcluster. I'm 
> currently running some tests (performance, stability...) Save the 
> money on the license fees, you get it for your hardware ;-)
>  
> I still welcome any advices or comments and I'll let you know how the 
> project is going on.
>  
> Benjamin.
>  
>  
>  
>  "Mohan, Ross" <[EMAIL PROTECTED]>
> 
> 05/04/2005 20:48
>  
> Pour :<[EMAIL PROTECTED]> 
> cc : 
> Objet :RE: [PERFORM] Postgresql vs SQLserver for this
> application ?
>  
>  
> You never got answers on this? Apologies, I don't have one, but'd be 
> curious to hear about any you did get
>   
> thx
>   
> Ross
> 
> -Original Message-
>  From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf
> Of [EMAIL PROTECTED]
>  Sent: Monday, April 04, 2005 4:02 AM
>  To: pgsql-performance@postgresql.org
>  Subject: [PERFORM] Postgresql vs SQLserver for this application ?
>  
> 
>  hi all.
>  
>  We are designing a quite big application that requires a 
> high-performance database backend.  The rates we need to obtain are at 
> least  5000 inserts per second and 15 selects per second for one 
> connection. There should only be 3 or 4 simultaneous connections.
>  I think our main concern is to deal with the constant flow of data coming
> from the inserts that must be available for selection as fast as possible.
> (kind of real time access ...) 
>  
>  As a consequence, the database should rapidly increase up to more 
> than one hundred gigs. We still have to determine how and when we 
> shoud backup old data to prevent the application from a performance 
> drop. We intend to develop some kind of real-time partionning on our 
> main table keep the flows up.
>  
>  At first, we were planning to use SQL Server as it has features that 
> in my opinion could help us a lot :
> - replication 
> - clustering
>  
>  Recently we started to study Postgresql as a solution for our project : 
> - it also has replication 
> - Postgis module can handle geographic datatypes (which would 
> facilitate our developments)
> - We do have a strong knowledge on Postgresql administration 
> (we use it for production processes)
> - it is free (!) and we could save money for hardware 
> purchase.
>  
>  Is SQL server clustering a real asset ? How reliable are Postgresql 
> replication tools  ? Should I trust Postgresql performance for this 
> kind of needs ?
>  
>  My question is a bit fuzzy but any advices are most welcome... 
> hardware,tuning or design tips as well :))
>  
>  Thanks a lot.
>  
>  Benjamin.
>  
>  
>

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [PERFORM] Reading recommendations

2005-03-30 Thread Mohan, Ross

I can see that PG'ers have a wicked sense of humor. 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Steve Wampler
Sent: Wednesday, March 30, 2005 3:58 PM
To: [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Reading recommendations


[EMAIL PROTECTED] wrote:

>>Mohan, Ross wrote:
>>
>>>VOIP over BitTorrent?
>>
>>Now *that* I want to see.  Aught to be at least as interesting as the 
>>"TCP/IP over carrier pigeon" experiment - and more challenging to 
>>boot!
>>
> 
> 
> It was very challenging.  I worked on the credit window sizing and 
> retransmission timer estimation algorithms.  We took into account 
> weather patterns, size and age of the bird, feeding times, and the 
> average number of times a bird circles before determining magnetic 
> north.  Interestingly, packet size had little effect in the final 
> algorithms.
> 
> I would love to share them with all of you, but they're classified.

Ah, but VOIPOBT requires many people all saying the same thing at the same 
time.  The synchronization alone (since you need to distribute these people 
adequately to avoid overloading a trunk line...) is probably sufficiently hard 
to make it interesting.  Then there are the problems of different accents, 
dilects, and languages ;)

-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [PERFORM] Reading recommendations

2005-03-30 Thread Mohan, Ross

Yea, the upside is that you get better than the 1 byte/hour rate for 
pigeon-net. 

Downside is that simply because you use BiTorrent, the RIAA accuses you of 
everything from CD piracy to shipping pr*n to cyberterrorism, and you spend 
the next four years in Gitmo, comparing notes with your cellmates in Camp 
X-Ray, 
and watching pigeons fly overhead. 


-Original Message-
From: Steve Wampler [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, March 30, 2005 11:52 AM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Reading recommendations


Mohan, Ross wrote:
> VOIP over BitTorrent?

Now *that* I want to see.  Aught to be at least as interesting as the "TCP/IP 
over carrier pigeon" experiment - and more challenging to boot!


-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] Reading recommendations

2005-03-30 Thread Mohan, Ross

VOIP over BitTorrent?  

;-)


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Wednesday, March 30, 2005 11:27 AM
To: Michael Fuhr
Cc: Marc Burgauer; pgsql-performance@postgresql.org; [EMAIL PROTECTED]
Subject: Re: [PERFORM] Reading recommendations




[EMAIL PROTECTED] wrote on 03/30/2005 10:58:21 AM:

>
> Allow telecommute from across the pond and I might be interested :-)

Please post phone bills to this list.

>
> --
> Michael Fuhr

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [PERFORM] Projecting currentdb to more users

Re: [PERFORM] Trying to figure out pgbench

Re: [PERFORM] Whence the Opterons?

[PERFORM] test - pls delete and ignore

Re: [PERFORM] [PORTS] Which library has these symbols? -- Eureka

Re: [PERFORM] Prefetch - OffTopic

[PERFORM] Prefetch - OffTopic

Re: [PERFORM] Table Partitioning: Will it be supported in Future?

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

Re: [PERFORM] How to improve db performance with $7K?

Re: [PERFORM] How to improve db performance with $7K?

Re: [PERFORM] How to improve db performance with $7K?

Re: [PERFORM] How to improve db performance with $7K?

Re: [PERFORM] How to improve db performance with $7K?

Re: [PERFORM] How to improve db performance with $7K?

Re: [PERFORM] How to improve db performance with $7K?

Re: [PERFORM] How to improve db performance with $7K?

[PERFORM] Spend 7K WHERE? WAS Intel SRCS16 SATA raid? and How to Improve w/7K$?

Re: [PERFORM] How to improve db performance with $7K?

Re: [PERFORM] Intel SRCS16 SATA raid?

Re: [PERFORM] How to improve db performance with $7K?

[PERFORM] Building postmaster with Profiling Support WAS "Tweaking a C Function I wrote"

Re: RE : RE: [PERFORM] Postgresql vs SQLserver for thisapplication ?

Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?

Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?

Re: [PERFORM] Reading recommendations

Re: [PERFORM] Reading recommendations

Re: [PERFORM] Reading recommendations

29 matches

Site Navigation

Mail list logo

Footer information