Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Scott Carey
On 4/3/09 6:05 PM, "da...@lang.hm" wrote: > On Fri, 3 Apr 2009, Greg Smith wrote: > >> Hannes sent this off-list, presumably via newsgroup, and it's certainly worth >> sharing. I've always been scared off of using XFS because of the problems >> outlined at http://zork.net/~nick/mail/why-reise

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Greg Smith
On Fri, 3 Apr 2009, Tom Lane wrote: However, I don't think anyone else has been pgbench'ing transactions where client-side libpq has to absorb (and then discard) a megabyte of data per xact. I wouldn't be surprised that that eats enough CPU to make it an issue. David, did you pay any attention

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Greg Smith
On Fri, 3 Apr 2009, da...@lang.hm wrote: also note that the message from Ted was back in 2004, there has been a _lot_ of work done on XFS in the last 4 years. Sure, I know they've made progress, which is why I didn't also bring up older ugly problems like delayed allocation issues reducing fi

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread david
On Fri, 3 Apr 2009, Greg Smith wrote: Hannes sent this off-list, presumably via newsgroup, and it's certainly worth sharing. I've always been scared off of using XFS because of the problems outlined at http://zork.net/~nick/mail/why-reiserfs-is-teh-sukc , with more testing showing similar iss

Re: [PERFORM] Using IOZone to simulate DB access patterns

2009-04-03 Thread Josh Berkus
On 4/3/09 4:12 PM, Josh Berkus wrote: All, I've been using Bonnie++ for ages to do filesystem testing of new DB servers. But Josh Drake recently turned me on to IOZone. Related to this: is IOZone really multi-threaded? I'm doing a test run right now, and only one CPU is actually active. Whi

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread David Kerr
Gah - sorry, setting up pgbouncer for my Plan B. I meant -pgbench- Dave Kerr On Fri, Apr 03, 2009 at 04:34:58PM -0700, David Kerr wrote: - On Fri, Apr 03, 2009 at 06:52:26PM -0400, Tom Lane wrote: - - Greg Smith writes: - - > pgbench is extremely bad at simulating large numbers of clients. Th

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread David Kerr
On Fri, Apr 03, 2009 at 06:52:26PM -0400, Tom Lane wrote: - Greg Smith writes: - > pgbench is extremely bad at simulating large numbers of clients. The - > pgbench client operates as a single thread that handles both parsing the - > input files, sending things to clients, and processing their r

[PERFORM] Using IOZone to simulate DB access patterns

2009-04-03 Thread Josh Berkus
All, I've been using Bonnie++ for ages to do filesystem testing of new DB servers. But Josh Drake recently turned me on to IOZone. Thing is, IOZone offers a huge complex series of parameters, so I'd really like to have some idea of how to configure it so its results are applicable to databa

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Tom Lane
Greg Smith writes: > pgbench is extremely bad at simulating large numbers of clients. The > pgbench client operates as a single thread that handles both parsing the > input files, sending things to clients, and processing their responses. > It's very easy to end up in a situation where that bo

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Tom Lane
David Kerr writes: > On Fri, Apr 03, 2009 at 04:43:29PM -0400, Tom Lane wrote: > - How much more "real" is the target hardware than what you have? > - You appear to need about a factor of 10 better disk throughput than > - you have, and that's not going to be too cheap. > The hardware i'm using i

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Greg Smith
On Fri, 3 Apr 2009, David Kerr wrote: Here is my transaction file: \setrandom iid 1 5 BEGIN; SELECT content FROM test WHERE item_id = :iid; END; Wrapping a SELECT in a BEGIN/END block is unnecessary, and it will significantly slow things down for two reasons: the transaction overhead an
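Greg's suggestion above amounts to dropping the explicit transaction wrapper from the pgbench script. A minimal sketch, reusing the table, column, and \setrandom bounds exactly as quoted from David's script (the real upper bound is truncated in the archive, so it may differ):

```sql
-- trans.sql: the same custom transaction, minus the BEGIN/END wrapper.
-- A single SELECT runs in its own implicit transaction, so pgbench
-- sends one round trip per transaction instead of three.
\setrandom iid 1 5
SELECT content FROM test WHERE item_id = :iid;
```

Run as in David's test: `pgbench -c 400 -t 50 -f trans.sql -l`.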

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Scott Marlowe
On Fri, Apr 3, 2009 at 1:53 PM, David Kerr wrote: > Here is my transaction file: > \setrandom iid 1 5 > BEGIN; > SELECT content FROM test WHERE item_id = :iid; > END; > > and then i executed: > pgbench -c 400 -t 50 -f trans.sql -l > > The results actually have surprised me, the database isn't

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread David Kerr
On Fri, Apr 03, 2009 at 04:43:29PM -0400, Tom Lane wrote: - > I'm not really sure how to evaulate the tps, I've read in this forum that - > some folks are getting 2k tps so this wouldn't appear to be good to me. - - Well, you're running a custom transaction definition so comparing your - number to

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Tom Lane
David Kerr writes: > The results actually have surprised me, the database isn't really tuned > and i'm not working on great hardware. But still I'm getting: > scaling factor: 1 > number of clients: 400 > number of transactions per client: 50 > number of transactions actually processed: 2/2

[PERFORM] Question on pgbench output

2009-04-03 Thread David Kerr
Hello! Sorry for the wall of text here. I'm working on a performance POC and I'm using pgbench and could use some advice. Mostly I want to ensure that my test is valid and that I'm using pgbench properly. The story behind the POC is that my developers want to pull web items from the database (no

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Nathan Boley
> Uh, no, it wouldn't. Visually: > > L1 - > L2 --- > L3 - > > R1 --------------------- > > At L2, you'd conclude that you're done matching R1. > No, you should conclude that you're done matching

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Tom Lane
Simon Riggs writes: > On Fri, 2009-04-03 at 10:04 -0400, Tom Lane wrote: >> I don't actually believe that a standard merge join algorithm will work >> with an intransitive join condition ... > I think it's a common enough problem that having a non-standard join > algorithm written for that case w

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Alvaro Herrera
Simon Riggs wrote: > > On Fri, 2009-04-03 at 10:04 -0400, Tom Lane wrote: > > I don't actually believe that a standard merge join algorithm will work > > with an intransitive join condition ... > > I think it's a common enough problem that having a non-standard join > algorithm written for that

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Simon Riggs
On Fri, 2009-04-03 at 10:04 -0400, Tom Lane wrote: > Matthew Wakeling writes: > > On Fri, 3 Apr 2009, Robert Haas wrote: > >> Why not just use SQL to do the join? > > > Because the merge condition is: > > > WHERE l1.start <= l2.end AND l2.start <= l1.end > > > and merge joins in postgres only

Re: [PERFORM] Rewriting using rules for performance

2009-04-03 Thread Merlin Moncure
On Fri, Apr 3, 2009 at 11:52 AM, Merlin Moncure wrote: > On Fri, Apr 3, 2009 at 9:17 AM, Matthew Wakeling wrote: >> >> So, I have a view. The query that the view uses can be written two different >> ways, to use two different indexes. Then I use the view in another query, >> under some circumstan

Re: [PERFORM] Rewriting using rules for performance

2009-04-03 Thread Merlin Moncure
On Fri, Apr 3, 2009 at 9:17 AM, Matthew Wakeling wrote: > > So, I have a view. The query that the view uses can be written two different > ways, to use two different indexes. Then I use the view in another query, > under some circumstances the first way will be quick, and under other > circumstanc

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Merlin Moncure
On Fri, Apr 3, 2009 at 11:15 AM, Matthew Wakeling wrote: > On Fri, 3 Apr 2009, Merlin Moncure wrote: >> >> select array(SELECT location FROM location, gene WHERE >> location.subjectid = gene.id ORDER BY objectid, intermine_start, >> intermine_end)) into genes; > > Yeah, that works nicely. > >> thi

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Merlin Moncure wrote: select array(SELECT location FROM location, gene WHERE location.subjectid = gene.id ORDER BY objectid, intermine_start, intermine_end)) into genes; Yeah, that works nicely. this will make array of location records. when you access the records to do t
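Merlin's single-query array build can be sketched as a complete plpgsql fragment. This is an illustration only, assuming the `location`/`gene` schema and ordering columns discussed in the thread; the function name is hypothetical:

```sql
CREATE OR REPLACE FUNCTION build_genes() RETURNS void AS $$
DECLARE
    genes location[];
BEGIN
    -- The array is materialized by the query in one pass, rather than
    -- grown element-by-element in a loop, which avoids the O(n^2)
    -- append cost Tom describes for variable-width element types.
    genes := array(
        SELECT location
        FROM location, gene
        WHERE location.subjectid = gene.id
        ORDER BY objectid, intermine_start, intermine_end
    );
END;
$$ LANGUAGE plpgsql;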

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Merlin Moncure
On Fri, Apr 3, 2009 at 11:02 AM, Merlin Moncure wrote: > On Fri, Apr 3, 2009 at 9:32 AM, Matthew Wakeling wrote: >>  genes = '{}'; >>  next_new = 1; >>  FOR loc IN SELECT location.* FROM location, gene WHERE location.subjectid = >> gene.id ORDER BY objectid, intermine_start, intermine_end LOOP >>

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Merlin Moncure
On Fri, Apr 3, 2009 at 9:32 AM, Matthew Wakeling wrote: >  genes = '{}'; >  next_new = 1; >  FOR loc IN SELECT location.* FROM location, gene WHERE location.subjectid = > gene.id ORDER BY objectid, intermine_start, intermine_end LOOP >     genes[next_new] = loc; >     IF (next_new % 1 = 0) THE

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Tom Lane wrote: Intuitively, it seems like 1-D "overlaps" is a tractable enough operator that you should be able to make something merge-like work. But it's more complicated than I think you realize. It's tractable when the two sides are symmetrical, but not so much when t

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Tom Lane
Matthew Wakeling writes: > On Fri, 3 Apr 2009, Tom Lane wrote: >> Not unless you have sorted the inputs in some way that has more >> knowledge than the "equal" operator represents. Otherwise you can have >> elements drop out that might still be needed to match to a later >> left-hand element.

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Matthew Wakeling wrote: On Fri, 3 Apr 2009, Tom Lane wrote: Not unless you have sorted the inputs in some way that has more knowledge than the "equal" operator represents. Otherwise you can have elements drop out that might still be needed to match to a later left-hand elem

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Tom Lane wrote: Not unless you have sorted the inputs in some way that has more knowledge than the "equal" operator represents. Otherwise you can have elements drop out that might still be needed to match to a later left-hand element. Of course. You certainly have to choo

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Tom Lane
Matthew Wakeling writes: > On Fri, 3 Apr 2009, Tom Lane wrote: >> I don't actually believe that a standard merge join algorithm will work >> with an intransitive join condition ... > A standard merge join should work absolutely fine, depending on how it's > implemented. If the implementation kee

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Tom Lane wrote: Oh, hang on, I think I saw something in the docs about what conditions can be used in a merge... No, you got it right the first time. I was about to suggest that maybe you could make it work by recasting the problem as equality on an interval datatype, but t

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Tom Lane
Matthew Wakeling writes: > I have discovered that creating large arrays in plpgql is rather slow. In > fact, it seems to be O(n^2). For variable-width element types, yeah. Don't go that way. > ... alternatively is there a way to read two results streams > simultaneously? Use two cursors and
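Tom's two-cursor advice is truncated in the archive; a hypothetical sketch of the general shape, assuming the thread's `location` and `gene` tables and a simple ordering key (the match-processing step is elided):

```sql
-- Sketch only: read both result streams incrementally through cursors,
-- so neither side has to be copied into a plpgsql array first.
CREATE OR REPLACE FUNCTION merge_two_streams() RETURNS void AS $$
DECLARE
    c1 CURSOR FOR SELECT * FROM location ORDER BY subjectid;
    c2 CURSOR FOR SELECT * FROM gene ORDER BY id;
    r1 location%ROWTYPE;
    r2 gene%ROWTYPE;
BEGIN
    OPEN c1;
    OPEN c2;
    FETCH c1 INTO r1;
    FETCH c2 INTO r2;
    WHILE FOUND LOOP
        -- (matching pairs would be processed here)
        -- advance whichever stream is behind on the sort key
        IF r1.subjectid <= r2.id THEN
            FETCH c1 INTO r1;
        ELSE
            FETCH c2 INTO r2;
        END IF;
    END LOOP;
    CLOSE c1;
    CLOSE c2;
END;
$$ LANGUAGE plpgsql;
```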

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Tom Lane
Matthew Wakeling writes: > On Fri, 3 Apr 2009, Robert Haas wrote: >> Why not just use SQL to do the join? > Because the merge condition is: > WHERE l1.start <= l2.end AND l2.start <= l1.end > and merge joins in postgres only currently cope with the case where the > merge condition is an equals

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Robert Haas wrote: On Fri, Apr 3, 2009 at 9:32 AM, Matthew Wakeling wrote: I'm writing a plpgsql function that effectively does a merge join on the results of two queries. Why not just use SQL to do the join? Because the merge condition is: WHERE l1.start <= l2.end AND

Re: [PERFORM] Rewriting using rules for performance

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Robert Haas wrote: On the other hand, the query planner should be figuring out which index to use without any help from you. If it's not, something is wrong. Unfortunately it cannot tell that SELECT l1.id AS id1, l2.id AS id2 FROM location l1, location l2 WHERE l1.start <=
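The query Matthew is describing, with the overlap condition reconstructed from its full quotation elsewhere in the thread:

```sql
-- Self-join on interval overlap: two locations overlap iff each
-- starts before the other ends.
SELECT l1.id AS id1, l2.id AS id2
FROM location l1, location l2
WHERE l1.start <= l2.end
  AND l2.start <= l1.end;
```

Because neither inequality alone is an equality condition, the planner cannot use a standard merge join here, which is what drives the plpgsql discussion in the companion thread.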

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Robert Haas
On Fri, Apr 3, 2009 at 9:32 AM, Matthew Wakeling wrote: > I'm writing a plpgsql function that effectively does a merge join on the > results of two queries. Now, it appears that I cannot read the results of > two queries as streams in plpgsql, so I need to copy the contents of one > query into an

Re: [PERFORM] Rewriting using rules for performance

2009-04-03 Thread Robert Haas
On Fri, Apr 3, 2009 at 9:17 AM, Matthew Wakeling wrote: > So, I have a view. The query that the view uses can be written two different > ways, to use two different indexes. Then I use the view in another query, > under some circumstances the first way will be quick, and under other > circumstances

[PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
I'm writing a plpgsql function that effectively does a merge join on the results of two queries. Now, it appears that I cannot read the results of two queries as streams in plpgsql, so I need to copy the contents of one query into an array first, and then iterate over the second query afterwa

[PERFORM] Rewriting using rules for performance

2009-04-03 Thread Matthew Wakeling
So, I have a view. The query that the view uses can be written two different ways, to use two different indexes. Then I use the view in another query, under some circumstances the first way will be quick, and under other circumstances the second way will be quick. What I want to know is, can

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Greg Smith
On Thu, 2 Apr 2009, Scott Carey wrote: The big one, is this quote from the linux kernel list: " Right now, if you want a reliable database on Linux, you _cannot_ properly depend on fsync() or fdatasync(). Considering how much Linux is used for critical databases, using these functions, this ama

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Greg Smith
On Thu, 2 Apr 2009, James Mansion wrote: Might have to give the whole disk to ZFS with Solaris to give it confidence to enable write cache Confidence, sure, but not necessarily performance at the same time. The ZFS Kool-Aid gets bitter sometimes too, and I worry that its reputation causes p

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Greg Smith
Hannes sent this off-list, presumably via newsgroup, and it's certainly worth sharing. I've always been scared off of using XFS because of the problems outlined at http://zork.net/~nick/mail/why-reiserfs-is-teh-sukc , with more testing showing similar issues at http://pages.cs.wisc.edu/~vshree

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Mark Kirkwood
Mark Kirkwood wrote: Rebuilt with 256K chunksize: transaction type: TPC-B (sort of) scaling factor: 100 number of clients: 24 number of transactions per client: 12000 number of transactions actually processed: 288000/288000 tps = 942.852104 (including connections establishing) tps = 943.019223 (

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Hannes Dorbath
Ron Mayer wrote: Greg Smith wrote: On Wed, 1 Apr 2009, Scott Carey wrote: Write caching on SATA is totally fine. There were some old ATA drives that when paired with some file systems or OS's would not be safe. There are some combinations that have unsafe write barriers. But there is a stan