Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Worky Workerson
1 0 345732 29304 770272 12946764 0 0 16 16428 1192 3105 12 2 85 1
1 0 345732 30840 770060 12945480 0 0 20 16456 1196 3151 12 2 84 1
1 0 345732 32760 769972 12943528 0 0 12 16460 1185 3103 11 2 86 1
>> iirc, he is running quad opteron 885 (8 cores), so if my math is

Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Worky Workerson
Maybe it is just the PK *build* that slows it down, but I just tried some small scale experiments on my MacBook Pro laptop (which has the same disk performance as your server) and I get only a 10-15% slowdown from having a PK on an integer column. The 10-15% slowdown was on 8.1.5 MPP, so it used

Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Worky Workerson
> I'm guessing the high bursts are checkpoints. Can you check your log
> files for pg and see if you are getting warnings about checkpoint
> frequency? You can get some mileage here by increasing wal files.

Nope, nothing in the log. I have set:
wal_buffers=128
checkpoint_segments=128

Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Worky Workerson
> And here are the dd results for 16GB RAM, i.e. 4,000,000 8K blocks:

So, if we divide 32,000 MB by the real time, we get:
/data (data): 89 MB/s write, 38 MB/s read
... snip ...
The read speed on your /data volume is awful to the point where you should consider it broken and find a fix. A quick
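The arithmetic behind those figures is just volume over elapsed time. A minimal sketch follows; the actual `real` times from dd are truncated out of the excerpt, so the elapsed times below are back-derived from the quoted 89 MB/s and 38 MB/s, purely for illustration.

```python
# Throughput math behind the dd numbers: MB transferred / elapsed seconds.
# The actual `real` times are not in the excerpt; the times below are
# back-computed from the quoted 89 MB/s and 38 MB/s figures.

def throughput_mb_s(total_mb: float, elapsed_s: float) -> float:
    """dd moves total_mb in elapsed_s; throughput is the simple ratio."""
    return total_mb / elapsed_s

TOTAL_MB = 32_000  # 4,000,000 blocks * 8 KB = 32,000 MB, as quoted

write_s = TOTAL_MB / 89  # ~360 s, implied by the 89 MB/s write figure
read_s = TOTAL_MB / 38   # ~842 s, implied by the 38 MB/s read figure

print(round(throughput_mb_s(TOTAL_MB, write_s)))  # 89
print(round(throughput_mb_s(TOTAL_MB, read_s)))   # 38
```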

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Worky Workerson
On 10/27/06, Merlin Moncure <[EMAIL PROTECTED]> wrote:
> > r b swpd free buff cache si so bi bo in cs us sy id wa
> > 1 0 345732 29328 770980 12947212 0 0 20 16552 1223 3677 12 2 85 1
> > 1 0 345732 29840 770520 12946924 0 0 20 29244 1283 2955 11 2 85 1
> > 1 0 345732 32144

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Worky Workerson
Worky (that your real name? :-) Nope, it's Mike. worky.workerson is just the email that I use for "work" :) How many CPUs on the machine? Can you send the result of "cat /proc/cpuinfo"? Not at work at the moment, however I do have quad dual-core opterons, like Merlin mentioned. Is your "c

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Worky Workerson
The read speed on your /data volume is awful to the point where you should consider it broken and find a fix. A quick comparison: the same number on a 16 drive internal SATA array with 7200 RPM disks gets 950 MB/s read, about 25 times faster for about 1/4 the price. I'm hoping that the poor per

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Worky Workerson
I do have a dirty little secret, one which I wasn't completely aware of until a little while ago. Apparently, someone decided to install Oracle on the server, and use the SAN as the primary tablespace, so that might have something to do with the poor performance of the SAN. At least, I'm hoping t

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Worky Workerson
Merlin/Luke: > in theory, with 10 10k disks in raid 10, you should be able to keep > your 2fc link saturated all the time unless your i/o is extremely > random. random i/o is the wild card here, ideally you should see at > least 2000 seeks in bonnie...let's see what comes up. I suspect the pr

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Worky Workerson
> I'm guessing the high bursts are checkpoints. Can you check your log
> files for pg and see if you are getting warnings about checkpoint
> frequency? You can get some mileage here by increasing wal files.

Nope, nothing in the log. I have set:
wal_buffers=128
checkpoint_segments=128
checkpoint_ti
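For reference, those settings read like this in postgresql.conf (8.1-era parameters; the `checkpoint_ti` value is cut off in the excerpt, so the timeout shown is only the 8.1 default, not necessarily what was actually used):

```
# postgresql.conf (PostgreSQL 8.1-era parameters)
wal_buffers = 128           # WAL buffers, in 8 kB pages
checkpoint_segments = 128   # 16 MB WAL segments between checkpoints
checkpoint_timeout = 300    # seconds; the real value is truncated in the
                            # excerpt -- 300 is merely the 8.1 default
```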

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Worky Workerson
On 10/25/06, Craig A. James <[EMAIL PROTECTED]> wrote: Jim C. Nasby wrote: > Well, given that perl is using an entire CPU, it sounds like you should > start looking either at ways to remove some of the overhead from perl, > or to split that perl into multiple processes. I use Perl for big databa

Re: [PERFORM] Best COPY Performance

2006-10-24 Thread Worky Workerson
http://stats.distributed.net used to use a perl script to do some transformations before loading data into the database. IIRC, when we switched to using C we saw 100x improvement in speed, so I suspect that if you want performance perl isn't the way to go. I think you can compile perl into C, so m
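The preprocessing step being debated can be sketched in Python as a stand-in for the Perl (or C) transform. The record fields here are invented for illustration; only the escaping rules (tab delimiter, backslash escapes, and `\N` for NULL) are PostgreSQL's standard COPY text format.

```python
import io

# Sketch of the transform-then-load pattern: convert raw records into
# PostgreSQL COPY text format before feeding them to COPY FROM STDIN.
# Field names are hypothetical; the escaping rules are standard COPY text.

def copy_escape(value) -> str:
    """Escape one field for COPY's default text format."""
    if value is None:
        return r"\N"
    return (str(value)
            .replace("\\", "\\\\")
            .replace("\t", "\\t")
            .replace("\n", "\\n"))

def to_copy_line(record: dict) -> str:
    """Turn one parsed record into a tab-delimited COPY line."""
    fields = (record.get("host"), record.get("bytes"), record.get("note"))
    return "\t".join(copy_escape(f) for f in fields) + "\n"

buf = io.StringIO()
for rec in [{"host": "10.0.0.1", "bytes": 200, "note": None},
            {"host": "10.0.0.2", "bytes": 350, "note": "tab\there"}]:
    buf.write(to_copy_line(rec))

# buf.getvalue() can now be streamed to e.g. psql -c "COPY t FROM STDIN".
```

Whether this layer is the bottleneck is exactly the per-CPU question raised above: if the transform saturates one core, splitting it across processes (or rewriting it in C) is where the 100x-type wins come from.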

Re: [PERFORM] Best COPY Performance

2006-10-24 Thread Worky Workerson
Markus, Could you COPY one of your tables out to disk via psql, and then COPY it back into the database, to reproduce this measurement with your real data?

$ psql -c "COPY my_table TO STDOUT" > my_data
$ ls my_data
2018792 edgescape_pg_load
$ time cat my_data | psql -c "COPY mytable FROM STDIN

Re: [PERFORM] Best COPY Performance

2006-10-23 Thread Worky Workerson
> I am most interested in loading two tables, one with about 21 (small) > VARCHARs where each record is about 200 bytes, and another with 7 > INTEGERs, 3 TIMESTAMPs, and 1 BYTEA where each record is about 350 > bytes. indexes/keys? more memory for sorting during index creation can have a dramati
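The "more memory for sorting during index creation" suggestion maps to a single parameter; the value below is only an example, not a recommendation from the thread:

```
# postgresql.conf -- memory for sorts during CREATE INDEX, VACUUM, etc.
# (illustrative value; 8.1-era integer units are kB)
maintenance_work_mem = 524288   # 512 MB
```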

[PERFORM] Best COPY Performance

2006-10-20 Thread Worky Workerson
What is the best COPY performance that you have gotten on a "normal" table? I know that this question is almost too general, but it might help me out a bit, or at least give me the right things to tweak. Perhaps the question can be rewritten as "Where are the major bottlenecks in a COPY?" or

[PERFORM] Self-join query and index usage

2006-07-14 Thread worky . workerson
I'm doing a self join of some shipping data and wanted to get the best query possible. The interesting table is the event table, and it has the following structure: startnode int, endnode int, weight int, starttime timestamp, endtime timestamp and the query that I would like to run is:
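The query itself is cut off in the archive, so the following is only a plausible reconstruction of such a self-join, using stdlib `sqlite3` as a stand-in for PostgreSQL: chain shipping legs where a second event departs from the node where a first event arrived, no earlier than the first event's end time. The join condition and index choice are assumptions, not the original poster's query.

```python
import sqlite3

# Hypothetical reconstruction of the self-join on the event table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (
    startnode INTEGER,
    endnode   INTEGER,
    weight    INTEGER,
    starttime TEXT,   -- TIMESTAMP in the original schema
    endtime   TEXT
);
-- An index on the join/filter columns lets the inner side of the self-join
-- be an index scan rather than a full-table nested loop (assumed goal).
CREATE INDEX event_startnode_starttime ON event (startnode, starttime);
""")
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2, 10, "2006-07-01 08:00", "2006-07-01 12:00"),
        (2, 3,  5, "2006-07-01 13:00", "2006-07-01 18:00"),
        (2, 3,  5, "2006-07-01 09:00", "2006-07-01 11:00"),  # departs too early
    ],
)
rows = conn.execute("""
    SELECT a.startnode, a.endnode, b.endnode
    FROM event a
    JOIN event b
      ON b.startnode = a.endnode
     AND b.starttime >= a.endtime
""").fetchall()
# Only the 13:00 departure chains onto the 08:00-12:00 leg: (1, 2, 3)
```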

[PERFORM] Bulk loading/merging

2006-05-30 Thread Worky Workerson
I've set up something similar to the 'recommended' way to merge data into the DB, i.e. http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING however I did it with a trigger on insert, i.e. (not my schema :) ): CREATE TABLE db (a INT PRIMARY KEY, b TEX

Re: [PERFORM] Bulk loading/merging

2006-05-30 Thread Worky Workerson
Another little question ... would using any sort of TEMP table help out, i.e. loading the unaggregated data into a TEMP table, aggregating the data via a SELECT INTO another TEMP table, and then finally INSERT ... SELECT into the master, aggregated, triggered table?  It seems like this might be a w
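The staged-load idea in that question can be sketched like this, with stdlib `sqlite3` standing in for PostgreSQL and all table and column names invented (SQLite has no `SELECT INTO`, so `CREATE TEMP TABLE ... AS SELECT` plays that role):

```python
import sqlite3

# Sketch of the staged load: bulk-load raw rows into a temp table,
# aggregate into a second temp table, then INSERT ... SELECT into the
# master table in one set-based statement (which, in the original setup,
# is the table carrying the merge trigger).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master (k INTEGER PRIMARY KEY, total INTEGER);
CREATE TEMP TABLE raw (k INTEGER, v INTEGER);
""")
conn.executemany("INSERT INTO raw VALUES (?, ?)",
                 [(1, 5), (1, 7), (2, 3)])
conn.executescript("""
-- aggregate the unaggregated load into a second temp table
CREATE TEMP TABLE agg AS SELECT k, SUM(v) AS total FROM raw GROUP BY k;
-- one set-based insert into the master table
INSERT INTO master SELECT k, total FROM agg;
""")
totals = dict(conn.execute("SELECT k, total FROM master ORDER BY k"))
# totals now maps each key to its aggregated sum: {1: 12, 2: 3}
```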