Re: [HACKERS] Improving speed of copy

2002-10-06 Thread Curt Sampson

On Fri, 20 Sep 2002, Shridhar Daithankar wrote:

 On 20 Sep 2002 at 21:22, Shridhar Daithankar wrote:

  MySQL takes 221 sec. vs. 1121 sec. for PostgreSQL. For PostgreSQL,
  that is around 11.5K rows per second. Each tuple has 23 fields with a
  fixed length of around 100 bytes.

Yes, postgres is much slower than MySQL for doing bulk loading of data.
There's not much, short of hacking on the code, that can be done about
this.

 One more issue is the time taken for composite index creation. It's 4341
 sec as opposed to 436 sec for mysql. These are three non-unique
 character fields whose combination is not unique either.

Setting sort_mem appropriately makes a big difference here. I generally
bump it up to 2-8 MB for everyone, and when I'm building a big index, I
set it to 32 MB or so just for that session.

But make sure you don't set it so high you drive your system into
swapping, or it will kill your performance. Remember also, that in
7.2.x, postgres will actually use almost three times the value you give
sort_mem (i.e., sort_mem of 32 MB will actually allocate close to 96 MB
of memory for the sort).
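For example, the setting can be raised just for the session doing the build (index and table names here are hypothetical; in 7.2.x sort_mem is given in KB):

```sql
-- Raise sort_mem for this session only, just for the big index build
SET sort_mem = 32768;    -- 32 MB requested; 7.2.x may actually use ~96 MB
CREATE INDEX big_idx ON big_table (col1, col2, col3);
RESET sort_mem;          -- restore the server default afterwards
```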

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Improving speed of copy

2002-10-02 Thread Mark Kirkwood

Have you tried this with Oracle or similar commercial database?


I have timed COPY/LOAD times for Postgresql/Mysql/Oracle/Db2 -

the rough comparison is:

Db2 and Mysql fastest (Db2 slightly faster)
Oracle approx twice as slow as Db2
Postgresql about 3.5-4 times slower than Db2

However, Postgresql can sometimes create indexes faster than Mysql,
so the total time of COPY + CREATE INDEX can be smaller for
Postgresql than for Mysql.

Oracle and Db2 seemed similar to Postgresql with respect to CREATE INDEX.


regards

Mark






Re: [HACKERS] Improving speed of copy

2002-09-23 Thread Shridhar Daithankar

On 20 Sep 2002 at 18:41, Nigel J. Andrews wrote:

 On Fri, 20 Sep 2002, Shridhar Daithankar wrote:
 
  In a select test where approx. 15 rows were reported by a query on an index field, 
  mysql took 14 sec. and postgresql took 17.5 sec. Not bad, but other issues 
  eclipse the result..
 
 I don't know about anyone else, but I find this aspect strange. That's roughly
 1 second per row retrieved, which is pretty dire for an index scan. The
 data/index must be very non-unique.

Sorry for the late reply.. The numbers were scaled wrong: my friend forgot 
to add units to those numbers. The actual figures are 140 ms for mysql and 
175 ms for postgresql. Further, since the results were obtained via 'time psql', 
the higher overhead of postgres connection establishment is factored in..

Neck and neck, I would say..

Bye
 Shridhar

--
Steele's Law:   There exist tasks which cannot be done by more than ten men or 
fewer than one hundred.





Re: [HACKERS] Improving speed of copy

2002-09-21 Thread Shridhar Daithankar

On 20 Sep 2002 at 10:27, Mike Benoit wrote:

 On Fri, 2002-09-20 at 08:52, Shridhar Daithankar wrote:
 
  Besides, there is the issue of space. Mysql takes 1.4GB of space for 1.2GB of text 
  data, while postgresql takes 3.2GB. Even with the 40 bytes per row overhead 
  mentioned in the FAQ, that should come to around 1.7GB, accounting for a 40% 
  increase in size. Vacuum was run on the database.
  
 
 How did you calculate the size of the database? If you used du, make sure
 you do it in the data/base directory so as not to include the WAL files. 

OK, latest experiments: I set the number of buffers to 15K and fsync is disabled.. 
Load time is now 1250 sec.

I noticed lots of notices in the log saying 'XLogWrite: new log files created'.. 
I am pushing wal_buffers to 1000 and wal_files to 40 to test again.. I hope it 
gives me the required boost..
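As a sketch, the 7.2-era settings under test would go in postgresql.conf roughly like this (the values are simply the ones quoted above, not recommendations):

```ini
# postgresql.conf (7.2-era parameter names)
wal_buffers = 1000   # WAL buffer pages
wal_files = 40       # pre-allocated WAL segment files
```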

And BTW, about disk space usage: it's 2.6GB, with pg_xlog taking 65MB. Still 
not good..

Will keep you guys updated..

Bye
 Shridhar

--
It is necessary to have purpose.-- Alice #1, I, Mudd, stardate 4513.3





[HACKERS] Improving speed of copy

2002-09-20 Thread Shridhar Daithankar

Hi all,

While testing for large databases, I am trying to load 12.5M rows of data from 
a text file, and it takes a lot longer than mysql, even with copy.

Mysql takes 221 sec. vs. 1121 sec. for postgres. For postgresql, that is around 
11.5K rows per second. Each tuple has 23 fields with a fixed length of around 
100 bytes.

I wrote a program which does inserts in batches, but none of them reaches the 
performance of copy. I tried 1K/5K/10K/100K rows per transaction but it cannot 
exceed 2.5K rows/sec.
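For what it's worth, generating COPY input directly is usually the fastest route from a client. A minimal Python sketch (function names are my own) of rendering rows in COPY's tab-separated text format:

```python
def copy_escape(value):
    """Escape one value for PostgreSQL's COPY text format."""
    if value is None:
        return "\\N"                 # COPY spells NULL as \N
    return (str(value)
            .replace("\\", "\\\\")   # backslash first, so escapes aren't doubled
            .replace("\t", "\\t")    # tab is the field delimiter
            .replace("\n", "\\n"))   # newline is the row delimiter

def rows_to_copy(rows):
    """Render an iterable of tuples as COPY-ready tab-separated text."""
    return "".join("\t".join(copy_escape(v) for v in row) + "\n" for row in rows)

# Example: two rows of 3 fields, including a NULL and an embedded tab
print(rows_to_copy([(1, "alice", None), (2, "bob\tsmith", "x")]), end="")
```

The resulting text can then be fed to COPY tablename FROM 'file' in one shot.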

The machine is an 800MHz P-III with 512MB RAM and an IDE disk. The postmaster is 
started with 30K buffers, i.e. around 235MB of buffers. Kernel caching parameters 
are at their defaults.

Besides, there is the issue of space. Mysql takes 1.4GB of space for 1.2GB of 
text data, while postgresql takes 3.2GB. Even with the 40 bytes per row overhead 
mentioned in the FAQ, that should come to around 1.7GB, accounting for a 40% 
increase in size. Vacuum was run on the database.

Any further help? Especially if batch inserts could be sped up, that would be 
great..

Bye
 Shridhar

--
Alone, adj.:In bad company. -- Ambrose Bierce, The Devil's Dictionary





Re: [HACKERS] Improving speed of copy

2002-09-20 Thread Jonah H. Harris

Are you using copy within a transaction?

I don't know how to explain the size difference, though.  I have never seen an
overhead difference that large.  What type of MySQL tables were you using,
and what version?

Have you tried this with Oracle or similar commercial database?




Re: [HACKERS] Improving speed of copy

2002-09-20 Thread Shridhar Daithankar

On 20 Sep 2002 at 21:22, Shridhar Daithankar wrote:

 Mysql takes 221 sec. vs. 1121 sec. for postgres. For postgresql, that is around 
 11.5K rows per second. Each tuple has 23 fields with a fixed length of around 
 100 bytes.
 
 I wrote a program which does inserts in batches, but none of them reaches the 
 performance of copy. I tried 1K/5K/10K/100K rows per transaction but it cannot 
 exceed 2.5K rows/sec.

1121 sec. was the time with the postgres default of 64 buffers. With 30K buffers 
it has degraded to 1393 sec.

One more issue is the time taken for composite index creation. It's 4341 sec as 
opposed to 436 sec for mysql. These are three non-unique character fields whose 
combination is not unique either. Would an R-Tree index be a better choice?
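On the R-Tree question: in PostgreSQL, R-tree indexes apply to geometric types (box, polygon, etc.), not to character data, so the default btree composite index is the applicable choice here. A sketch, with hypothetical table and column names:

```sql
-- Default (btree) composite index over the three character fields;
-- an R-tree would not apply to plain text columns.
CREATE INDEX t_abc_idx ON t (a, b, c);
```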

In a select test where approx. 15 rows were reported by a query on an index field, 
mysql took 14 sec. and postgresql took 17.5 sec. Not bad, but other issues 
eclipse the result..

TIA once again..

Bye
 Shridhar

--
revolutionary, adj.:Repackaged.





Re: [HACKERS] Improving speed of copy

2002-09-20 Thread Jonah H. Harris

Also, did you disable fsync?




Re: [HACKERS] Improving speed of copy

2002-09-20 Thread Mike Benoit

On Fri, 2002-09-20 at 08:52, Shridhar Daithankar wrote:

 Besides, there is the issue of space. Mysql takes 1.4GB of space for 1.2GB of text 
 data, while postgresql takes 3.2GB. Even with the 40 bytes per row overhead 
 mentioned in the FAQ, that should come to around 1.7GB, accounting for a 40% 
 increase in size. Vacuum was run on the database.
 

How did you calculate the size of the database? If you used du, make sure
you do it in the data/base directory so as not to include the WAL files. 


-- 
Best Regards,
 
Mike Benoit
NetNation Communication Inc.
Systems Engineer
Tel: 604-684-6892 or 888-983-6600
 ---
 
 Disclaimer: Opinions expressed here are my own and not 
 necessarily those of my employer





Re: [HACKERS] Improving speed of copy

2002-09-20 Thread Nigel J. Andrews

On Fri, 20 Sep 2002, Shridhar Daithankar wrote:

 In a select test where approx. 15 rows were reported by a query on an index field, 
 mysql took 14 sec. and postgresql took 17.5 sec. Not bad, but other issues 
 eclipse the result..

I don't know about anyone else, but I find this aspect strange. That's roughly
1 second per row retrieved, which is pretty dire for an index scan. The
data/index must be very non-unique.


-- 
Nigel J. Andrews





Re: [HACKERS] Improving speed of copy

2002-09-20 Thread Joe Conway

Nigel J. Andrews wrote:
 On Fri, 20 Sep 2002, Shridhar Daithankar wrote:
 
In a select test where approx. 15 rows were reported by a query on an index field, 
mysql took 14 sec. and postgresql took 17.5 sec. Not bad, but other issues 
eclipse the result..
 
 I don't know about anyone else, but I find this aspect strange. That's roughly
 1 second per row retrieved, which is pretty dire for an index scan. The
 data/index must be very non-unique.
 

Yeah, I'd agree that is strange. Can we see EXPLAIN ANALYZE for that query?

Also, in one of your earlier posts you mentioned a slowdown after raising 
shared buffers from the default 64 to 30000. You might have driven the machine 
into swapping. Maybe try something more like 10000 - 15000.
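The memory involved is easy to check: each shared buffer is one 8KB page, so a quick sketch in Python:

```python
BLOCK_SIZE = 8192  # default PostgreSQL page size in bytes

def shared_buffers_mb(n_buffers):
    """Approximate shared memory used by the buffer cache, in MB."""
    return n_buffers * BLOCK_SIZE / (1024 * 1024)

print(shared_buffers_mb(30000))  # ~234 MB, matching the 235MB figure quoted earlier
print(shared_buffers_mb(15000))  # ~117 MB
```

On a 512MB machine, 234MB of shared buffers plus sort memory and the kernel's own cache leaves little headroom before swapping.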

HTH,

Joe

