Re: [PERFORM] Best practice to load a huge table from ORACLE to PG
Jonah,

Thank you for the answer. Good to know about this EnterpriseDB feature. I'll proceed using pgloader.

Regards,
Adonias Malosso

On Sat, Apr 26, 2008 at 10:14 PM, Jonah H. Harris [EMAIL PROTECTED] wrote:

> On Sat, Apr 26, 2008 at 9:25 AM, Adonias Malosso [EMAIL PROTECTED] wrote:
> > I'd like to know what's the best practice to LOAD a 70 million row, 101 column
> > table from ORACLE to PGSQL.
>
> The fastest and easiest method would be to dump the data from Oracle into
> CSV/delimited format using something like ociuldr (http://www.anysql.net/en/ociuldr.html)
> and load it back into PG using pg_bulkload (which is a helluva lot faster than COPY).
>
> Of course, you could try other things as well... such as setting up generic
> connectivity to PG and inserting the data into a PG table over the database link.
>
> Similarly, while I hate to see shameless self-plugs in the community, the
> *fastest* method you could use is dblink_ora_copy, contained in EnterpriseDB's
> PG+ Advanced Server; it uses an optimized OCI connection to COPY the data
> directly from Oracle into Postgres, which also saves you the intermediate step
> of dumping the data.
>
> --
> Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
> EnterpriseDB Corporation                | fax: 732.331.1301
> 499 Thornall Street, 2nd Floor          | [EMAIL PROTECTED]
> Edison, NJ 08837                        | http://www.enterprisedb.com/
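For the Oracle-side dump, plain SQL*Plus spooling is a slower but generic fallback if ociuldr is not available; a minimal sketch (the table name, columns, and spool path are placeholders, not from the original post):

    -- Run in SQL*Plus against the Oracle source; bigtable and col1..col3 stand in
    -- for the real 101-column table.
    SET HEADING OFF
    SET FEEDBACK OFF
    SET PAGESIZE 0
    SET LINESIZE 32767
    SET TRIMSPOOL ON
    SPOOL /tmp/bigtable.csv
    SELECT col1 || ';' || col2 || ';' || col3 FROM bigtable;
    SPOOL OFF

The resulting delimited file can then be fed to COPY or pg_bulkload on the PostgreSQL side.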
[PERFORM] Best practice to load a huge table from ORACLE to PG
Hi All,

I'd like to know what's the best practice to LOAD a 70 million row, 101 column table from ORACLE to PGSQL. The current approach is to dump the data to CSV and then COPY it into PostgreSQL. Does anyone have a better idea?

Regards,
Adonias Malosso
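A minimal sketch of the dump-and-COPY approach described above; the file path, table name, and delimiter are assumptions:

    -- Server-side load with COPY (PostgreSQL 8.2 syntax); the target table must
    -- already exist with matching columns.
    COPY target_table FROM '/tmp/bigtable.csv' WITH DELIMITER ';' NULL '';

    -- Or client-side via psql, if the file lives on the client machine:
    -- \copy target_table from '/tmp/bigtable.csv' with delimiter ';' null ''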
[PERFORM] [PERFORMANCE] Error loading 37G CSV file: invalid string enlargement request size 65536
Hi all,

I'm running a COPY for a 37G CSV file and receiving the following error:

    invalid string enlargement request size 65536

The file has about 70 million lines with 101 columns, all of them varchar. When I run the command on the whole file I receive the error after loading about 29 million lines. So I've split the file into 10-million-line chunks with split:

    split --lines=1000

Running the COPY I receive the error on the 5th file:

    psql:/srv/www/htdocs/import/script_q2.sql:122: ERROR: invalid string enlargement request size 65536
    CONTEXT: COPY temp_q2, line 3509639: 9367276;4;DANIEL DO CARMO BARROS;31-Jan-1986;M;1;10;3162906;GILSON TEIXEIRA...

Any clues?

My PostgreSQL version is 8.2.4; the server is running SUSE Linux with 1.5GB of RAM. Relevant changes in postgresql.conf are:

    shared_buffers = 512MB
    temp_buffers = 256MB
    checkpoint_segments = 60

I'd also like to know if there's any way to optimize huge data loads in operations like these.

Regards,
Adonias Malosso
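Independently of the error itself, a few standard steps usually speed up a bulk load of this size; a sketch, assuming split's default output names and a placeholder index definition:

    -- Raise memory for the post-load index builds (session-level setting).
    SET maintenance_work_mem = '256MB';

    -- Load the split chunks into a table with no indexes or triggers, ideally in
    -- one transaction so a failure leaves nothing behind.
    BEGIN;
    COPY temp_q2 FROM '/srv/www/htdocs/import/xaa' WITH DELIMITER ';';
    COPY temp_q2 FROM '/srv/www/htdocs/import/xab' WITH DELIMITER ';';
    -- ... one COPY per chunk ...
    COMMIT;

    -- Create indexes only after the data is in, then refresh planner statistics.
    CREATE INDEX temp_q2_col1_idx ON temp_q2 (col1);  -- placeholder column
    ANALYZE temp_q2;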
[PERFORM] 4s query want to run faster
Hi all,

The following query takes about 4s to run on a server with 16GB of RAM. Any ideas why it doesn't use the indexes on the primary keys in the join conditions?

    select i.inuid, count(*) as total
    from cte.instrumentounidade i
    inner join cte.pontuacao p on p.inuid = i.inuid
    inner join cte.acaoindicador ai on ai.ptoid = p.ptoid
    inner join cte.subacaoindicador si on si.aciid = ai.aciid
    where i.itrid = 2
    and p.ptostatus = 'A'
    group by i.inuid
    having count(*) > 0

    HashAggregate  (cost=47905.87..47941.01 rows=2008 width=4)
      Filter: (count(*) > 0)
      ->  Hash Join  (cost=16307.79..46511.45 rows=185923 width=4)
            Hash Cond: (si.aciid = ai.aciid)
            ->  Seq Scan on subacaoindicador si  (cost=0.00..22812.17 rows=368817 width=4)
            ->  Hash  (cost=16211.40..16211.40 rows=38556 width=8)
                  ->  Hash Join  (cost=9018.20..16211.40 rows=38556 width=8)
                        Hash Cond: (p.inuid = i.inuid)
                        ->  Hash Join  (cost=8908.41..15419.10 rows=39593 width=8)
                              Hash Cond: (ai.ptoid = p.ptoid)
                              ->  Seq Scan on acaoindicador ai  (cost=0.00..4200.84 rows=76484 width=8)
                              ->  Hash  (cost=8678.33..8678.33 rows=92034 width=8)
                                    ->  Seq Scan on pontuacao p  (cost=0.00..8678.33 rows=92034 width=8)
                                          Filter: (ptostatus = 'A'::bpchar)
                        ->  Hash  (cost=104.78..104.78 rows=2008 width=4)
                              ->  Seq Scan on instrumentounidade i  (cost=0.00..104.78 rows=2008 width=4)
                                    Filter: (itrid = 2)
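One way to start narrowing this down is to confirm which indexes actually exist on the four tables involved; a small catalog query (schema and table names taken from the query above):

    -- List the indexes defined on the joined tables.
    SELECT tablename, indexname, indexdef
      FROM pg_indexes
     WHERE schemaname = 'cte'
       AND tablename IN ('instrumentounidade', 'pontuacao',
                         'acaoindicador', 'subacaoindicador')
     ORDER BY tablename, indexname;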
Re: [PERFORM] 4s query want to run faster
Setting random_page_cost = 2 solved the problem. Thanks.

On Thu, Feb 21, 2008 at 6:16 PM, Claus Guttesen [EMAIL PROTECTED] wrote:

> > > > why it doesn't use index for the primary keys in the join conditions?
> > >
> > > Maybe random_page_cost is set too high? What version are you using?
> >
> > Postgresql v. 8.2.1
>
> You can try to lower this value. The default (in 8.3) is 4.
>
> --
> regards
> Claus
>
> When lenity and cruelty play for a kingdom, the gentlest gamester is the
> soonest winner.
>
> Shakespeare
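For anyone following along, the setting can be tested per session before touching postgresql.conf; a minimal sketch using the query from the original post:

    -- Lower the planner's estimated cost of random I/O for this session only.
    SET random_page_cost = 2;

    -- Re-run the plan to confirm the indexes on the join keys are now chosen.
    EXPLAIN ANALYZE
    SELECT i.inuid, count(*) AS total
      FROM cte.instrumentounidade i
      JOIN cte.pontuacao p         ON p.inuid = i.inuid
      JOIN cte.acaoindicador ai    ON ai.ptoid = p.ptoid
      JOIN cte.subacaoindicador si ON si.aciid = ai.aciid
     WHERE i.itrid = 2
       AND p.ptostatus = 'A'
     GROUP BY i.inuid
    HAVING count(*) > 0;

If the plan and timing improve, random_page_cost = 2 can be made permanent in postgresql.conf and the configuration reloaded.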