Re: [PERFORM] Best practice to load a huge table from ORACLE to PG
On Sat, Apr 26, 2008 at 9:25 AM, Adonias Malosso <[EMAIL PROTECTED]> wrote:
> I'd like to know the best practice to LOAD a 70 million row, 101 column
> table from ORACLE to PGSQL.

The fastest and easiest method would be to dump the data from Oracle into
CSV/delimited format using something like ociuldr
(http://www.anysql.net/en/ociuldr.html) and load it back into PG using
pg_bulkload (which is a helluva lot faster than COPY).

Of course, you could try other things as well... such as setting up generic
connectivity to PG and inserting the data into a PG table over the database
link.

Similarly, while I hate to see shameless self-plugs in the community, the
*fastest* method you could use is dblink_ora_copy, contained in
EnterpriseDB's PG+ Advanced Server; it uses an optimized OCI connection to
COPY the data directly from Oracle into Postgres, which also saves you the
intermediate step of dumping the data.

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation                | fax: 732.331.1301
499 Thornall Street, 2nd Floor          | [EMAIL PROTECTED]
Edison, NJ 08837                        | http://www.enterprisedb.com/
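To illustrate the same "no intermediate file" idea without dblink_ora_copy, here is a rough Java sketch that streams rows out of Oracle over JDBC and pushes them straight into PostgreSQL's COPY protocol. It assumes a PostgreSQL JDBC driver new enough to provide org.postgresql.copy.CopyManager; the connection strings, table name, and the naive CSV handling are placeholders, not a tested implementation:

// Illustrative sketch only (not dblink_ora_copy): stream rows from Oracle
// straight into PostgreSQL's COPY protocol, skipping the intermediate file.
import java.io.StringReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.postgresql.copy.CopyManager;
import org.postgresql.core.BaseConnection;

public class OracleToPostgresCopy {
    public static void main(String[] args) throws Exception {
        Connection ora = DriverManager.getConnection(
                "jdbc:oracle:thin:@orahost:1521:ORCL", "scott", "tiger");
        Connection pg = DriverManager.getConnection(
                "jdbc:postgresql://pghost/targetdb", "postgres", "secret");

        CopyManager copy = new CopyManager((BaseConnection) pg);
        Statement st = ora.createStatement();
        st.setFetchSize(10000);            // stream rows; don't buffer 70M of them
        ResultSet rs = st.executeQuery("SELECT * FROM big_table");
        int cols = rs.getMetaData().getColumnCount();

        StringBuilder chunk = new StringBuilder();
        while (rs.next()) {
            for (int i = 1; i <= cols; i++) {
                if (i > 1) chunk.append(',');
                String v = rs.getString(i);
                if (v != null) chunk.append(v);   // naive: no quoting of , or \n
            }
            chunk.append('\n');
            if (chunk.length() > 1000000) {       // hand COPY ~1MB at a time
                copy.copyIn("COPY big_table FROM STDIN WITH CSV",
                            new StringReader(chunk.toString()));
                chunk.setLength(0);
            }
        }
        if (chunk.length() > 0) {
            copy.copyIn("COPY big_table FROM STDIN WITH CSV",
                        new StringReader(chunk.toString()));
        }
        rs.close(); st.close(); ora.close(); pg.close();
    }
}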
Re: [PERFORM] Re: [HACKERS] [COMMITTERS] pgsql: Fix TransactionIdIsCurrentTransactionId() to use binary search
Robert Treat <[EMAIL PROTECTED]> writes:
> On Saturday 26 April 2008 13:26, Tom Lane wrote:
>> Oh, you failed to state that the dtrace output was post-patch. You need
>> to show *pre* patch dtrace output if you want us to think it relevant.

> Please read up-thread.

Sorry, I'd forgotten your previous post.

I poked around for calls to TransactionIdIsCurrentTransactionId that are in
current code and weren't in 8.1. I found these:

src/backend/commands/analyze.c:965:
    if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(targtuple.t_data)))
src/backend/commands/analyze.c:984:
    if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(targtuple.t_data)))
src/backend/commands/cluster.c:803:
    if (!TransactionIdIsCurrentTransactionId(
src/backend/commands/cluster.c:816:
    if (!TransactionIdIsCurrentTransactionId(
src/backend/storage/ipc/procarray.c:374:
    if (TransactionIdIsCurrentTransactionId(xid))
src/backend/utils/time/combocid.c:108:
    Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tup)));
src/backend/utils/time/combocid.c:123:
    Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tup)));
src/backend/utils/time/combocid.c:156:
    TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tup)))

The ANALYZE and CLUSTER calls are not likely to be your issue, but the one
in HeapTupleHeaderAdjustCmax could get called a lot, and the one in
TransactionIdIsInProgress definitely will get called a lot. Neither of those
calls existed in 8.2. So I think that explains why
TransactionIdIsCurrentTransactionId has become more performance-critical in
8.3 than it was before.

Will apply the back-patch.

			regards, tom lane
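The gist of the change: TransactionIdIsCurrentTransactionId has to decide whether a given XID belongs to the current transaction's (sub)transaction set, and the patch replaces a linear scan of that set with a binary search over the sorted list. The real function is C inside the backend, so the following Java fragment is only a schematic illustration of the before/after difference:

// Schematic only -- the point is the complexity difference when the current
// transaction has many subtransaction XIDs and the check runs once per
// tuple visibility test.
import java.util.Arrays;

public class XidLookup {
    private final long[] currentXids;   // stands in for the sorted (sub)xid list

    public XidLookup(long[] sortedXids) { this.currentXids = sortedXids; }

    // Pre-patch style: linear scan, O(n) per call.
    public boolean isCurrentLinear(long xid) {
        for (long x : currentXids) {
            if (x == xid) return true;
        }
        return false;
    }

    // Post-patch style: binary search over the sorted list, O(log n) per call.
    public boolean isCurrentBinary(long xid) {
        return Arrays.binarySearch(currentXids, xid) >= 0;
    }
}

With thousands of open subtransactions and a visibility check per tuple, the gap between the two lookups is roughly the gap Robert reports between "bogging down for hours" and the function barely showing up on the radar.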
Re: [PERFORM] Best practice to load a huge table from ORACLE to PG
Joshua D. Drake wrote:
> Potluri Srikanth wrote:
>> But how do we link an Oracle trigger to a Postgres trigger? I mean: the
>> Oracle trigger will take note of what has been changed, but then how do
>> we pass those changes to the Postgres trigger?
>
> I am assuming you can use the Java trigger from Oracle to load the
> PostgreSQL JDBC driver, make a connection to PostgreSQL and perform
> whatever statement needs to be done.

Note that this will be rather inefficient if you're obtaining a new
connection every time. It looks like Oracle's Java stored procedures and
triggers run in an appserver-like environment, though, so you should be able
to use a connection pool, JNDI, or similar.

Some Java stored procedure examples:

http://www.oracle.com/technology/sample_code/tech/java/jsp/oracle9ijsp.html

You could also use a Java trigger to send a simpler change message, with a
serialized row if required, to an external app that's responsible for
updating the PostgreSQL database. That might cause less load on the DB
server. The trouble with this approach, though, is that it might be hard to
get right when transactions roll back.

An alternative is to use an Oracle trigger that inserts records into a
change tracking / audit table. You can then periodically read and clear the
audit table, using that change history data to update the PostgreSQL
database. This method has the advantage of being transaction safe, as data
will never become visible in the audit table until the transaction making
the changes has committed.

--
Craig Ringer
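A minimal sketch of the JDBC side of that Java-trigger approach, assuming the standard PostgreSQL JDBC driver; the table, columns, credentials, and per-call connection are hypothetical, and a real deployment would reuse a pooled connection rather than reconnecting per row, for the efficiency reasons noted above:

// Sketch of the stored-procedure side of a Java trigger that mirrors an
// insert into PostgreSQL over JDBC. Names and connection handling are
// placeholders; rollback of the Oracle transaction is not handled here.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class PgMirror {
    public static void mirrorInsert(long id, String name,
                                    java.math.BigDecimal amount) throws Exception {
        Connection pg = DriverManager.getConnection(
                "jdbc:postgresql://pghost/targetdb", "replicator", "secret");
        try {
            PreparedStatement ps = pg.prepareStatement(
                    "INSERT INTO accounts (id, name, amount) VALUES (?, ?, ?)");
            ps.setLong(1, id);
            ps.setString(2, name);
            ps.setBigDecimal(3, amount);
            ps.executeUpdate();
            ps.close();
        } finally {
            pg.close();
        }
    }
}

The method would then be exposed to PL/SQL as a Java call spec and invoked from a row-level trigger (see the Oracle examples linked above). The rollback caveat applies here too: if the Oracle transaction aborts after the trigger has fired, the mirrored insert has to be undone somehow, which is exactly why the audit-table approach is easier to make transactionally safe.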
Re: [PERFORM] Re: [HACKERS] [COMMITTERS] pgsql: Fix TransactionIdIsCurrentTransactionId() to use binary search
On Saturday 26 April 2008 13:26, Tom Lane wrote:
> Robert Treat <[EMAIL PROTECTED]> writes:
>> On Friday 25 April 2008 17:32, Tom Lane wrote:
>>> Robert Treat <[EMAIL PROTECTED]> writes:
>>>> Oddly some dtrace profiling gave me this, which is pretty different,
>>>> but certainly doesn't have concerns about
>>>> TransactionIdIsCurrentTransactionId
>>>
>>> which seems to pretty much destroy your thesis, no?
>>
>> How so? Before the patch we bog down for hours, spending 99% of our time
>> in TransactionIdIsCurrentTransactionId; after the patch everything
>> performs well (really better than before) and we spend so little time in
>> TransactionIdIsCurrentTransactionId it barely shows up on the radar.
>
> Oh, you failed to state that the dtrace output was post-patch. You need
> to show *pre* patch dtrace output if you want us to think it relevant.

Please read up-thread.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [PERFORM] Re: [HACKERS] [COMMITTERS] pgsql: Fix TransactionIdIsCurrentTransactionId() to use binary search
Robert Treat <[EMAIL PROTECTED]> writes:
> On Friday 25 April 2008 17:32, Tom Lane wrote:
>> Robert Treat <[EMAIL PROTECTED]> writes:
>>> Oddly some dtrace profiling gave me this, which is pretty different, but
>>> certainly doesn't have concerns about TransactionIdIsCurrentTransactionId
>>
>> which seems to pretty much destroy your thesis, no?

> How so? Before the patch we bog down for hours, spending 99% of our time
> in TransactionIdIsCurrentTransactionId; after the patch everything
> performs well (really better than before) and we spend so little time in
> TransactionIdIsCurrentTransactionId it barely shows up on the radar.

Oh, you failed to state that the dtrace output was post-patch. You need to
show *pre* patch dtrace output if you want us to think it relevant.

			regards, tom lane
Re: [PERFORM] Best practice to load a huge table from ORACLE to PG
Yep, just do something like this within sqlplus (from
http://www.dbforums.com/showthread.php?t=350614):

set termout off
set hea off
set pagesize 0
spool c:\whatever.csv
select a.a||','||a.b||','||a.c from a where a.a='whatever';
spool off

COPY is the fastest approach to get it into PG.

- Luke

On 4/26/08 6:25 AM, "Adonias Malosso" <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I'd like to know the best practice to LOAD a 70 million row, 101 column
> table from ORACLE to PGSQL.
>
> The current approach is to dump the data in CSV and then COPY it to
> Postgresql.
>
> Does anyone have a better idea?
>
> Regards
> Adonias Malosso
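A hedged sketch of the load side once the spooled file is visible to the PostgreSQL server: a single COPY statement does it, issued here through JDBC. The path and table name are placeholders; a server-side COPY ... FROM 'file' needs superuser rights and a path readable by the server process (otherwise use psql's \copy or COPY FROM STDIN):

// Load the spooled CSV with one server-side COPY statement over JDBC.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class LoadSpooledCsv {
    public static void main(String[] args) throws Exception {
        Connection pg = DriverManager.getConnection(
                "jdbc:postgresql://pghost/targetdb", "postgres", "secret");
        Statement st = pg.createStatement();
        st.execute("COPY big_table FROM '/var/load/whatever.csv' WITH CSV");
        st.close();
        pg.close();
    }
}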
Re: [PERFORM] Best practice to load a huge table from ORACLE to PG
Potluri Srikanth wrote:
> But how do we link an Oracle trigger to a Postgres trigger? I mean: the
> Oracle trigger will take note of what has been changed, but then how do
> we pass those changes to the Postgres trigger?

I am assuming you can use the Java trigger from Oracle to load the
PostgreSQL JDBC driver, make a connection to PostgreSQL and perform
whatever statement needs to be done.

Sincerely,

Joshua D. Drake

P.S. It is possible that Oracle can't do this (I don't know)
Re: [PERFORM] Best practice to load a huge table from ORACLE to PG
But how do we link an Oracle trigger to a Postgres trigger? I mean: the
Oracle trigger will take note of what has been changed, but then how do we
pass those changes to the Postgres trigger?

Can you suggest any logic or algorithm?

Regards,
Srikanth K Potluri
+63 9177444783 (Philippines)

On Sat 26/04/08 8:40 PM, "Joshua D. Drake" [EMAIL PROTECTED] sent:
> Adonias Malosso wrote:
>> Hi All,
>>
>> I'd like to know the best practice to LOAD a 70 million row, 101 column
>> table from ORACLE to PGSQL.
>>
>> The current approach is to dump the data in CSV and then COPY it to
>> Postgresql.
>>
>> Does anyone have a better idea?
>
> Write a Java trigger in Oracle that notes when a row has been
> added/deleted/updated and does the exact same thing in PostgreSQL.
>
> Joshua D. Drake
Re: [PERFORM] Best practice to load a huge table from ORACLE to PG
Adonias Malosso wrote:
> Hi All,
>
> I'd like to know the best practice to LOAD a 70 million row, 101 column
> table from ORACLE to PGSQL.
>
> The current approach is to dump the data in CSV and then COPY it to
> Postgresql.
>
> Does anyone have a better idea?

Write a Java trigger in Oracle that notes when a row has been
added/deleted/updated and does the exact same thing in PostgreSQL.

Joshua D. Drake

> Regards
> Adonias Malosso
Re: [PERFORM] Oddly slow queries
On 22.04.2008, at 17:25, Scott Marlowe wrote:
> On Tue, Apr 22, 2008 at 7:42 AM, Thomas Spreng <[EMAIL PROTECTED]> wrote:
>> I think I'll upgrade PostgreSQL to the latest 8.3 version in the next
>> few days anyway, along with a memory upgrade (from 1.5GB to 4GB) and a
>> new 2x RAID-1 (instead of RAID-5) disk configuration. I hope that this
>> already has a noticeable impact on the performance.
>
> Note that if you have a good RAID controller with battery backed cache
> and write back enabled, then you're probably better off / at least as
> well off using four disks in a RAID-10 than two separate RAID-1 sets
> (one for xlog and one for data).

I just wanted to let you know that upgrading Postgres from 8.1 to 8.3, RAM
from 1.5GB to 4GB and changing from a 3-disk RAID 5 to 2x RAID 1 (OS & WAL,
tablespace) led to a significant speed increase. What's especially
important is that those randomly slow queries seem to be gone (for now).

Thanks for all the help.

Cheers,

Tom
[PERFORM] Best practice to load a huge table from ORACLE to PG
Hi All,

I'd like to know the best practice to LOAD a 70 million row, 101 column
table from ORACLE to PGSQL.

The current approach is to dump the data in CSV and then COPY it to
Postgresql.

Does anyone have a better idea?

Regards
Adonias Malosso
Re: [PERFORM] Re: [HACKERS] [COMMITTERS] pgsql: Fix TransactionIdIsCurrentTransactionId() to use binary search
On Friday 25 April 2008 17:32, Tom Lane wrote:
> Robert Treat <[EMAIL PROTECTED]> writes:
>> Oddly some dtrace profiling gave me this, which is pretty different, but
>> certainly doesn't have concerns about TransactionIdIsCurrentTransactionId
>
> which seems to pretty much destroy your thesis, no?

How so? Before the patch we bog down for hours, spending 99% of our time in
TransactionIdIsCurrentTransactionId; after the patch everything performs
well (really better than before) and we spend so little time in
TransactionIdIsCurrentTransactionId it barely shows up on the radar.

Note I'm open to the idea that TransactionIdIsCurrentTransactionId itself is
not the problem, but that something else changed between 8.1 and 8.3 that
exposes TransactionIdIsCurrentTransactionId as a problem. Changing to a
binary search for TransactionIdIsCurrentTransactionId makes that a non-issue
though.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL