[rhino-tools-dev] Running several etl processes at once

Miles Waller Tue, 18 Jan 2011 03:34:26 -0800

Hi,

I have a long-running process (about an hour per night) that loads and
transforms data for a data warehouse.  It takes this long mostly because I
run it completely single-threaded (to avoid memory issues), and because the
run consists of several ETLs which I run one at a time rather than
concurrently.  In this arrangement, the whole process appears to be IO-bound:
the limit is drive speed when writing the final output files (text format),
which are needed for audit purposes.


The machine has several separate drives so I thought I could get a
significant boost by running several ETLs at once, on separate threads, and
arranging for all the input/output files to be on different drives.  So far
it looks quite promising.  However, am I likely to run into any threading
issues with this configuration that might cause bad data to come out?  I
don't think so, but given how hard such problems can be to spot until it's
too late, I wondered if anyone else had an opinion.
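
As a rough illustration of the arrangement described above, here is a
minimal sketch in Python.  It is not Rhino ETL code: run_etl, run_all, the
job names, and the file names are all hypothetical, and the doubling step
merely stands in for a real transform.  The assumption it encodes is that
each ETL keeps all of its state local and writes only to its own output
file, so the threads share nothing.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def run_etl(name, rows, out_path):
    """Hypothetical stand-in for one ETL: transform rows, write an audit file."""
    # All state is local to this call; nothing is shared between threads.
    transformed = [f"{name},{value * 2}" for value in rows]
    with open(out_path, "w", encoding="utf-8") as out:
        out.write("\n".join(transformed) + "\n")
    return name, len(transformed)


def run_all(jobs, max_workers=None):
    """Run each (name, rows, out_path) job on a worker thread; return row counts."""
    workers = max_workers or max(1, len(jobs))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_etl, *job) for job in jobs]
        # result() re-raises any exception from a worker, so failures are not silent.
        return dict(f.result() for f in futures)


if __name__ == "__main__":
    # In the real setup each output path would sit on a different physical drive.
    base = tempfile.mkdtemp()
    jobs = [
        ("customers", [1, 2, 3], os.path.join(base, "customers.txt")),
        ("orders", [10, 20], os.path.join(base, "orders.txt")),
    ]
    print(run_all(jobs))
```

In a layout like this, bad data would most likely come from anything the
jobs share (a static cache, a common connection, a shared output file)
rather than from the threading itself, so that is where I would look first.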

Cheers,

Miles

