I have multiple ETL processes that are inserting data at the same time.  Some 
of the tables that this data is being inserted into have FK constraints that 
must be maintained during the ETL process.  See pseudocode below for an example:

create table employee (
   employee_id integer,
   employee_nbr integer unique constraint,
   employee_name varchar(50));

create table sales (
   sales_id integer,
   employee_id integer references employee.employee_id,
   units integer);

create table time_clock (
   time_clock_id integer,
   employee_id integer references employee.employee_id,
   hours integer);

During the ETL process for Sales, we need to verify that the Employee record 
already exists, and if not then we insert it before writing the Sales record so 
that we can satisfy the FK.  The same process happens during the import of the 
TimeClock data.

The problem occurs when a new employee appears in both the Sales and TimeClock 
data and are processed at the same time in different transactions. The process 
looks like this:

T1: Start transaction
T2: Start transaction
T1: Process Sale record
T1: Verify Employee - does not exist
T1: Insert Employee record
T1: Insert Sale record
T1: Process next Sale record
T2: Process Timeclock record
T2: Verify Employee - does not exist (can't see other transaction)
T2: Insert Employee record - violation of unique constraint

Both transactions are using READ_COMMITTED at this point, which is of course 
the problem.  I can't commit either transaction because the entire Sale or 
Timeclock dataset must succeed or fail.

Any ideas on how to do this type of process successfully?

Thank you.

Reply via email to