Hi,

> My only concern would be that someone might come along and use the
> application to insert a person while my ETL process is running,
> causing one of my inserts to fail. I guess I could trap that exception
> and update my internal ID counter or something...
The bulk upload is not easy to troubleshoot - its error messages are at best opaque... Thinking off the top of my head: if you split the process into separate transform and load steps, and the transform takes most of the time while the load is quick, that should shrink the window in which such a conflict can occur. You could then read a base key value at the start of the load and add it to all of the calculated key values as you insert the rows (locking the table at the same time).

> My last question is regarding your statement "and then use the output
> of the 'person' pipeline as input to a join in the 'person_phone'
> pipeline". I thought joins were for taking two rows with different
> columns and joining them into a merged row with all of the columns. Is
> there an example anywhere of using joins to represent parent/child,
> one-to-many relationships?

Yes, but if I remember correctly the rows are cached as they are loaded, so you can have multiple matches... My memory is a little hazy, so perhaps check the code? (Also, if so, ignore my comment about memory use - sorting will probably give you quicker lookups, but it won't stop rows building up in memory.)

Miles

--
You received this message because you are subscribed to the Google Groups "Rhino Tools Dev" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/rhino-tools-dev?hl=en.
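P.S. In case it helps, here is a minimal sketch of the base-key offsetting idea described above. It is not Rhino ETL code - the table names, columns, and the use of SQLite are my own assumptions purely for illustration. The transform phase assigns provisional keys 0..n-1; the load phase takes a write lock, reads MAX(id) once as the base, and offsets every parent and child key by it before inserting:

```python
# Hypothetical load step: provisional keys from the transform phase are
# shifted by a base value read inside a single locking transaction, so
# concurrent application inserts cannot collide with the ETL's keys.
import sqlite3

def load(conn, person_rows, phone_rows):
    """person_rows: [(provisional_id, name)];
    phone_rows: [(provisional_person_id, phone)]."""
    cur = conn.cursor()
    cur.execute("BEGIN IMMEDIATE")  # take the write lock for the whole load
    # Base key: one read at the start of the load, under the lock.
    base = cur.execute("SELECT COALESCE(MAX(id), 0) FROM person").fetchone()[0] + 1
    # Offset the parent keys...
    cur.executemany(
        "INSERT INTO person (id, name) VALUES (?, ?)",
        [(base + pid, name) for pid, name in person_rows],
    )
    # ...and the same offset keeps the child rows pointing at their parents.
    cur.executemany(
        "INSERT INTO person_phone (person_id, phone) VALUES (?, ?)",
        [(base + pid, phone) for pid, phone in phone_rows],
    )
    conn.commit()
```

Because parent and child rows share the same provisional ids, one offset preserves the one-to-many relationship without any per-row lookups.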
