> On June 28, 2017, 11:14 a.m., Peter Bacsko wrote:
> > tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
> > Lines 353 (patched)
> > <https://reviews.apache.org/r/52782/diff/3/?file=1763480#file1763480line374>
> >
> >     Do we have to instantiate the batch handling mechanism just for the
> >     sake of a Tx begin/commit?
>
> András Piros wrote:
>     I think `BatchTransactionGuard` is pretty lightweight, and it delivers
>     some statistics that may be of good use. It's also a good practice to have
>     appropriate levels of abstraction, and use them.

To me it's a bit distracting. If I read the code, it's not immediately clear why we have a "batch handler" if we don't do any batching. We already have the entityManager instance in use in the same scope.


> On June 28, 2017, 11:14 a.m., Peter Bacsko wrote:
> > tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
> > Lines 395 (patched)
> > <https://reviews.apache.org/r/52782/diff/3/?file=1763480#file1763480line416>
> >
> >     Define the size of the list?
>
> András Piros wrote:
>     To know the exact size we need to iterate through the file anyway,
>     unfortunately. In practice I didn't encounter measurable performance
>     degradation because of using an auto-growing `ArrayList`.

I was thinking about a practical value like 1024 or 2048, but never mind.


- Peter


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review179086
-----------------------------------------------------------


On July 4, 2017, 10:09 a.m., András Piros wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> -----------------------------------------------------------
>
> (Updated July 4, 2017, 10:09 a.m.)
>
>
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and
> Robert Kanter.
>
>
> Repository: oozie-git
>
>
> Description
> -------
>
> We import every 1000 rows in a separate JPA `EntityTransaction` to reduce
> heap usage. Furthermore, if at least one row inside that tx fails, we retry
> the whole batch, each row in its own `EntityTransaction`.
>
> The following error handling is implemented:
>
> 1. check if all necessary tables are present and empty
> 2. rows are imported until the end even if some rows are skipped along
> the way
> 3. if at least one row is skipped along the way because of a
> `ConstraintViolationException`, we delete all rows of all necessary tables.
> That way the user gets the log messages of all the erroneous rows in one
> run, and the Oozie database is never left in an inconsistent state where
> some rows of an import are present and others are not
>
>
> Diffs
> -----
>
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java c43223ef05aa702be49565ba2626314628e63749
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION
>   tools/src/test/resources/dumpData/ooziedb_ac.json
>   tools/src/test/resources/dumpData/ooziedb_bna.json
>   tools/src/test/resources/dumpData/ooziedb_bnj.json
>   tools/src/test/resources/dumpData/ooziedb_ca.json
>   tools/src/test/resources/dumpData/ooziedb_cj.json
>   tools/src/test/resources/dumpData/ooziedb_slareg.json
>   tools/src/test/resources/dumpData/ooziedb_slasum.json
>   tools/src/test/resources/dumpData/ooziedb_sysinfo.json
>   tools/src/test/resources/dumpData/ooziedb_wf.json
>   tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION
>   tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION
>   tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION
>   tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/52782/diff/4/
>
>
> Testing
> -------
>
> See `TestDBLoadDump` for further reference.
>
>
> Thanks,
>
> András Piros
>
>
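
For readers following the thread, the batch-then-retry behaviour in the quoted description can be sketched in plain Java. This is only a minimal illustration under assumptions, not the code in `OozieDBImportCLI`: the class `BatchImportSketch`, its `Result` type, and the `commitInOneTx` callback are hypothetical names. The callback stands in for "begin an `EntityTransaction`, persist the rows, commit", and is assumed to throw and store nothing when any row in it is invalid.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of batch import with per-row retry; no JPA involved. */
final class BatchImportSketch {

    /** Outcome of an import run: rows stored and rows that had to be skipped. */
    static final class Result<T> {
        final int imported;
        final List<T> skipped;

        Result(int imported, List<T> skipped) {
            this.imported = imported;
            this.skipped = skipped;
        }
    }

    /**
     * Imports rows in batches of batchSize. commitInOneTx is an assumed
     * stand-in for one transaction: it either stores all rows it is given,
     * or throws a RuntimeException and stores none of them.
     */
    static <T> Result<T> importAll(List<T> rows, int batchSize,
                                   Consumer<List<T>> commitInOneTx) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int imported = 0;
        List<T> skipped = new ArrayList<>();
        for (int from = 0; from < rows.size(); from += batchSize) {
            int to = Math.min(from + batchSize, rows.size());
            List<T> batch = rows.subList(from, to);
            try {
                // Happy path: the whole batch goes into a single transaction.
                commitInOneTx.accept(batch);
                imported += batch.size();
            } catch (RuntimeException batchFailure) {
                // At least one row failed: retry each row in its own transaction.
                for (T row : batch) {
                    try {
                        commitInOneTx.accept(Collections.singletonList(row));
                        imported++;
                    } catch (RuntimeException rowFailure) {
                        // Remember the bad row and keep importing until the end.
                        skipped.add(row);
                    }
                }
            }
        }
        return new Result<>(imported, skipped);
    }
}
```

In this sketch the caller would inspect `skipped` after the run: if it is non-empty, step 3 of the description applies and every imported table is emptied again, which is left out here.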