[
https://issues.apache.org/jira/browse/OOZIE-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510138#comment-15510138
]
Andras Piros commented on OOZIE-2662:
-------------------------------------
The [*current
patch*|https://issues.apache.org/jira/secure/attachment/12829584/OOZIE-2662.002.wip.patch]
addresses duplicate entries by skipping only the rows that violate the primary
key constraint enforced by {{@Id}}. Please see
{{TestDBLoadDump.testSecondImportDoesNotImportDuplicates()}} for details.
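To make the current behavior concrete, here is a minimal, hypothetical sketch of the skip-on-duplicate idea (the class and method names are illustrative, not the actual Oozie import code): each row is persisted in its own transaction, and a row whose commit fails is simply dropped.
{code:java}
// Hypothetical sketch; not the actual Oozie import code.
import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.RollbackException;

public class SkippingImporter {
    private final EntityManagerFactory emf;

    public SkippingImporter(EntityManagerFactory emf) {
        this.emf = emf;
    }

    /** Persists each row in its own transaction; rows whose commit fails are skipped. */
    public <T> int importRows(List<T> rows) {
        int skipped = 0;
        for (T row : rows) {
            EntityManager em = emf.createEntityManager();
            try {
                em.getTransaction().begin();
                em.persist(row);
                em.getTransaction().commit(); // a primary key violation surfaces here
            } catch (RollbackException re) {
                skipped++;                    // the duplicate row is dropped, the import goes on
            } finally {
                if (em.getTransaction().isActive()) {
                    em.getTransaction().rollback();
                }
                em.close();
            }
        }
        return skipped;
    }
}
{code}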
The only problem is that rows with any other type of constraint violation get exactly the same treatment: the violating rows are silently skipped while the rest of the import proceeds. Please see
{{TestDBLoadDump.testImportSkipsRowsContainingInvalidData()}} for details.
Using OpenJPA we cannot distinguish between the different types of constraint
violations (due to {{@Id}} or {{@Length}}, for example): OpenJPA wraps both
inside the very same {{RollbackException}} through the very same mechanism.
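A minimal sketch of why the two cases look identical at the JPA level (the entity is hypothetical, and the length constraint is approximated here with {{@Column(length = ...)}}): persisting a row with a duplicate id and persisting a row with an over-long column both end in the same {{RollbackException}}.
{code:java}
// Hypothetical entity, only to show the indistinguishable failure modes.
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.RollbackException;

@Entity
class DemoRow {
    @Id
    String id;            // a duplicate value violates the primary key constraint

    @Column(length = 10)
    String payload;       // a value longer than 10 characters violates the length constraint
}

class ConstraintDemo {
    static boolean tryPersist(EntityManager em, DemoRow row) {
        try {
            em.getTransaction().begin();
            em.persist(row);
            em.getTransaction().commit();
            return true;
        } catch (RollbackException re) {
            // Reached for the duplicate id *and* for the over-long payload;
            // the wrapping exception gives no portable way to tell them apart.
            if (em.getTransaction().isActive()) {
                em.getTransaction().rollback();
            }
            return false;
        }
    }
}
{code}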
So the question, *[~rkanter]* and *[~jaydeepvishwakarma]*, is: should we keep
this skipping behavior, or should we stop skipping and instead halt the whole
import process on both duplicate and otherwise violating rows?
> DB migration fails if DB is too big
> -----------------------------------
>
> Key: OOZIE-2662
> URL: https://issues.apache.org/jira/browse/OOZIE-2662
> Project: Oozie
> Issue Type: Bug
> Reporter: Peter Cseh
> Assignee: Andras Piros
> Attachments: OOZIE-2662.001.patch, OOZIE-2662.002.wip.patch
>
>
> The initial version of the DB import tool commits all the workflows, actions,
> etc. in one huge commit. If it does not fit into memory, an OOME is thrown.
> We should commit every 1k or 10k elements to prevent this.
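A hedged sketch of the batched-commit idea from the description above (the class name and batch size are illustrative, not Oozie's actual import tool): commit every N rows and clear the persistence context, so the whole dump never has to be held in memory at once.
{code:java}
// Illustrative only; the class name and batch size are assumptions.
import java.util.List;
import javax.persistence.EntityManager;

class BatchedImporter {
    private static final int BATCH_SIZE = 1000; // e.g. 1k; 10k would work the same way

    static <T> void importInBatches(EntityManager em, List<T> rows) {
        em.getTransaction().begin();
        int count = 0;
        for (T row : rows) {
            em.persist(row);
            if (++count % BATCH_SIZE == 0) {
                em.getTransaction().commit(); // flush this batch to the database
                em.clear();                   // detach managed entities to free memory
                em.getTransaction().begin();
            }
        }
        em.getTransaction().commit();         // commit the final, possibly partial batch
    }
}
{code}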