[ https://issues.apache.org/jira/browse/NIFI-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676856#comment-16676856 ]

ASF GitHub Bot commented on NIFI-5788:
--------------------------------------

Github user patricker commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/3128#discussion_r231153599
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java ---
    @@ -669,11 +685,20 @@ private void executeDML(ProcessContext context, ProcessSession session, FlowFile
                             }
                         }
                         ps.addBatch();
    +                    if (++currentBatchSize == batchSize) {
    --- End diff ---
    
    True, I missed that override before, but I see it now. So it's definitely
    less valuable; the only thing it would provide is troubleshooting guidance,
    along the lines of "your bad data is roughly in this part of the file".
    Probably not worth it. Thanks!
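
    For readers following the thread, here is a minimal sketch of the
    chunked-batch pattern the diff introduces. The names currentBatchSize and
    batchSize come from the diff hunk; the helper shape, the record loop, and
    the bindParameters call are illustrative assumptions, not the actual
    PutDatabaseRecord code (imports assumed: java.sql.PreparedStatement,
    java.sql.SQLException, java.util.List,
    org.apache.nifi.serialization.record.Record):

        static void executeInChunks(PreparedStatement ps, List<Record> records,
                                    int batchSize) throws SQLException {
            int currentBatchSize = 0;
            for (Record record : records) {
                bindParameters(ps, record);    // hypothetical binding helper
                ps.addBatch();
                // With batchSize == 0 the counter never matches, so all rows
                // stay in a single batch (the pre-change behavior).
                if (++currentBatchSize == batchSize) {
                    ps.executeBatch();         // flush the accumulated chunk
                    currentBatchSize = 0;
                }
            }
            if (currentBatchSize > 0) {
                ps.executeBatch();             // flush the trailing partial batch
            }
        }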


> Introduce batch size limit in PutDatabaseRecord processor
> ---------------------------------------------------------
>
>                 Key: NIFI-5788
>                 URL: https://issues.apache.org/jira/browse/NIFI-5788
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>         Environment: Teradata DB
>            Reporter: Vadim
>            Priority: Major
>              Labels: pull-request-available
>
> Certain JDBC drivers do not support unlimited batch sizes in INSERT/UPDATE 
> prepared SQL statements. Specifically, the Teradata JDBC driver 
> ([https://downloads.teradata.com/download/connectivity/jdbc-driver]) fails the 
> SQL statement when the batch overflows its internal limits.
> Dividing the data into smaller chunks before PutDatabaseRecord is applied can 
> work around the issue in certain scenarios, but in general this solution is 
> imperfect because the SQL statements would be executed in different 
> transaction contexts, so data integrity would not be preserved.
> The suggested solution is the following (a rough sketch follows below):
>  * introduce a new optional property in the *PutDatabaseRecord* processor, 
> *max_batch_size*, which defines the maximum batch size for an INSERT/UPDATE 
> statement; the default value of zero (INFINITY) preserves the old behavior
>  * divide the input into batches of the specified size and invoke 
> PreparedStatement.executeBatch() for each batch
> Pull request: [https://github.com/apache/nifi/pull/3128]
>  
> [EDIT] Changed batch_size to max_batch_size. The default value would be zero 
> (INFINITY) 
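
As a rough illustration of the proposed property (the sketch referenced above),
an optional NiFi property with a zero default could be declared along these
lines. The property name, display name, and description here are guesses, not
necessarily what the pull request uses (imports assumed:
org.apache.nifi.components.PropertyDescriptor,
org.apache.nifi.processor.util.StandardValidators):

    static final PropertyDescriptor MAX_BATCH_SIZE = new PropertyDescriptor.Builder()
            .name("put-db-record-max-batch-size")   // guessed identifier
            .displayName("Maximum Batch Size")
            .description("Maximum number of records per JDBC batch; "
                    + "zero means unlimited (the previous behavior).")
            .defaultValue("0")
            .required(false)
            .addValidator(StandardValidators.NON_NEGATIVE_INTEGER_VALIDATOR)
            .build();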



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
