[ https://issues.apache.org/jira/browse/NIFI-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675386#comment-16675386 ]

ASF GitHub Bot commented on NIFI-5788:
--------------------------------------

Github user mattyb149 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/3128#discussion_r230812123
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java ---
    @@ -265,6 +265,17 @@
                 .expressionLanguageSupported(ExpressionLanguageScope.VARIABLE_REGISTRY)
                 .build();
     
    +    static final PropertyDescriptor BATCH_SIZE = new PropertyDescriptor.Builder()
    +            .name("put-db-record-batch-size")
    +            .displayName("Bulk Size")
    +            .description("Specifies batch size for INSERT and UPDATE statements. This parameter has no effect for other statements specified in 'Statement Type'."
    +                    + " Non-positive value has the effect of infinite bulk size.")
    +            .defaultValue("-1")
    --- End diff ---
    
    What does a value of zero do? Would anyone ever use it? If not, perhaps 
zero is the best default to indicate infinite bulk size. If you do change it to 
zero, please change the validator to a NON_NEGATIVE_INTEGER_VALIDATOR to match.


> Introduce batch size limit in PutDatabaseRecord processor
> ---------------------------------------------------------
>
>                 Key: NIFI-5788
>                 URL: https://issues.apache.org/jira/browse/NIFI-5788
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>         Environment: Teradata DB
>            Reporter: Vadim
>            Priority: Major
>              Labels: pull-request-available
>
> Certain JDBC drivers do not support unlimited batch sizes in INSERT/UPDATE 
> prepared SQL statements. Specifically, the Teradata JDBC driver 
> ([https://downloads.teradata.com/download/connectivity/jdbc-driver]) fails the 
> SQL statement when the batch overflows its internal limits.
> Splitting the data into smaller chunks before PutDatabaseRecord is applied can 
> work around the issue in certain scenarios, but in general this solution is 
> imperfect, because the SQL statements would be executed in different 
> transaction contexts and data integrity would not be preserved.
> The suggested solution is the following:
>  * introduce a new optional property in the *PutDatabaseRecord* processor, 
> *batch_size*, which defines the maximum size of the bulk in an INSERT/UPDATE 
> statement; its default value of -1 (infinity) preserves the old behavior
>  * divide the input into batches of the specified size and invoke 
> PreparedStatement.executeBatch() for each batch (see the sketch below)
> Pull request: [https://github.com/apache/nifi/pull/3128]
>  
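
As an illustration of the batching described in the second bullet above, here is 
a minimal, self-contained JDBC sketch. The helper name and signature are 
hypothetical, not taken from the PR; a non-positive batchSize stands for "no 
limit", matching the -1 default:

    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    class BatchingSketch {
        // Hypothetical helper, not PutDatabaseRecord's actual code: bind each
        // row to the PreparedStatement and flush with executeBatch() every
        // batchSize rows. A non-positive batchSize means unlimited, i.e. a
        // single executeBatch() call at the end (the old behavior).
        static void executeInBatches(PreparedStatement ps, List<Object[]> rows,
                                     int batchSize) throws SQLException {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]);  // JDBC parameters are 1-based
                }
                ps.addBatch();
                pending++;
                if (batchSize > 0 && pending >= batchSize) {
                    ps.executeBatch();  // flush a full chunk
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();      // flush the remainder
            }
        }
    }

Because every chunk runs on the same connection, all executeBatch() calls can 
share one transaction (with autocommit disabled), which is what makes this 
approach preferable to splitting the data upstream of the processor.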


