[ https://issues.apache.org/jira/browse/NIFI-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643447#comment-16643447 ]

ASF GitHub Bot commented on NIFI-5642:
--------------------------------------

Github user mattyb149 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/3051#discussion_r223701761
  
    --- Diff: nifi-nar-bundles/nifi-cassandra-bundle/nifi-cassandra-processors/src/main/java/org/apache/nifi/processors/cassandra/QueryCassandra.java ---
    @@ -192,15 +205,17 @@ public void onScheduled(final ProcessContext context) {
         @Override
         public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
             FlowFile fileToProcess = null;
    +        FlowFile inputFlowFile = null;
             if (context.hasIncomingConnection()) {
    -            fileToProcess = session.get();
    +            inputFlowFile = session.get();
     
                 // If we have no FlowFile, and all incoming connections are self-loops then we can continue on.
                 // However, if we have no FlowFile and we have connections coming from other Processors, then
                 // we know that we should run only if we have a FlowFile.
    -            if (fileToProcess == null && context.hasNonLoopConnection()) {
    +            if (inputFlowFile == null && context.hasNonLoopConnection()) {
                     return;
                 }
    +            session.remove(inputFlowFile);
    --- End diff --
    
    I don't follow the logic here. If there is a flow file in the incoming 
connection, this appears to remove it from the session, and fileToProcess will 
always be null, so we can no longer use flow file attributes to evaluate 
properties such as CQL Query, Query Timeout, and Charset. It also breaks 
provenance/lineage for incoming files: because the incoming file is removed, 
every flow file generated by this processor will appear to have been created 
here, so you can't track that flow file "A" came in and, as a result, produced 
flow files X, Y, and Z.
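
    To make the concern concrete, here is a minimal, self-contained sketch. 
ToyFlowFile and LineageSketch are hypothetical stand-ins invented for this 
illustration; they are not the NiFi API and not the code in the pull request. 
The sketch only models two things: a parent link (lineage) and attributes used 
to resolve a query. The attribute name "cql.query" is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a flow file; NOT the NiFi FlowFile interface.
final class ToyFlowFile {
    final String name;
    final ToyFlowFile parent;              // lineage link; null means "created here"
    final Map<String, String> attributes;

    ToyFlowFile(String name, ToyFlowFile parent, Map<String, String> attributes) {
        this.name = name;
        this.parent = parent;
        this.attributes = attributes;
    }
}

// Hypothetical helper that contrasts the two handling strategies.
final class LineageSketch {

    // Mirrors the shape of the diff: the input is discarded up front and
    // fileToProcess is never assigned, so every output has no parent.
    static List<ToyFlowFile> removeInputFirst(ToyFlowFile input, int pages) {
        ToyFlowFile fileToProcess = null;  // stays null, as in the diff
        List<ToyFlowFile> out = new ArrayList<>();
        for (int i = 0; i < pages; i++) {
            out.add(new ToyFlowFile("page-" + i, fileToProcess, new HashMap<>()));
        }
        return out;
    }

    // Alternative: keep the input around and record it as the parent of
    // every output, copying its attributes to the children.
    static List<ToyFlowFile> keepInputAsParent(ToyFlowFile input, int pages) {
        List<ToyFlowFile> out = new ArrayList<>();
        for (int i = 0; i < pages; i++) {
            Map<String, String> attrs =
                    input == null ? new HashMap<>() : new HashMap<>(input.attributes);
            out.add(new ToyFlowFile("page-" + i, input, attrs));
        }
        return out;
    }

    // Resolves the query from the source's attributes, or falls back when
    // there is no source flow file to read attributes from.
    static String resolveQuery(ToyFlowFile source, String fallback) {
        if (source == null) {
            return fallback;
        }
        return source.attributes.getOrDefault("cql.query", fallback);
    }
}
```

    In the first variant every output has a null parent and resolveQuery can 
only return the fallback; in the second, each output still points back to "A". 
In the real processor I would expect the equivalent to be creating the outputs 
with session.create(parent) and removing or transferring the input only after 
the outputs exist, but treat that as a suggestion rather than a tested patch.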


> QueryCassandra processor : output FlowFiles as soon fetch_size is reached
> -------------------------------------------------------------------------
>
>                 Key: NIFI-5642
>                 URL: https://issues.apache.org/jira/browse/NIFI-5642
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.7.1
>            Reporter: André Gomes Lamas Otero
>            Priority: Major
>
> When I use QueryCassandra with the fetch_size parameter, I expect that as 
> soon as the reader reaches fetch_size, the processor outputs some data to be 
> processed by the next processor. Instead, QueryCassandra reads all the data 
> and only then outputs the flow files.
> I'll start working on a patch for this, and I'd appreciate any 
> suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
