GitHub user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1115#discussion_r86138383
  
    --- Diff: nifi-commons/nifi-processor-utilities/src/main/java/org/apache/nifi/processor/util/bin/BinFiles.java ---
    @@ -273,25 +262,26 @@ private int binFlowFiles(final ProcessContext context, final ProcessSessionFacto
                 }
     
                 final ProcessSession session = sessionFactory.createSession();
    -            FlowFile flowFile = session.get();
    -            if (flowFile == null) {
    +            final List<FlowFile> flowFiles = session.get(1000);
    +            if (flowFiles.isEmpty()) {
                     break;
                 }
     
    -            flowFile = this.preprocessFlowFile(context, session, flowFile);
    -
    -            String groupId = this.getGroupId(context, flowFile);
    -
    -            final boolean binned = binManager.offer(groupId, flowFile, session);
    -
    -            // could not be added to a bin -- probably too large by itself, so create a separate bin for just this guy.
    -            if (!binned) {
    -                Bin bin = new Bin(0, Long.MAX_VALUE, 0, Integer.MAX_VALUE, null);
    -                bin.offer(flowFile, session);
    -                this.readyBins.add(bin);
    +            final Map<String, List<FlowFile>> flowFileGroups = new HashMap<>();
    +            for (FlowFile flowFile : flowFiles) {
    +                flowFile = this.preprocessFlowFile(context, session, flowFile);
    +                final String groupingIdentifier = getGroupId(context, flowFile);
    +                flowFileGroups.computeIfAbsent(groupingIdentifier, id -> new ArrayList<>()).add(flowFile);
                 }
     
    -            flowFilesBinned++;
    +            for (final Map.Entry<String, List<FlowFile>> entry : flowFileGroups.entrySet()) {
    +                final Set<FlowFile> unbinned = binManager.offer(entry.getKey(), entry.getValue(), session, sessionFactory);
    +                for (final FlowFile flowFile : unbinned) {
    +                    Bin bin = new Bin(session, 0, Long.MAX_VALUE, 0, Integer.MAX_VALUE, null);
    +                    bin.offer(flowFile, session);
    +                    this.readyBins.add(bin);
    +                }
    +            }
    --- End diff --
    
    Not exactly. The loop above says, "if the bin manager didn't bin this
    FlowFile, for whatever reason, create our own one-element bin and process
    it by adding it to this.readyBins; nothing else will go into that bin."
    BinManager:201, by contrast, says, "if the FlowFile didn't fit in any of
    the available bins, create a new bin and add this FlowFile to it;
    subsequent FlowFiles may then go into that bin."
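
To make the distinction concrete, here is a minimal, self-contained sketch of the two behaviours. It does not use the NiFi API: `SimpleBin`, `toReadyBin`, and `offerToOpenBins` are hypothetical stand-ins invented for illustration, and the entry-count limit is an assumed simplification of the real bin constraints.

```java
import java.util.ArrayList;
import java.util.List;

class BinningSketch {

    /** Hypothetical stand-in for a bin: accepts items until it holds maxEntries. */
    static final class SimpleBin {
        private final int maxEntries;
        private final List<String> contents = new ArrayList<>();

        SimpleBin(final int maxEntries) {
            this.maxEntries = maxEntries;
        }

        /** Returns true if the item was accepted, false if the bin is full. */
        boolean offer(final String item) {
            if (contents.size() >= maxEntries) {
                return false;
            }
            contents.add(item);
            return true;
        }

        List<String> getContents() {
            return contents;
        }
    }

    /**
     * Behaviour of the loop in the diff: an item the manager could not bin
     * gets its own bin, which goes straight onto the ready list. Because it
     * is never placed among the open bins, nothing else is offered to it.
     */
    static SimpleBin toReadyBin(final String item, final List<SimpleBin> readyBins) {
        final SimpleBin bin = new SimpleBin(Integer.MAX_VALUE);
        bin.offer(item);
        readyBins.add(bin);
        return bin;
    }

    /**
     * Behaviour described for the bin manager: try every open bin first; if
     * none accepts the item, create a new bin that stays open, so later
     * items may still be added to it.
     */
    static SimpleBin offerToOpenBins(final String item, final List<SimpleBin> openBins, final int maxEntries) {
        for (final SimpleBin bin : openBins) {
            if (bin.offer(item)) {
                return bin;
            }
        }
        final SimpleBin fresh = new SimpleBin(maxEntries);
        fresh.offer(item);
        openBins.add(fresh);
        return fresh;
    }

    public static void main(final String[] args) {
        final List<SimpleBin> readyBins = new ArrayList<>();
        toReadyBin("a", readyBins);
        toReadyBin("b", readyBins);
        System.out.println("ready bins: " + readyBins.size());

        final List<SimpleBin> openBins = new ArrayList<>();
        offerToOpenBins("a", openBins, 2);
        offerToOpenBins("b", openBins, 2);
        offerToOpenBins("c", openBins, 2);
        System.out.println("open bins: " + openBins.size());
    }
}
```

In the first case every item ends up alone in its own bin. In the second, "a" and "b" share one bin and "c" opens a second bin that can still accept another item.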

