GitHub user jvwing commented on the issue:

    https://github.com/apache/nifi/pull/929
  
    Thanks for those improvements, @jdye64, I especially like the updated usage 
doc.  Two things on the latest code:
    
    1. Did you try an .xls file?  There is a problem when the flowfile 
attribute is added in the catch block on line ~195.  The NiFi framework throws 
an exception of its own, because we can't call `session.putAttribute` inside an 
InputStreamCallback for the same flowfile:
    
    > java.lang.IllegalStateException: 
StandardFlowFileRecord[uuid=be192381-9475-4c6d-a6ca-43735e5df271,claim=StandardContentClaim
 [resourceClaim=StandardResourceClaim[id=1489713904793-1, container=default, 
section=1], offset=0, 
length=26112],offset=0,name=./conf/test-xls.xls,size=26112] already in use for 
an active callback or InputStream created by ProcessSession.read(FlowFile) has 
not been closed
    
    Something similar happens with the `session.putAttribute` on line ~209.  As a 
result of these exceptions, the session is rolled back and the flowfile is 
returned to the input queue.  I think we can throw an exception from inside the 
callback, though, so if we catch and rethrow with a different error message, it 
should work out.
    
    2. In the failure case, we're routing the flowfile to both 'failure' and 
'original'.  I didn't realize it earlier, but I now believe this to be unusual 
in NiFi.  Most processors treat failure as an exclusive route, and 'original' 
as part of the successful happy path.  SplitAvro, SplitJson, SplitText, and 
UnpackContent were some examples I looked at.  I doubt that's written in stone. 
 What do you think?
    
    I made a [sample code 
fork](https://github.com/jvwing/nifi/commit/2ccf5dec2dcd707c5963716dfb3fbf7813c460ea)
 with a unit test for .xls and a suggested approach to solving the 
IllegalStateExceptions, and the failure routing.  I did not get the logging to 
cooperate the way I think it should, but we're not too far off.
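
    To illustrate the two points above without pulling in NiFi itself, here is a 
rough, self-contained sketch.  It is not the code from the fork.  `SketchSession`, 
`ReadCallback`, `parse`, and the `sketch.error` attribute name are hypothetical 
stand-ins that only mimic the rule that attributes can't be changed while a read 
callback is active.  The callback rethrows with a clearer message, the attribute 
is set only after the read has finished, and 'failure' is an exclusive route:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins, NOT NiFi classes. They only imitate the
// "no attribute changes during an active read callback" rule.
class CallbackSketch {

    interface ReadCallback {
        void process(byte[] content) throws Exception;
    }

    static class SketchSession {
        final Map<String, String> attributes = new HashMap<>();
        String route;                    // relationship the flowfile went to
        private boolean reading = false; // true while a read callback runs

        void read(byte[] content, ReadCallback callback) throws Exception {
            reading = true;
            try {
                callback.process(content);
            } finally {
                reading = false;
            }
        }

        void putAttribute(String key, String value) {
            if (reading) {
                throw new IllegalStateException("already in use for an active callback");
            }
            attributes.put(key, value);
        }

        void transfer(String relationship) {
            route = relationship;
        }
    }

    // Placeholder for the real spreadsheet parsing; fails on empty input.
    static void parse(byte[] bytes) {
        if (bytes.length == 0) {
            throw new IllegalArgumentException("empty input");
        }
    }

    static SketchSession convert(byte[] content) {
        SketchSession session = new SketchSession();
        try {
            session.read(content, bytes -> {
                try {
                    parse(bytes);
                } catch (Exception e) {
                    // Don't touch attributes here; rethrow with a clearer message.
                    throw new Exception("Could not parse spreadsheet: " + e.getMessage(), e);
                }
            });
            // Happy path only: 'original' is not combined with 'failure'.
            session.transfer("original");
        } catch (Exception e) {
            // The read callback has finished, so updating attributes is allowed.
            session.putAttribute("sketch.error", e.getMessage());
            session.transfer("failure");
        }
        return session;
    }

    public static void main(String[] args) {
        System.out.println(convert(new byte[] {1, 2, 3}).route);
        System.out.println(convert(new byte[0]).route);
    }
}
```

    In the real processor the same shape would apply to the `session.read` call, 
with the error attribute and the routing decision handled after the callback 
returns or throws.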

