Re: Reading flowfile in a stream callback

2017-11-08 Thread James McMahon
Thank you Andy, thank you again Joe. I'll rethink my approach based on your recommendations.
-Jim

On Fri, Nov 3, 2017 at 1:31 PM, Andy LoPresto wrote:
> James,
>
> I am not a Python expert, so I’m glad other people could weigh in. As far
> as routing on content type, I agree with Joe’s sentimen

Re: Reading flowfile in a stream callback

2017-11-03 Thread Andy LoPresto
James, I am not a Python expert, so I’m glad other people could weigh in. As far as routing on content type, I agree with Joe’s sentiment that IdentifyMimeType and RouteOnAttribute are the correct solutions there. You can route on a range of input options (the actual type, detected charset, etc

Re: Reading flowfile in a stream callback

2017-11-03 Thread Joe Witt
Mime type detection can be difficult business but I trust Apache Tika to do a far better job than I ever could. The result you show for JSON appears correct and I'd simply add that string to the list of routing attributes that I treat as text. Or I'd key off the charset being provided as th
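The decision Joe describes (keep a list of mime.type values treated as text, or key off a detected charset) can be sketched in plain Python. This is only an illustration of the logic: in a NiFi flow it would normally live in RouteOnAttribute after IdentifyMimeType rather than in a script. The function name `treat_as_text`, the contents of `TEXT_MIME_TYPES`, and the rule that any detected charset implies text are all assumptions made for the example.

```python
# Hypothetical list of MIME types to treat as text (an assumption for this
# sketch); extend it as new mime.type values turn up in your own flows.
TEXT_MIME_TYPES = {"application/json", "application/xml"}


def treat_as_text(mime_type, charset=None):
    """Return True when the content should be handled as character data."""
    if charset:
        # Assumption: a detected charset is taken to mean textual content.
        return True
    if mime_type is None:
        return False
    # Drop any parameters such as "; charset=UTF-8" before comparing.
    base = mime_type.split(";", 1)[0].strip().lower()
    return base.startswith("text/") or base in TEXT_MIME_TYPES
```

With this sketch, `treat_as_text("application/json")` is true while `treat_as_text("image/png")` is false, so a new textual type only needs to be added to the set.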

Re: Reading flowfile in a stream callback

2017-11-03 Thread James McMahon
I've always found that IdentifyMimeType returns a wide, wide range of values for mime.type. It is often unclear whether mime.type is a reliable indicator of the nature of the content. To illustrate, I've passed file.txt into NiFi that contains a string representation of JSON. I'd expect this to b

Re: Reading flowfile in a stream callback

2017-11-03 Thread Matt Burgess
Jim,

1) The line

    text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)

is for reading the whole content in as a UTF-8 encoded string. The inputStream itself deals in bytes [1], so you could use available() and read() to get the binary data.

2) If you are not dealing in strings/text, you
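The difference between decoding the whole content as UTF-8 and working with the raw bytes can be sketched outside NiFi in plain Python. This is an illustration only: it uses `io.BytesIO` as a stand-in for the stream, whereas inside a script callback `inputStream` is a Java InputStream whose `read()` behaves differently. The helper names `read_payload` and `payload_as_text` and the sample payloads are hypothetical.

```python
import io


def read_payload(stream, chunk_size=8192):
    """Read every byte from a binary file-like object, chunk by chunk."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes signals end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)


def payload_as_text(stream, encoding="utf-8"):
    """Read the whole payload and decode it; only valid for textual content."""
    return read_payload(stream).decode(encoding)


# Stand-ins for flowfile content (assumption: these are not real NiFi streams).
text_stream = io.BytesIO(u'{"name": "caf\u00e9"}'.encode("utf-8"))
binary_stream = io.BytesIO(b"\x89PNG\r\n\x1a\n\x00\x01")

text = payload_as_text(text_stream)  # decoded character data
raw = read_payload(binary_stream)    # untouched bytes, safe for images
```

Binary content such as the PNG header above should stay as bytes; decoding it as UTF-8 would raise an error or corrupt the data.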

Re: Reading flowfile in a stream callback

2017-11-03 Thread Joe Witt
"How can discern binary or character content using conditional checks to be sure I handle the file properly?" Use NiFi and the existing processors where able and extend/script only where necessary/critical. For the case you mention use IdentifyMimeType and route appropriate data to the appropriat

Re: Reading flowfile in a stream callback

2017-11-03 Thread James McMahon
Andy, regarding the code sample you offered above - doesn't this put into text both the attributes metadata and the payload of the flowfile? If that is the case, how does one modify that to read in from the stream into variable text only the file payload?

On Fri, Nov 3, 2017 at 5:48 AM, James

Re: Reading flowfile in a stream callback

2017-11-03 Thread James McMahon
Thank you Andy. I'd like to ask just a few quick follow-up questions. 1- My flow content may be textual characters, and it can also be binary - jpgs, pngs, and similar. How can I discern binary or character content using conditional checks to be sure I handle the file properly? How would I alter thi

Re: Reading flowfile in a stream callback

2017-11-02 Thread Andy LoPresto
James,

The Python API should be the same as the Java FlowFile.java interface [1]. Matt Burgess’ blog has a good post about using Jython to do flowfile content manipulation. Something like:

    flowFile = session.get()
    if flowFile is not None:
        flowFile = session.write(flowFile, PyStreamCallback())

Reading flowfile in a stream callback

2017-11-02 Thread James McMahon
In Python, I can use the requests library to post content something like this:

    import requests
    url = "https://abc.test.org"
    files = {'file': open('/somedir/myfile.txt', 'rb')}
    r = requests.post(url, files=files)

If I am in a python stream callback, how can I read the flowfile payload in the same way th