Thank you Andy, thank you again Joe. I'll rethink my approach based on your
recommendations. -Jim
On Fri, Nov 3, 2017 at 1:31 PM, Andy LoPresto wrote:
> James,
>
> I am not a Python expert, so I’m glad other people could weigh in. As far
> as routing on content type, I agree with Joe’s sentimen
James,
I am not a Python expert, so I’m glad other people could weigh in. As far as
routing on content type, I agree with Joe’s sentiment that IdentifyMimeType and
RouteOnAttribute are the correct solutions there. You can route on a range of
input options (the actual type, detected charset, etc
Mime type detection can be difficult business but I trust Apache Tika
to do a far better job than I ever could. The result you show for
JSON appears correct and I'd simply add that string to the list of
routing attributes that i treat as text. Or I'd key off the charset
being being provided as th
I've always found that IdentifyMimeType returns a wide, wide range of
values for mime.type. There is often ambiguity that mime.type is a reliable
indicator of the nature of the content. To illustrate, I've passed file.txt
into Nifi that contains a string representation of json. I'd expect this to
b
Jim,
1) The line
text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
is for reading the whole content in as a UTF-8 encoded string. The
inputStream itself deals in bytes [1] , so you could use available()
and read() to get the binary data.
2) If you are not dealing in strings/text, you
"How can discern binary or character content using conditional checks
to be sure I handle the file properly?"
Use NiFi and the existing processors where able and extend/script only
where necessary/critical. For the case you mention use
IdentifyMimeType and route appropriate data to the appropriat
Andy, regarding the the code sample you offered above - doesn't this put
into text both the attributes metadata and the payload of the flowfile?
If that is the case, how does one modify that to read in from the stream
into variable text only the file payload?
On Fri, Nov 3, 2017 at 5:48 AM, James
Thank you Andy. I'd like to ask just a few quick follow up questions.
1- My flow content may be textual characters, and it can also be binary -
jpgs, pngs, and similar. How can discern binary or character content using
conditional checks to be sure I handle the file properly? How would I alter
thi
James,
The Python API should be the same as the Java FlowFile.java interface [1]. Matt
Burgess’ blog has a good post about using Jython to do flowfile content
manipulation. Something like:
flowFile = session.get()
if (flowFile != None):
flowFile = session.write(flowFile,PyStreamCallback())
In python, I can use the requests library to post content something like
htis:
import requests
url="https://abc.test.org";
files={'file':open('/somedir/myfile.txt','rb')}
r = requests.post(url,files=files)
If I am in a python stream callback, how can I read the flowfile payload in
the same way th
10 matches
Mail list logo