Madhu,

The example from my blog post shows how to overwrite flow content: it reads
the content from an input stream, processes it, and writes the result back
out to an output stream.  If you only need to read the incoming flow file
and add some attributes, you can use session.read() instead of
session.write(). In Jython the callback might look something like this:

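from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import InputStreamCallback

class PyReadStreamCallback(InputStreamCallback):
  def __init__(self):
    pass
  def process(self, inputStream):
    # Read the entire flow file content as a UTF-8 string
    text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
    # Do your parsing here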

Note that the stream callback methods do not have a reference to the
ProcessSession, so you may want to create a dictionary for the attributes
to be added and pass it into the PyReadStreamCallback constructor. Then
process() would add the attribute name/value pairs to the dictionary, and
after you call session.read() in the main script, you can add all the
attributes from the dictionary to the flow file.
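
As a rough sketch of the parsing step only, assuming the flow content is a
single URL-encoded string like the one in your message: the helper name
collect_attributes and the shortened sample string are made up for
illustration, and parse_qsl comes from the Python standard library, not
from NiFi.

```python
try:
    from urlparse import parse_qsl        # Jython / Python 2.7
except ImportError:
    from urllib.parse import parse_qsl    # Python 3, for testing outside NiFi


def collect_attributes(text, attrs):
    """Decode a URL-encoded string and add its key/value pairs to attrs."""
    # keep_blank_values=True keeps keys with empty values, such as "r2=".
    for key, value in parse_qsl(text.strip(), keep_blank_values=True):
        if key:  # skip pairs that have no key
            attrs[key] = value
    return attrs


# Shortened, hypothetical sample based on the string in the question.
sample = "rt.start=navigation&t_resp=21&t_other=t_domloaded%7C364&r2=&v=0.9&vis.st=visible"
attrs = collect_attributes(sample, {})
```

Inside process() you would call something like collect_attributes(text,
self.attrs), where self.attrs is the dictionary handed to the constructor.
After session.read() returns, the main script could loop over the
dictionary and call session.putAttribute(flowFile, key, value) for each
entry, reassigning flowFile each time.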

The rest of the script will likely be similar to the blog post's script.
Note that there is no "outputStream" passed in (PyReadStreamCallback is a
subclass of InputStreamCallback, not StreamCallback), so there is no
"outputStream.write()" call in the process() method or anywhere else in the
script.

You may find another blog post helpful:
http://funnifi.blogspot.com/2016/02/executescript-explained-split-fields.html
Although it uses Groovy as the language, it also explains some of the NiFi
Java API, at least the part that deals with reading/writing flow files,
immutable flow file references, etc.

Let me know if this works for you and/or if you have other questions or
issues.

Cheers,
Matt

On Thu, Mar 24, 2016 at 8:42 AM, Madhukar Thota <madhukar.th...@gmail.com>
wrote:

> Hi Matt,
>
> Do you have an example on how to use ExecuteScript on flowContent?
>
> I have the following URL-encoded string as flow content, and I would
> like to use Python to parse it into flow file attributes based on its
> key/value pairs.
>
>
> rt.start=navigation&rt.tstart=1458797018682&rt.bstart=1458797019033&rt.end=1458797019075&t_resp=21&t_page=372&t_done=393&t_other=t_domloaded%7C364&r=http%3A%2F%2Flocalhost%3A63342%2FBeacon%2Ftest.html&r2=&u=http%3A%2F%2Flocalhost%3A63342%2FBeacon%2Ftest.html&v=0.9&
> vis.st=visible
>
> -Madhu
>
> On Thu, Mar 24, 2016 at 12:34 AM, Madhukar Thota <madhukar.th...@gmail.com
> > wrote:
>
>> Hi Matt,
>>
>> Thank you for the input. I updated my config as you suggested and it
>> worked like a charm. Also, a big thank you for the nice article; I used
>> it as a reference when I started exploring ExecuteScript.
>>
>>
>> Thanks
>> Madhu
>>
>>
>>
>> On Thu, Mar 24, 2016 at 12:18 AM, Matt Burgess <mattyb...@gmail.com>
>> wrote:
>>
>>> Madhukar,
>>>
>>> Glad to hear you found a solution, I was just replying when your email
>>> came in.
>>>
>>> Although in ExecuteScript you have chosen "python" as the script engine,
>>> it is actually Jython that is being used to interpret the scripts, not your
>>> installed version of Python.  The first line (shebang) is ignored as it is
>>> a comment in Python/Jython.
>>>
>>> Modules installed with pip are not automatically available to the Jython
>>> engine, but if the modules are pure Python code (rather than native C /
>>> CPython), like user_agents is, you can import them one of two equivalent
>>> ways:
>>>
>>> 1) The way you have done, using sys.path.append.  I should mention that
>>> "import sys" is done for you so you can safely leave that out if you wish.
>>> 2) Add the path to the packages ('/usr/local/lib/python2.7/site-packages')
>>> to the Module Path property of the ExecuteScript processor. In this case
>>> the processor effectively does Option #1 for you.
>>>
>>> I was able to get your script to work but had to force the result of
>>> parse (a UserAgent object) into a string, so I wrapped it in str:
>>>
>>> str(parse(flowFile.getAttribute('http.headers.User-Agent')).browser)
>>>
>>> You're definitely on the right track :)  For another Jython example with
>>> ExecuteScript, check out this post on my blog:
>>> http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html
>>>
>>> I am new to Python as well, but am happy to help if I can with any
>>> issues you run into, as it will help me learn more as well :)
>>>
>>> Regards,
>>> Matt
>>>
>>>
>>> On Thu, Mar 24, 2016 at 12:10 AM, Madhukar Thota <
>>> madhukar.th...@gmail.com> wrote:
>>>
>>>> I was able to solve the python modules issues by adding the following
>>>> lines:
>>>>
>>>> import sys
>>>> # Path where my modules are installed.
>>>> sys.path.append('/usr/local/lib/python2.7/site-packages')
>>>>
>>>> Now the issue I have is how to parse the incoming attributes correctly
>>>> with this library and get the new fields. I am fairly new to Python,
>>>> and this is my first attempt at using Python with NiFi.
>>>>
>>>> Any help is appreciated.
>>>>
>>>>
>>>>
>>>> On Wed, Mar 23, 2016 at 11:31 PM, Madhukar Thota <
>>>> madhukar.th...@gmail.com> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> I am trying to use the following script to parse the
>>>>> http.headers.User-Agent attribute with the Python user_agents module,
>>>>> using the ExecuteScript processor.
>>>>>
>>>>> Script:
>>>>>
>>>>> #!/usr/bin/env python2.7
>>>>> from user_agents import parse
>>>>>
>>>>> flowFile = session.get()
>>>>> if (flowFile != None):
>>>>>   flowFile = session.putAttribute(flowFile, "browser",
>>>>> parse(flowFile.getAttribute('http.headers.User-Agent')).browser)
>>>>>   session.transfer(flowFile, REL_SUCCESS)
>>>>>
>>>>>
>>>>> But the ExecuteScript processor complains about a missing Python
>>>>> module, even though the modules are already installed with pip and
>>>>> tested outside NiFi. How can I add or reference these modules in NiFi?
>>>>>
>>>>> Error:
>>>>>
>>>>> 23:28:03 EDT
>>>>> ERROR
>>>>> af354413-9866-4557-808a-7f3a84353597
>>>>> ExecuteScript[id=af354413-9866-4557-808a-7f3a84353597] Failed to
>>>>> process session due to
>>>>> org.apache.nifi.processor.exception.ProcessException:
>>>>> javax.script.ScriptException: ImportError: No module named user_agents in
>>>>> <script> at line number 2:
>>>>> org.apache.nifi.processor.exception.ProcessException:
>>>>> javax.script.ScriptException: ImportError: No module named user_agents in
>>>>> <script> at line number 2
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
