Tom,

Could you use a LogAttribute processor to log the value of your 
“date_final” variable?

I tested your code with Jython; with the input string “181231” it works as 
expected (the result is the 1st week of 2019).
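
For anyone who wants to repeat the check above outside NiFi, here is a minimal sketch in plain Python. The helper name iso_year_and_week is made up for illustration; only the '%y%m%d' format and the isocalendar() call are taken from the script in this thread.

```python
from datetime import datetime


def iso_year_and_week(date_text):
    """Parse a yymmdd string and return (ISO year, ISO week number)."""
    parsed = datetime.strptime(date_text, "%y%m%d")
    # isocalendar() yields (ISO year, ISO week number, ISO weekday).
    iso = parsed.date().isocalendar()
    return iso[0], iso[1]


print(iso_year_and_week("181231"))  # (2019, 1)
```

31 December 2018 is a Monday, so it already belongs to ISO week 1 of 2019.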

From: Tomislav Novosel <to.novo...@gmail.com>
Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
Date: Wednesday, 30 January 2019 at 11:10
To: "users@nifi.apache.org" <users@nifi.apache.org>
Subject: Re: Modify Flowfile attributes

Yes, the values are correct. The attribute has the expected value,
i.e. for date 181231 in the filename I get the value 18231 for the attribute 
week_extracted, which is extracted from the filename with the split method.

Tom.

On Wed, 30 Jan 2019 at 10:59, Arpad Boda <ab...@hortonworks.com> wrote:
Hi Tom,

“that is exactly what I tried and date_final or date_file are applied to the 
attribute of outgoing flowfile, it works.”

It works as they are strings, so not working would be a surprise. The question 
is: what are their values? 😊

Regards,
Arpad

From: Tomislav Novosel <to.novo...@gmail.com>
Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
Date: Wednesday, 30 January 2019 at 10:53
To: "users@nifi.apache.org" <users@nifi.apache.org>
Subject: Re: Modify Flowfile attributes

Hi Arpad,

that is exactly what I tried and date_final or date_file are applied to the 
attribute of outgoing flowfile, it works.
But if I put week_att into the attribute, there is an error: week_att cannot be 
coerced as String, and if I put str_week, it gives me week number 44.

Tom

On Wed, 30 Jan 2019 at 08:40, Arpad Boda <ab...@hortonworks.com> wrote:
Tom,

The Python code to get the week number for a datetime string seems to be 
correct.

To help with debugging, could you stamp your “date_final” or “date_file” 
variable onto an attribute, so we can see what the input is?
My gut feeling says some parsing magic is going wrong here.

Regards,
Arpad
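
A rough sketch of what that could look like, written as plain Python so it can be tried outside NiFi. The function name, the "debug_" attribute names and the sample filename are invented for illustration; in ExecuteScript each pair would presumably be written with session.putAttribute(flowFile, name, value) as discussed elsewhere in this thread.

```python
def debug_attributes(file_name):
    """Return the intermediate parsing values as strings for inspection."""
    date_file = file_name.split("_")[6]
    date_final = date_file.split(".")[0]
    # repr() makes stray whitespace or unexpected characters visible.
    return {
        "debug_date_file": repr(date_file),
        "debug_date_final": repr(date_final),
    }


# Hypothetical filename with the date in the seventh underscore-separated field.
attrs = debug_attributes("a_b_c_d_e_f_181231.csv")
for name in sorted(attrs):
    print(name, attrs[name])
```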

From: Tomislav Novosel <to.novo...@gmail.com>
Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
Date: Tuesday, 29 January 2019 at 20:13
To: "users@nifi.apache.org" <users@nifi.apache.org>
Subject: Re: Modify Flowfile attributes

With the following script I get week number 44 and year 118, which is a strange 
result.
The week should be 1 and the year 2019 for the date 2018-12-31.
What is wrong here?

Tom

from datetime import datetime, timedelta, date

flowFile = session.get()
if flowFile is not None:
    file_name = flowFile.getAttribute('filename')

    # The date is the 7th underscore-separated field, without the extension.
    date_file = file_name.split("_")[6]
    date_final = date_file.split(".")[0]
    date_obj = datetime.strptime(date_final, '%y%m%d')
    date_year = date_obj.year
    date_day = date_obj.day
    date_month = date_obj.month

    # isocalendar() returns (ISO year, ISO week number, ISO weekday).
    iso = date(year=date_year, month=date_month, day=date_day).isocalendar()
    year_att = iso[0]
    week_att = iso[1]
    # Attribute values have to be strings.
    str_week = str(week_att)
    str_year = str(year_att)

    flowFile = session.putAttribute(flowFile, "year_extracted", str_year)
    flowFile = session.putAttribute(flowFile, "week_extracted", str_week)
    session.transfer(flowFile, REL_SUCCESS)
    session.commit()

On Tue, 29 Jan 2019 at 16:59, Tomislav Novosel <to.novo...@gmail.com> wrote:
Thank you all for the answers. The reason I want to do this with a Python 
script is that the week number calculated from the date is wrong. NiFi has that 
function in Expression Language (extracted_date:format("w", <<time_zone>>)). My 
time zone is GMT+2.
If I set a date, for example 20180819, and pass GMT as the time zone in the 
function, I get week number 34, which is wrong. If I omit the time zone, I get 
week number 33, which is right. I'm not sure if that's a bug. You can test it 
for yourself, and if you do, please share your findings here; maybe I'm doing 
something wrong.

On the other hand, if I use Python, I'm more sure that I will get the correct 
week number, even for dates whose week number belongs to the next year (e.g. 
20181231).

Since this calculation will be in production, I need a resilient workflow that 
runs without errors in the future.

Regarding the script I sent above, I'm getting the error: "week cannot be 
coerced as string". I checked right at the beginning whether the session is 
null or not.

On Tue, 29 Jan 2019, 16:26 Jerry Vinokurov <grapesmo...@gmail.com> wrote:
I wanted to add, since I've done this specific operation many times, that you 
can really just do this via the NiFi expression language, which I think is more 
"idiomatic" than having ExecuteScript processors all over the place. Basically, 
you would have an UpdateAttribute that set something called, say, 
date_extracted with an expression that looks something like 
${filename:substringAfterLast('_'):toDate('yyyy.MM.dd')} (this is an 
approximation based on the above, modify as necessary for your purpose). Then 
you could use a second UpdateAttribute to extract various information from this 
date with the format command, e.g. ${date_extracted:format('<your format 
expression here>')}. I don't think there's one for "week" but in general this 
is the approach I take when I need to do date munging.

On Tue, Jan 29, 2019 at 10:06 AM Tomislav Novosel <to.novo...@gmail.com> wrote:
Hi Matt, thanks for the suggestions, but performance is not crucial here.
This is the code I tried, but I get an error: "AttributeError: 'NoneType' 
object has no attribute 'getAttribute' at line number 4"
If I remove the code from line 6 to line 14, it works with some default 
attribute values for year_extracted and week_extracted; otherwise I get the
error from above.

Tom

from datetime import datetime, timedelta, date

flowFile = session.get()
file_name = flowFile.getAttribute('filename')

date_file = file_name.split("_")[6]
date_final = date_file.split(".")[0]
date_obj = datetime.strptime(date_final,'%y%m%d')
date_year = date_obj.year
date_day = date_obj.day
date_month = date_obj.month

week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]

if (flowFile != None):
    flowFile = session.putAttribute(flowFile, "year_extracted", year)
    flowFile = session.putAttribute(flowFile, "week_extracted", week)
    session.transfer(flowFile, REL_SUCCESS)
    session.commit()

On Tue, 29 Jan 2019 at 15:53, Matt Burgess <mattyb...@apache.org> wrote:
Tom,

Keep in mind that you are using Jython not Python, which I mention
only to point out that it is *much* slower than the native Java
processors such as UpdateAttribute, and slower than other scripting
engines such as Groovy or Javascript/Nashorn.

If performance/throughput is not a concern and you're more comfortable
with Jython, then Jerry's suggestion of session.putAttribute(flowFile,
attributeName, attributeValue) should do the trick. Note that if you
are adding more than a couple attributes, it's probably better to
create a dictionary (eventually/actually, a Java Map<String,String>)
of attribute name/value pairs, and use putAllAttributes(flowFile,
attributes) instead, as it is more performant.

Regards,
Matt
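
As a rough illustration of the dictionary approach, here is a sketch in plain Python that can be run outside NiFi. The function name week_attributes is invented for this example, and the putAllAttributes call appears only as a comment because the session object exists only inside ExecuteScript. Every value is converted with str(), on the assumption that attribute values have to be strings.

```python
from datetime import datetime


def week_attributes(date_final):
    """Build a name -> value dictionary in which every value is a string."""
    iso = datetime.strptime(date_final, "%y%m%d").date().isocalendar()
    return {
        "year_extracted": str(iso[0]),
        "week_extracted": str(iso[1]),
    }


attributes = week_attributes("181231")
# Inside ExecuteScript the dictionary would then be applied in one call:
# flowFile = session.putAllAttributes(flowFile, attributes)
```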

On Tue, Jan 29, 2019 at 9:25 AM Tomislav Novosel <to.novo...@gmail.com> wrote:
>
> Thanks for the answer.
>
> Yes, I know I can handle that with Expression Language and the UpdateAttribute 
> processor, but this is a specific case at my work and I think Python
> is a better and simpler solution. I need to calculate that with a Python script.
>
> Tom
>
> On Tue, 29 Jan 2019 at 15:18, John McGinn <amruginn-n...@yahoo.com> wrote:
>>
>> Since your script shows that "filename" is an attribute of your flowfile, 
>> you could use the UpdateAttribute processor.
>>
>> If you right click on UpdateAttribute and choose ShowUsage, then choose 
>> Expression Language Guide, it shows you the things you can handle.
>>
>> Something along the lines of ${filename:getDelimitedField(6,'_')}, if I 
>> understand the Groovy code correctly. I did a GenerateFlowFile to an 
>> UpdateAttribute processor setting filename to "1_2_3_4_5_6.2_abc", then sent 
>> that to another UpdateAttribute with the getDelimitedField() I listed and I 
>> received 6.2. Then another UpdateAttribute could parse the 6.2 for the 
>> second substring, or you might be able to chain them in the existing 
>> UpdateAttribute processor.
>>
>>
>> --------------------------------------------
>> On Tue, 1/29/19, Tomislav Novosel <to.novo...@gmail.com> wrote:
>>
>>  Subject: Modify Flowfile attributes
>>  To: users@nifi.apache.org
>>  Date: Tuesday, January 29, 2019, 9:04 AM
>>
>>  Hi all,
>>  I'm trying to calculate the week number and date
>>  from the filename using the ExecuteScript processor and Jython. Here
>>  is the Python script. How can I add the calculated
>>  attributes week and year to the flowfile?
>>  Please help, thank you.
>>  Tom
>>  P.S. Maybe I completely missed with this script.
>>  Feel free to correct me.
>>
>>  import json
>>  import java.io
>>  from org.apache.commons.io import IOUtils
>>  from java.nio.charset import StandardCharsets
>>  from org.apache.nifi.processor.io import StreamCallback
>>  from datetime import datetime, timedelta, date
>>
>>  class PyStreamCallback(StreamCallback):
>>      def __init__(self, flowfile):
>>          self.ff = flowfile
>>          pass
>>      def process(self, inputStream, outputStream):
>>          file_name = self.ff.getAttribute("filename")
>>          date_file = file_name.split("_")[6]
>>          date_final = date_file.split(".")[0]
>>          date_obj = datetime.strptime(date_final,'%y%m%d')
>>          date_year = date_obj.year
>>          date_day = date_obj.day
>>          date_month = date_obj.month
>>          week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
>>          year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
>>
>>  flowFile = session.get()
>>  if (flowFile != None):
>>      session.transfer(flowFile, REL_SUCCESS)
>>  session.commit()


--
http://www.google.com/profiles/grapesmoker
