I wanted to add, since I've done this specific operation many times, that
you can do this with the NiFi Expression Language, which I think is more
"idiomatic" than having ExecuteScript processors all over the place.
Basically, you would have an UpdateAttribute that sets an attribute called,
say, date_extracted with an expression that looks something like
${filename:substringAfterLast('_'):toDate('yyyy.MM.dd')} (this is an
approximation based on the above; modify as necessary for your purpose).
Then you could use a second UpdateAttribute to extract various pieces of
information from this date with the format function, e.g.
${date_extracted:format('<your format expression here>')}. I don't think
there's a dedicated function for "week" (though, as far as I know, format
takes Java SimpleDateFormat patterns, where w is the week in year, so that
may be worth a try), but in general this is the approach I take when I need
to do date munging.
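
In case it helps to check results outside NiFi, here is a rough Python
sketch of what those two steps would compute for the week number. It assumes
the date is the text after the last underscore, in yyyy.MM.dd form, which is
only my guess at the filename layout. The function name year_and_week and
the sample filename are made up for illustration.

```python
from datetime import datetime


def year_and_week(filename):
    """Return (ISO year, ISO week) as strings for a name ending in _yyyy.MM.dd."""
    # Rough equivalent of substringAfterLast('_'): text after the final underscore.
    date_text = filename.rpartition('_')[2]
    # Rough equivalent of toDate('yyyy.MM.dd') (assumed filename layout).
    parsed = datetime.strptime(date_text, '%Y.%m.%d')
    iso_year, iso_week, _ = parsed.isocalendar()
    # Flowfile attributes are strings, so convert before storing them.
    return str(iso_year), str(iso_week)


print(year_and_week('sensor_export_2019.01.29'))  # prints ('2019', '5')
```

Note that the ISO year can differ from the calendar year around New Year, so
it is safer to take both values from isocalendar() as the script in this
thread does.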

On Tue, Jan 29, 2019 at 10:06 AM Tomislav Novosel <to.novo...@gmail.com>
wrote:

> Hi Matt, thanks for the suggestions, but performance is not crucial here.
> This is the code I tried, but I get an error: "AttributeError: 'NoneType'
> object has no attribute 'getAttribute' at line number 4"
> If I remove the code from line 6 to line 14, it works with some default
> attribute values for year_extracted and week_extracted; otherwise I get
> the error from above.
>
> Tom
>
> from datetime import datetime, timedelta, date
>
> flowFile = session.get()
> file_name = flowFile.getAttribute('filename')
>
> date_file = file_name.split("_")[6]
> date_final = date_file.split(".")[0]
> date_obj = datetime.strptime(date_final,'%y%m%d')
> date_year = date_obj.year
> date_day = date_obj.day
> date_month = date_obj.month
>
> week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
> year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
>
> if (flowFile != None):
>     flowFile = session.putAttribute(flowFile, "year_extracted", year)
>     flowFile = session.putAttribute(flowFile, "week_extracted", week)
>     session.transfer(flowFile, REL_SUCCESS)
>     session.commit()
>
> On Tue, 29 Jan 2019 at 15:53, Matt Burgess <mattyb...@apache.org> wrote:
>
>> Tom,
>>
>> Keep in mind that you are using Jython not Python, which I mention
>> only to point out that it is *much* slower than the native Java
>> processors such as UpdateAttribute, and slower than other scripting
>> engines such as Groovy or Javascript/Nashorn.
>>
>> If performance/throughput is not a concern and you're more comfortable
>> with Jython, then Jerry's suggestion of session.putAttribute(flowFile,
>> attributeName, attributeValue) should do the trick. Note that if you
>> are adding more than a couple attributes, it's probably better to
>> create a dictionary (eventually/actually, a Java Map<String,String>)
>> of attribute name/value pairs, and use putAllAttributes(flowFile,
>> attributes) instead, as it is more performant.
>>
>> Regards,
>> Matt
>>
>> On Tue, Jan 29, 2019 at 9:25 AM Tomislav Novosel <to.novo...@gmail.com>
>> wrote:
>> >
>> > Thanks for the answer.
>> >
>> > Yes, I know I can handle that with Expression Language and the
>> > UpdateAttribute processor, but this is a specific case at my work and I
>> > think Python is the better and simpler solution. I need to calculate
>> > that with a Python script.
>> >
>> > Tom
>> >
>> > On Tue, 29 Jan 2019 at 15:18, John McGinn <amruginn-n...@yahoo.com>
>> > wrote:
>> >>
>> >> Since your script shows that "filename" is an attribute of your
>> >> flowfile, you could use the UpdateAttribute processor.
>> >>
>> >> If you right click on UpdateAttribute and choose ShowUsage, then
>> >> choose Expression Language Guide, it shows you the things you can
>> >> handle.
>> >>
>> >> Something along the lines of ${filename:getDelimitedField(6,'_')}, if
>> >> I understand the script correctly. I did a GenerateFlowFile to an
>> >> UpdateAttribute processor setting filename to "1_2_3_4_5_6.2_abc", then
>> >> sent that to another UpdateAttribute with the getDelimitedField() I
>> >> listed and I received 6.2. Then another UpdateAttribute could parse the
>> >> 6.2 for the second substring, or you might be able to chain them in the
>> >> existing UpdateAttribute processor.
>> >>
>> >>
>> >> --------------------------------------------
>> >> On Tue, 1/29/19, Tomislav Novosel <to.novo...@gmail.com> wrote:
>> >>
>> >>  Subject: Modify Flowfile attributes
>> >>  To: users@nifi.apache.org
>> >>  Date: Tuesday, January 29, 2019, 9:04 AM
>> >>
>> >>  Hi all,
>> >>  I'm trying to calculate the week number and date
>> >>  from the filename using the ExecuteScript processor and Jython. Here
>> >>  is the Python script. How can I add the calculated
>> >>  attributes week and year to the flowfile?
>> >>  Please help, thank you.
>> >>  Tom
>> >>  P.S. Maybe I completely missed with this script.
>> >>  Feel free to correct me.
>> >>
>> >>  import json
>> >>  import java.io
>> >>  from org.apache.commons.io import IOUtils
>> >>  from java.nio.charset import StandardCharsets
>> >>  from org.apache.nifi.processor.io import StreamCallback
>> >>  from datetime import datetime, timedelta, date
>> >>
>> >>  class PyStreamCallback(StreamCallback):
>> >>      def __init__(self, flowfile):
>> >>          self.ff = flowfile
>> >>          pass
>> >>      def process(self, inputStream, outputStream):
>> >>          file_name = self.ff.getAttribute("filename")
>> >>          date_file = file_name.split("_")[6]
>> >>          date_final = date_file.split(".")[0]
>> >>          date_obj = datetime.strptime(date_final,'%y%m%d')
>> >>          date_year = date_obj.year
>> >>          date_day = date_obj.day
>> >>          date_month = date_obj.month
>> >>          week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
>> >>          year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
>> >>
>> >>  flowFile = session.get()
>> >>  if (flowFile != None):
>> >>      session.transfer(flowFile, REL_SUCCESS)
>> >>      session.commit()
>>
>
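
P.S. If you do stay with the Jython script quoted above: the "NoneType"
error at line 4 is what you get when session.get() returns None and
getAttribute is called before the None check. Below is a minimal sketch of
one way to restructure it, with the check first, the values converted to
strings, and the attribute map Matt mentioned passed to putAllAttributes.
The function names extract_attributes and handle are hypothetical; session
and REL_SUCCESS are the variables the quoted script already uses inside
ExecuteScript. I have not run this inside NiFi.

```python
from datetime import datetime


def extract_attributes(file_name):
    """Build the attribute map from a name whose seventh '_' field starts with yyMMdd."""
    # Index 6 is zero-based, i.e. the seventh underscore-separated field.
    date_text = file_name.split('_')[6].split('.')[0]
    parsed = datetime.strptime(date_text, '%y%m%d')
    iso_year, iso_week, _ = parsed.isocalendar()
    # Attribute values are strings, so convert the integers.
    return {'year_extracted': str(iso_year), 'week_extracted': str(iso_week)}


def handle(session, success_relationship):
    """Hypothetical wrapper: read the filename, add both attributes, transfer."""
    flow_file = session.get()
    # session.get() can return None, so check before touching the flowfile.
    if flow_file is None:
        return
    attributes = extract_attributes(flow_file.getAttribute('filename'))
    # One call for all attributes instead of one putAttribute per attribute.
    flow_file = session.putAllAttributes(flow_file, attributes)
    session.transfer(flow_file, success_relationship)
    session.commit()


# Inside ExecuteScript the last line of the script would be:
# handle(session, REL_SUCCESS)
```

Keeping the date logic in its own function also makes it easy to try out
in a plain Python shell with a sample filename before putting it into the
processor.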

-- 
http://www.google.com/profiles/grapesmoker
