Re: Bug/Issue with ReplaceTextWithMapping

2019-07-31 Thread Ameer Mawia
That makes sense. Thanks, Koji, for the prompt replies. Appreciate it.

Thank you.
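
For reference, the guard Koji points to below follows the standard non-blocking tryLock pattern - a simplified sketch with approximate names, not the exact processor source:

import java.util.concurrent.locks.ReentrantLock;

private final ReentrantLock lock = new ReentrantLock();

private void updateMapping(final ProcessContext context) {
    // tryLock() returns false immediately when another thread already holds
    // the lock - it never waits. Only one thread performs the reload; every
    // other concurrent task skips this block and keeps using the mapping it
    // already has in memory.
    if (lock.tryLock()) {
        try {
            // re-read the mapping file here if the refresh interval elapsed
        } finally {
            lock.unlock();
        }
    }
}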

On Tue, Jul 30, 2019 at 6:20 AM Koji Kawamura 
wrote:

> The tryLock method does not block if the lock is already acquired by another
> thread.
>
> https://github.com/apache/nifi/blob/f8e93186f53917b1fddbc2ae3de26b65a99b9246/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceTextWithMapping.java#L239
>
> On Mon, Jul 29, 2019, 23:24 Ameer Mawia  wrote:
>
>> Adding a reference link
>> <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceTextWithMapping.java>(to
>> the code).
>>
>> On Mon, Jul 29, 2019 at 10:21 AM Ameer Mawia 
>> wrote:
>>
>>> Thanks for reply.
>>>
>>> Hmm, that should explain the behavior we noted.
>>>
>>> But I see (here
>>> <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceTextWithMapping.java>)
>>> an instance-level lock protecting the updateMapping method. *Shouldn't
>>> that eventually block other threads from accessing the old mapping?*
>>>
>>> Or maybe this locking was added later - in version 1.9 or something? We
>>> are using 1.8.
>>>
>>> Thanks,
>>> Ameer Mawia
>>>
>>> On Thu, Jul 25, 2019 at 3:51 AM Koji Kawamura 
>>> wrote:
>>>
>>>> Hi Ameer,
>>>>
>>>> Is the ReplaceTextWithMapping's 'Concurrent Tasks' set to greater than 1?
>>>> Since ReplaceTextWithMapping only reloads the mapping in a single thread, other
>>>> threads may use the old mapping until the loading thread completes
>>>> refreshing the mapping definition.
>>>>
>>>> Thanks,
>>>> Koji
>>>>
>>>> On Wed, Jul 24, 2019 at 4:28 AM Ameer Mawia 
>>>> wrote:
>>>> >
>>>> > Inline.
>>>> >
>>>> > On Mon, Jul 22, 2019 at 2:17 AM Koji Kawamura 
>>>> wrote:
>>>> >>
>>>> >> Hi Ameer,
>>>> >>
>>>> >> How is ReplaceTextWithMapping 'Mapping File Refresh Interval'
>>>> configured?
>>>> >
>>>> > [Ameer] It is configured to 1 sec - the lowest value allowed.
>>>> >>
>>>> >> By default, it's set to '60s'. So,
>>>> >> 1. If ReplaceTextWithMapping ran with the old mapping file
>>>> >
>>>> > [Ameer] First Processing took place on Day-1. A new Mapping was
>>>> dropped on Day-1, after Day-1 Processing was over.
>>>> >>
>>>> >> 2. and the mapping file was updated for the next processing
>>>> >
>>>> > [Ameer] Second Processing took place on Day-2.
>>>> > [Ameer] Here the assumption was the CACHE would be refreshed from the new
>>>> mapping file dropped a day earlier. But it didn't happen. The cache got
>>>> refreshed in the middle of the flow - not at the very beginning. Thus a few
>>>> flowfiles got the old value and later flowfiles got the new value.
>>>> >>
>>>> >> 3. then the flow started processing another CSV file right away line
>>>> by line
>>>> >>
>>>> >> In the above scenario, some lines in the CSV might get processed with the
>>>> >> old mapping file. After 60s has passed from step 1, some other lines may get
>>>> >> processed with the new mappings. Is that what you're seeing?
>>>> >>
>>>> > [Ameer] This is what is happening. But it shouldn't have - because the
>>>> new mapping file already existed before the next processing began. It
>>>> should have refreshed right at the start - as also suggested by the code of
>>>> the ReplaceTextWithMapping processor.
>>>> >>
>>>> >> BTW, please avoid posting the same question to users and dev at the
>>>> >> same time. I've removed the dev address.
>>>> >> [Ameer] Got it.
>>>> >> Thanks,
>>>> >> Koji
>>>> >>
>>>> >> On Sat, Jul 20, 2019 at 3:08 AM Ameer Mawia 
>>>> wrote:
>>>> >> >
>>>> >> > Correcting Typo.
>>>> >> >
>>>> >> > On Fri, Jul 19, 2019 at 2:03 PM Ameer Mawia 
>>>> wrote:

Re: Bug/Issue with ReplaceTextWithMapping

2019-07-29 Thread Ameer Mawia
Thanks for reply.

Hmm, that should explain the behavior we noted.

But I see an instance-level lock protecting the updateMapping
method. *Shouldn't that eventually block other threads from accessing the
old mapping?*

Or maybe this locking was added later - in version 1.9 or something? We
are using 1.8.

Thanks,
Ameer Mawia

On Thu, Jul 25, 2019 at 3:51 AM Koji Kawamura 
wrote:

> Hi Ameer,
>
> Is the ReplaceTextWithMapping's 'Concurrent Tasks' set to greater than 1?
> Since ReplaceTextWithMapping only reloads the mapping in a single thread, other
> threads may use the old mapping until the loading thread completes
> refreshing the mapping definition.
>
> Thanks,
> Koji
>
> On Wed, Jul 24, 2019 at 4:28 AM Ameer Mawia  wrote:
> >
> > Inline.
> >
> > On Mon, Jul 22, 2019 at 2:17 AM Koji Kawamura 
> wrote:
> >>
> >> Hi Ameer,
> >>
> >> How is ReplaceTextWithMapping 'Mapping File Refresh Interval'
> configured?
> >
> > [Ameer] It is configured to 1 sec - the lowest value allowed.
> >>
> >> By default, it's set to '60s'. So,
> >> 1. If ReplaceTextWithMapping ran with the old mapping file
> >
> > [Ameer] First Processing took place on Day-1. A new Mapping was dropped
> on Day-1, after Day-1 Processing was over.
> >>
> >> 2. and the mapping file was updated for the next processing
> >
> > [Ameer] Second Processing took place on Day-2.
> > [Ameer] Here the assumption was the CACHE would be refreshed from the new mapping
> file dropped a day earlier. But it didn't happen. The cache got refreshed in
> the middle of the flow - not at the very beginning. Thus a few flowfiles got
> the old value and later flowfiles got the new value.
> >>
> >> 3. then the flow started processing another CSV file right away line by
> line
> >>
> >> In the above scenario, some lines in the CSV might get processed with the
> >> old mapping file. After 60s has passed from step 1, some other lines may get
> >> processed with the new mappings. Is that what you're seeing?
> >>
> > [Ameer] This is what is happening. But it shouldn't have - because the new
> mapping file already existed before the next processing began. It
> should have refreshed right at the start - as also suggested by the code of
> the ReplaceTextWithMapping processor.
> >>
> >> BTW, please avoid posting the same question to users and dev at the
> >> same time. I've removed the dev address.
> >> [Ameer] Got it.
> >> Thanks,
> >> Koji
> >>
> >> On Sat, Jul 20, 2019 at 3:08 AM Ameer Mawia 
> wrote:
> >> >
> >> > Correcting Typo.
> >> >
> >> > On Fri, Jul 19, 2019 at 2:03 PM Ameer Mawia 
> wrote:
> >> >>
> >> >> Guys,
> >> >>
> >> >> It seems that the NIFI ReplaceTextWithMapping processor has a BUG
> with refreshing its mapping file. We are using its functionality in PROD and
> getting odd behaviour.
> >> >>
> >> >> Our USAGE Scenario:
> >> >>
> >> >> We use NIFI primarily as a TRANSFORMATION Tool.
> >> >> Our flow involves:
> >> >>
> >> >> Getting a raw CSV file.
> >> >> Split the file on a per-line basis:
> >> >>
> >> >> So from one source flowfile - we may have 1 flowfile
> generated/split out.
> >> >>
> >> >> For each of the split flowfiles (flowfiles for individual lines)
> we perform transformations on the attributes.
> >> >> We merge these flowfiles back and write the output file.
> >> >>
> >> >>
> >> >> As part of the transformation in Step#3, we do some mapping for one
> of the fields in the CSV. For this we use the ReplaceTextWithMapping processor.
> Also note: we update our mapping file just before starting our flow (i.e.
> Step #1).
> >> >>
> >> >> Our Issue:
> >> >>
> >> >> We have noted that for the SAME key we get two DIFFERENT values in two
> different flowfiles.
> >> >> We noted that one of the mapped values existed in an older mapping
> file.
> >> >> So in essence: the ReplaceTextWithMapping processor didn't refresh its
> cache until a certain time, and thus returned the old value for a few
> flowfiles and then - once it had refreshed its cache in the meanwhile - returned
> the new updated value.
> >> >> And this caused the issue.
> >> >>
> >> >> Question:
> >> >>
> >> >> Is this a known issue with the ReplaceTextWithMapping processor?
> >> >> If not, how can I create an issue for this?
> >> >> How can I confirm this behaviour?
> >> >>
> >> >> Thanks,
> >> >> Ameer Mawia
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> http://ca.linkedin.com/in/ameermawia
> >> >> Toronto, ON
> >> >>
> >> >
> >> >
> >> > --
> >> > http://ca.linkedin.com/in/ameermawia
> >> > Toronto, ON
> >> >
> >
> >
> >
> > --
> > http://ca.linkedin.com/in/ameermawia
> > Toronto, ON
> >
>


-- 
http://ca.linkedin.com/in/ameermawia
Toronto, ON


Re: Bug/Issue with ReplaceTextWithMapping

2019-07-23 Thread Ameer Mawia
Inline.

On Mon, Jul 22, 2019 at 2:17 AM Koji Kawamura 
wrote:

> Hi Ameer,
>
> How is ReplaceTextWithMapping 'Mapping File Refresh Interval' configured?
>
[Ameer] It is configured to 1 sec - the lowest value allowed.

> By default, it's set to '60s'. So,
> 1. If ReplaceTextWithMapping ran with the old mapping file
>
[Ameer] First Processing took place on Day-1. A new Mapping was dropped on
Day-1, after Day-1 Processing was over.

> 2. and the mapping file was updated for the next processing
>
[Ameer] Second Processing took place on Day-2.
[Ameer] Here the assumption was the CACHE would be refreshed from the new mapping
file dropped a day earlier. But it didn't happen. The cache got refreshed in
the middle of the flow - not at the very beginning. Thus a few flowfiles got
the old value and later flowfiles got the new value.

> 3. then the flow started processing another CSV file right away line by
> line
>
> In the above scenario, some lines in the CSV might get processed with the
> old mapping file. After 60s has passed from step 1, some other lines may get
> processed with the new mappings. Is that what you're seeing?
>
[Ameer] This is what is happening. But it shouldn't have - because the new
mapping file already existed before the next processing began. It
should have refreshed right at the start - as also suggested by the code of
the ReplaceTextWithMapping processor.
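
A sketch of why the refresh can land mid-flow rather than at the start: the interval check runs lazily on each onTrigger call (i.e., per flowfile), not once when a batch begins. Names below are illustrative, not the processor's exact fields:

public void onTrigger(final ProcessContext context, final ProcessSession session) {
    final long now = System.currentTimeMillis();
    // Evaluated on every trigger; nothing forces a reload at the start of a
    // flow. Whichever task first sees the interval expire (and wins tryLock)
    // performs the reload - possibly in the middle of a batch.
    if (now - lastRefresh >= refreshIntervalMillis && lock.tryLock()) {
        try {
            reloadMappingFile();
            lastRefresh = now;
        } finally {
            lock.unlock();
        }
    }
    applyMapping(session); // other tasks use whatever mapping is cached
}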

> BTW, please avoid posting the same question to users and dev at the
> same time. I've removed the dev address.
[Ameer] Got it.
> Thanks,
> Koji
>
> On Sat, Jul 20, 2019 at 3:08 AM Ameer Mawia  wrote:
> >
> > Correcting Typo.
> >
> > On Fri, Jul 19, 2019 at 2:03 PM Ameer Mawia 
> wrote:
> >>
> >> Guys,
> >>
> >> It seems that the NIFI ReplaceTextWithMapping processor has a BUG with
> refreshing its mapping file. We are using its functionality in PROD and
> getting odd behaviour.
> >>
> >> Our USAGE Scenario:
> >>
> >> We use NIFI primarily as a TRANSFORMATION Tool.
> >> Our flow involves:
> >>
> >> Getting a raw CSV file.
> >> Split the file on a per-line basis:
> >>
> >> So from one source flowfile - we may have 1 flowfile
> generated/split out.
> >>
> >> For each of the split flowfiles (flowfiles for individual lines) we
> perform transformations on the attributes.
> >> We merge these flowfiles back and write the output file.
> >>
> >>
> >> As part of the transformation in Step#3, we do some mapping for one of
> the fields in the CSV. For this we use the ReplaceTextWithMapping processor.
> Also note: we update our mapping file just before starting our flow (i.e.
> Step #1).
> >>
> >> Our Issue:
> >>
> >> We have noted that for the SAME key we get two DIFFERENT values in two different
> flowfiles.
> >> We noted that one of the mapped values existed in an older mapping file.
> >> So in essence: the ReplaceTextWithMapping processor didn't refresh its cache
> until a certain time, and thus returned the old value for a few flowfiles and
> then - once it had refreshed its cache in the meanwhile - returned the new
> updated value.
> >> And this caused the issue.
> >>
> >> Question:
> >>
> >> Is this a known issue with the ReplaceTextWithMapping processor?
> >> If not, how can I create an issue for this?
> >> How can I confirm this behaviour?
> >>
> >> Thanks,
> >> Ameer Mawia
> >>
> >>
> >>
> >>
> >> --
> >> http://ca.linkedin.com/in/ameermawia
> >> Toronto, ON
> >>
> >
> >
> > --
> > http://ca.linkedin.com/in/ameermawia
> > Toronto, ON
> >
>


-- 
http://ca.linkedin.com/in/ameermawia
Toronto, ON


Re: Bug/Issue with ReplaceTextWithMapping

2019-07-19 Thread Ameer Mawia
Correcting Typo.

On Fri, Jul 19, 2019 at 2:03 PM Ameer Mawia  wrote:

> Guys,
>
> It seems that the NIFI ReplaceTextWithMapping processor has a BUG with
> refreshing its mapping file. We are using its functionality in PROD and
> getting odd behaviour.
>
> Our USAGE Scenario:
>
>- We use NIFI primarily as a TRANSFORMATION Tool.
>- Our flow involves:
>
>
>    1. Getting a raw CSV file.
>   2. Split the file on a per-line basis:
>  1. So from one source flowfile - we may have 1 flowfile
>  generated/split out.
>   3. For each of the split flowfiles (flowfiles for individual
>   lines) we perform transformations on the attributes.
>   4. We merge these flowfiles back and write the output file.
>
>
> As part of the transformation in Step#3, we do some mapping for one of the
> fields in the CSV. For this we use the ReplaceTextWithMapping processor. Also
> note: we update our mapping file just before starting our flow (i.e. Step
> #1).
>
> Our Issue:
>
>
>    - We have noted that for the SAME key we get two DIFFERENT values in two
>    different flowfiles.
>    - We noted that one of the mapped values existed in an older mapping
>    file.
>    - So in essence: the ReplaceTextWithMapping processor didn't refresh its
>    cache until a certain time, and thus returned the old value for a few
>    flowfiles and then - once it had refreshed its cache in the meanwhile - returned
>    the new updated value.
>    - And this caused the issue.
>
> Question:
>
>    - Is this a known issue with the ReplaceTextWithMapping processor?
>    - If not, how can I create an issue for this?
>    - How can I confirm this behaviour?
>
> Thanks,
> Ameer Mawia
>
>
>
>
> --
> http://ca.linkedin.com/in/ameermawia
> Toronto, ON
>
>

-- 
http://ca.linkedin.com/in/ameermawia
Toronto, ON


Bug/Issue with ReplaceTextWithMapping

2019-07-19 Thread Ameer Mawia
Guys,

It seems that the NIFI ReplaceTextWithMapping processor has a BUG with
refreshing its mapping file. We are using its functionality in PROD and
getting odd behaviour.

Our USAGE Scenario:

   - We use NIFI primarily as a TRANSFORMATION Tool.
   - Our flow involves:


   1. Getting a raw CSV file.
  2. Split the file on a per-line basis:
 1. So from one source flowfile - we may have 1 flowfile
  generated/split out.
   3. For each of the split flowfiles (flowfiles for individual lines)
   we perform transformations on the attributes.
   4. We merge these flowfiles back and write the output file.


As part of the transformation in Step#3, we do some mapping for one of the
fields in the CSV. For this we use the ReplaceTextWithMapping processor. Also
note: we update our mapping file just before starting our flow (i.e. Step #1).

Our Issue:


   - We have noted that for the SAME key we get two DIFFERENT values in two
   different flowfiles.
   - We noted that one of the mapped values existed in an older mapping file.
   - So in essence: the ReplaceTextWithMapping processor didn't refresh its
   cache until a certain time, and thus returned the old value for a few flowfiles and
   then - once it had refreshed its cache in the meanwhile - returned the new
   mapped value.
   - So this caused the issue.

Question:

   - Is this a known issue with the ReplaceTextWithMapping processor?
   - If not, how can I create an issue for this?
   - How can I confirm this behaviour?

Thanks,
Ameer Mawia




-- 
http://ca.linkedin.com/in/ameermawia
Toronto, ON


Re: NIFI Usage for Data Transformation

2018-11-01 Thread Ameer Mawia
Inline.

On Thu, Nov 1, 2018 at 1:40 PM Bryan Bende  wrote:

> How big are the initial CSV files?
>
> If they are large, like millions of lines, or even hundreds of
> thousands, then it will be ideal if you can avoid the line-by-line
> split, and instead process the lines in place.
>
Not millions. But definitely ranging from 10s to 100s of thousands.


> This is one of the benefits of the record processors. For example,
> with UpdateRecord you can read in a large CSV line by line, apply an
> update to each line, and write it back out. So you only ever have one
> flow file.
>
Agreed.
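
For a concrete shape of that, a minimal UpdateRecord configuration might look like this - illustrative values only, assuming CSVReader and CSVRecordSetWriter controller services are already defined; /status and the uppercase rule are made-up examples:

Record Reader                = CSVReader
Record Writer                = CSVRecordSetWriter
Replacement Value Strategy   = Literal Value
/status (dynamic property)   = ${field.value:toUpper()}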


> It sounds like you may have a significant amount of custom logic so
> you may need a custom processor,

Yes. Each record has its own logic. On top of that, sometimes multiple data
sources are consulted to determine the final value of the output field.

> but you can still take this approach
> of reading a single flow file line by line, and writing out the results
> line by line (try to avoid reading the entire content into memory at
> one time).
>
That's what I am trying.
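
The shape I'm aiming for, roughly - a sketch that assumes the per-line logic is wrapped in a transformLine(String) helper:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import org.apache.nifi.processor.io.StreamCallback;

// Inside onTrigger: rewrite the flowfile line by line so the whole content
// is never held in memory. transformLine(...) stands in for the custom
// per-record logic.
flowFile = session.write(flowFile, new StreamCallback() {
    @Override
    public void process(final InputStream in, final OutputStream out) throws IOException {
        final BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        final BufferedWriter writer =
                new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            writer.write(transformLine(line)); // custom per-line transform
            writer.newLine();
        }
        writer.flush();
    }
});
session.transfer(flowFile, REL_SUCCESS);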


> On Thu, Nov 1, 2018 at 1:22 PM Ameer Mawia  wrote:
> >
> > Thanks for the input, folks.
> >
> > I had this impression that for actual processing of the data:
> >
> > we may have to put in place a custom processor which will have the
> transformation framework logic in it.
> > Or we can use the ExecuteProcess processor to trigger an external
> process (which will be this transformation logic) and route the output back
> into NIFI.
> >
> > Our flow inside the framework generally looks like this:
> >
> > Split the CSV file line by line.
> > For each line, split it into an array of strings.
> > For each record in the array, determine and invoke its transformation
> method.
> > The Transformation Method contains the transformation logic. This logic can
> be pretty intensive, like:
> >
> > searching for hundreds of different patterns.
> > lookups against hundreds of configured string constants.
> > Appending/Prepending/Trimming/Padding...
> >
> > Finally, map each record into an output CSV format.
> >
> > So far we have been trying to see if SplitRecord, UpdateRecord,
> ExtractText, etc. can come in handy.
> >
> > Thanks,
> >
> > On Thu, Nov 1, 2018 at 12:39 PM Mike Thomsen 
> wrote:
> >>
> >> Ameer,
> >>
> >> Depending on how you implemented the custom framework, you may be able
> to easily drop it in place into a custom NiFi processor. Without knowing
> much about your implementation details, if you can act on Java streams,
> Strings, byte arrays and things like that it will probably be very
> straightforward to drop in place.
> >>
> >> This is a really simple example of how you could bring it in depending on how
> encapsulated your business logic is:
> >>
> >> @Override
> >> public void onTrigger(ProcessContext context, ProcessSession session)
> >>         throws ProcessException {
> >>     FlowFile input = session.get();
> >>     if (input == null) {
> >>         return;
> >>     }
> >>
> >>     FlowFile output = session.create(input);
> >>     try (InputStream is = session.read(input);
> >>          OutputStream os = session.write(output)
> >>     ) {
> >>         transformerPojo.transform(is, os);
> >>
> >>         is.close();
> >>         os.close();
> >>
> >>         session.transfer(input, REL_ORIGINAL); // If you created an "original" relationship
> >>         session.transfer(output, REL_SUCCESS);
> >>     } catch (Exception ex) {
> >>         session.remove(output);
> >>         session.transfer(input, REL_FAILURE);
> >>     }
> >> }
> >>
> >> That's the general idea, and that approach can scale to your disk space
> limits. Hope that helps put it into perspective.
> >>
> >> Mike
> >>
> >> On Thu, Nov 1, 2018 at 10:16 AM Nathan Gough 
> wrote:
> >>>
> >>> Hi Ameer,
> >>>
> >>> This blog by Mark Payne describes how to manipulate record based data
> like CSV using schemas:
> https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi. This
> would probably be the most efficient method. And another here:
> https://bryanbende.com/development/2017/06/20/apache-nifi-records-and-schema-registries
> .
> >>>
> >>> An alternative option would be to port your custom java code into your
> own NiFi processor:
> >>>
> https://medium.com/hashmapinc/creating-custom-processors-and-controllers-in-apac

Re: NIFI Usage for Data Transformation

2018-11-01 Thread Ameer Mawia
Thanks for the input, folks.

I had this impression that for actual processing of the data:

   - we may have to put in place a custom processor which will have the
   transformation framework logic in it.
   - Or we can use the ExecuteProcess processor to trigger an external
   process (which will be this transformation logic) and route the output back
   into NIFI.

Our flow inside the framework generally looks like this:


   - Split the CSV file line by line.
   - For each line, split it into an array of strings.
   - For each record in the array, determine and invoke its transformation
   method.
   - The Transformation Method contains the transformation logic (a rough
   sketch of this kind of per-field logic follows this list). This logic
   can be pretty intensive, like:
  - searching for hundreds of different patterns.
  - lookups against hundreds of configured string constants.
  - Appending/Prepending/Trimming/Padding...
   - Finally, map each record into an output CSV format.
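
A rough, hypothetical sketch of one such per-field transformation - the pattern, the constants map, and the 20-character padding are invented examples, not our real rules:

import java.util.Collections;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative per-field logic: constant lookup, pattern matching, then
// padding to a fixed output width.
private static final Pattern ACCT_PATTERN = Pattern.compile("\\d{10}");
private static final Map<String, String> CONSTANTS =
        Collections.singletonMap("N/A", "UNKNOWN");

String transformField(final String raw) {
    String value = CONSTANTS.getOrDefault(raw.trim(), raw.trim());
    if (ACCT_PATTERN.matcher(value).matches()) {
        value = "ACCT-" + value;          // prepend a marker
    }
    return String.format("%-20s", value); // pad to 20 chars
}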

So far we have been trying to see if SplitRecord, UpdateRecord,
ExtractText, etc. can come in handy.

Thanks,

On Thu, Nov 1, 2018 at 12:39 PM Mike Thomsen  wrote:

> Ameer,
>
> Depending on how you implemented the custom framework, you may be able to
> easily drop it in place into a custom NiFi processor. Without knowing much
> about your implementation details, if you can act on Java streams, Strings,
> byte arrays and things like that it will probably be very straightforward
> to drop in place.
>
> This is a really simple example of how you could bring it in depending on how
> encapsulated your business logic is:
>
> @Override
> public void onTrigger(ProcessContext context, ProcessSession session)
>         throws ProcessException {
>     FlowFile input = session.get();
>     if (input == null) {
>         return;
>     }
>
>     FlowFile output = session.create(input);
>     try (InputStream is = session.read(input);
>          OutputStream os = session.write(output)
>     ) {
>         transformerPojo.transform(is, os);
>
>         is.close();
>         os.close();
>
>         session.transfer(input, REL_ORIGINAL); // If you created an "original" relationship
>         session.transfer(output, REL_SUCCESS);
>     } catch (Exception ex) {
>         session.remove(output);
>         session.transfer(input, REL_FAILURE);
>     }
> }
>
> That's the general idea, and that approach can scale to your disk space
> limits. Hope that helps put it into perspective.
>
> Mike
>
> On Thu, Nov 1, 2018 at 10:16 AM Nathan Gough  wrote:
>
>> Hi Ameer,
>>
>> This blog by Mark Payne describes how to manipulate record based data
>> like CSV using schemas:
>> https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi. This
>> would probably be the most efficient method. And another here:
>> https://bryanbende.com/development/2017/06/20/apache-nifi-records-and-schema-registries
>> .
>>
>> An alternative option would be to port your custom java code into your
>> own NiFi processor:
>>
>> https://medium.com/hashmapinc/creating-custom-processors-and-controllers-in-apache-nifi-e14148740ea
>> under 'Steps for Creating a Custom Apache NiFi Processor'
>> https://nifi.apache.org/developer-guide.html
>>
>> Nathan
>>
>> On 10/31/18, 5:02 PM, "Ameer Mawia"  wrote:
>>
>> We have a use case where we take data from a source (text data in CSV
>> format), do transformation and manipulation of textual records, and output
>> the data in another (CSV) format. This is being done by a Java-based custom
>> framework, written specifically for this *transformation* piece.
>>
>> Recently, as Apache NIFI is being adopted at the enterprise level by the
>> organisation, we have been asked to try *Apache NIFI* and see if we can use
>> it as a replacement for this custom tool.
>>
>> *My question is*:
>>
>>    - How much leverage does *Apache NIFI* provide for flowfile *content*
>>    manipulation?
>>
>> I understand *NIFI* is good for creating data flow pipelines, but is it good
>> for *extensive TEXT Transformation* as well? So far I have not found an
>> obvious way to achieve that.
>>
>> Appreciate the feedback.
>>
>> Thanks,
>>
>> --
>> http://ca.linkedin.com/in/ameermawia
>> Toronto, ON
>>
>>
>>
>>

-- 
http://ca.linkedin.com/in/ameermawia
Toronto, ON


Fwd: NIFI Usage for Data Transformation

2018-10-31 Thread Ameer Mawia
We have a use case where we take data from a source (text data in CSV
format), do transformation and manipulation of textual records, and output
the data in another (CSV) format. This is being done by a Java-based custom
framework, written specifically for this *transformation* piece.

Recently, as Apache NIFI is being adopted at the enterprise level by the
organisation, we have been asked to try *Apache NIFI* and see if we can use
it as a replacement for this custom tool.

*My question is*:

   - How much leverage does *Apache NIFI* provide for flowfile *content*
   manipulation?

I understand *NIFI* is good for creating data flow pipelines, but is it good
for *extensive TEXT Transformation* as well? So far I have not found an
obvious way to achieve that.

Appreciate the feedback.

Thanks,

-- 
http://ca.linkedin.com/in/ameermawia
Toronto, ON


