Re: [EXT] ConvertCSVToAvro vs CSVReader - Value Delimiter

2017-09-25 Thread Arun Manivannan
Hi All,

Just raised a PR (https://github.com/apache/nifi/pull/2172) for JIRA
NIFI-4416 <https://issues.apache.org/jira/browse/NIFI-4416>

Appreciate your help, Peter and Matt.  Could you please have a quick look
and give your comments?

Joe - Could you also check out the JIRA and let me know if I've committed
some crime.

You guys are the best!

Best Regards,
Arun

On Mon, Sep 25, 2017 at 9:44 AM Arun Manivannan <a...@arunma.com> wrote:

> Thanks a lot, gentlemen. JIRA and PR coming through in a few hours.
>
> On Mon, Sep 25, 2017, 09:07 Matt Burgess <mattyb...@gmail.com> wrote:
>
>> Thanks all, if the PR is available tomorrow I can review as well and
>> merge, but I will be on vacation for a week after that. No pressure :)
>>
>> Regards,
>> Matt
>>
>> > On Sep 24, 2017, at 8:57 PM, Joe Witt <joe.w...@gmail.com> wrote:
>> >
>> > Thanks Arun and Peter.  Getting that resolved will be nice.  The
>> > performance difference of the record reader/writer approach in all
>> > this is pretty fantastic so the more we can do to iron out these sorts
>> > of edges the better.  Thanks!
>> >
>> >> On Sun, Sep 24, 2017 at 8:56 PM, Peter Wicks (pwicks)
>> >> <pwi...@micron.com> wrote:
>> >> Arun,
>> >>
>> >> I'm also using Ctrl+A as a delimiter and had the same problem.  I
>> >> haven't had time to write up a PR but it looked like a pretty easy
>> >> fix to me too.
>> >>
>> >> I can't merge the change if you submit it, but I'd be happy to
>> >> review it.
>> >>
>> >> --Peter
>> >>
>> >> -Original Message-
>> >> From: Arun Manivannan [mailto:a...@arunma.com]
>> >> Sent: Sunday, September 24, 2017 11:17 PM
>> >> To: Dev@nifi.apache.org
>> >> Subject: [EXT] ConvertCSVToAvro vs CSVReader - Value Delimiter
>> >>
>> >> Hi,
>> >>
>> >> The ConvertCSVToAvro processor has been having performance issues
>> >> while processing files larger than a GB, and I was advised to use
>> >> ConvertRecord, which leverages the RecordReader and Writer. Did
>> >> some tests and they do perform well.
>> >>
>> >> Strangely, the CSVReader doesn't accept a Unicode character as the
>> >> value delimiter - the Control-A (\u0001) character is the delimiter
>> >> of my CSV.
>> >>
>> >> Did some analysis and I see that a minor change needs to be made
>> >> to CSVUtils to unescape the delimiter, as ConvertCSVToAvro does,
>> >> and also to modify the SingleCharacterValidator.
>> >>
>> >> Please let me know if you believe this isn't an issue and there's a
>> >> workaround for this. Otherwise, I am more than happy to raise an
>> >> issue and submit a PR for review.
>> >>
>> >> Best Regards,
>> >> Arun
>>
>


Re: [EXT] ConvertCSVToAvro vs CSVReader - Value Delimiter

2017-09-24 Thread Arun Manivannan
Thanks a lot, gentlemen. JIRA and PR coming through in a few hours.



ConvertCSVToAvro vs CSVReader - Value Delimiter

2017-09-24 Thread Arun Manivannan
Hi,

The ConvertCSVToAvro processor has been having performance issues while
processing files larger than a GB, and I was advised to use ConvertRecord,
which leverages the RecordReader and Writer. Did some tests and they do
perform well.

Strangely, the CSVReader doesn't accept a Unicode character as the value
delimiter - the Control-A (\u0001) character is the delimiter of my CSV.

Did some analysis and I see that a minor change needs to be made to
CSVUtils to unescape the delimiter, as ConvertCSVToAvro does, and also to
modify the SingleCharacterValidator.
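
For illustration only, the kind of change described above might look roughly
like the sketch below. This is not the code from the PR: the class and method
names (DelimiterSketch, unescape, isSingleCharacter, toDelimiter) are made up
for this example, and the escape handling is hand-rolled with the JDK alone,
whereas a real patch would more likely reuse an existing library routine. The
idea is to unescape the configured value first and only then check that
exactly one character remains.

```java
// Hypothetical sketch, not NiFi code: unescape a configured delimiter and
// validate that the result is a single character.
class DelimiterSketch {

    // Converts a small set of backslash escapes (t, n, r, backslash, and the
    // four-hex-digit unicode form) into the characters they stand for.
    // A malformed unicode escape makes Integer.parseInt throw NumberFormatException.
    static String unescape(String raw) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c != '\\' || i + 1 >= raw.length()) {
                out.append(c);          // ordinary character or trailing backslash
                i++;
                continue;
            }
            char next = raw.charAt(i + 1);
            if (next == 'u' && i + 6 <= raw.length()) {
                // unicode form: backslash, 'u', then four hex digits
                out.append((char) Integer.parseInt(raw.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                switch (next) {
                    case 't': out.append('\t'); break;
                    case 'n': out.append('\n'); break;
                    case 'r': out.append('\r'); break;
                    case '\\': out.append('\\'); break;
                    default: out.append(next);  // unknown escape: keep the character
                }
                i += 2;
            }
        }
        return out.toString();
    }

    // The validation step: the value is acceptable if it unescapes to one char.
    static boolean isSingleCharacter(String raw) {
        return unescape(raw).length() == 1;
    }

    // The conversion step: hand the parser the unescaped character.
    static char toDelimiter(String raw) {
        String unescaped = unescape(raw);
        if (unescaped.length() != 1) {
            throw new IllegalArgumentException("Delimiter must be a single character");
        }
        return unescaped.charAt(0);
    }

    public static void main(String[] args) {
        // The six characters typed into the property field: backslash, u, 0, 0, 0, 1
        String configured = "\\u0001";
        System.out.println(isSingleCharacter(configured));
        System.out.println((int) toDelimiter(configured));
    }
}
```

With a check like this, a value typed as the six-character escape for
Control-A passes validation, while a value such as ";;" is still rejected.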

Please let me know if you believe this isn't an issue and there's a
workaround for this. Otherwise, I am more than happy to raise an issue and
submit a PR for review.

Best Regards,
Arun


Re: NIFI-4198 - *ElasticsearchHttp processors do not expose Proxy settings

2017-08-13 Thread Arun Manivannan
Hi,

I notice that the AbstractElasticsearchHttpProcessor already has support
for the proxy host and port.  I suppose all I need to do now is to

1. Add the properties to the PropertyDescriptors list
2. Add support for proxy authentication
3. System test.

Can I get away with just this?

Cheers,
Arun



On Sun, Aug 13, 2017 at 9:06 AM Arun Manivannan <a...@arunma.com> wrote:

> Absolutely, I will.
>
> Is uploading the steps as a Word document to the JIRA an appropriate way?
> I understand that these steps are specific to a set of JIRAs and the nifi
> dev guide may or may not be a good home for this. (I can look up how to do
> that if this is the case).
>
> Thanks again, Joe.
> Cheers
> Arun
>
> On Sun, Aug 13, 2017, 03:54 Joe Witt <joe.w...@gmail.com> wrote:
>
>> Thanks Arun.  If you can document the steps you went through to set up
>> a representative testing environment, that would *greatly* help
>> whoever reviews to do the same/similar.  The review pipeline for
>> contributions like this can be tough, and it is almost always due to
>> helping the reviewer set up the environment.
>>
>> Thanks
>>
>> On Sat, Aug 12, 2017 at 12:48 PM, Arun Manivannan <a...@arunma.com>
>> wrote:
>> > Great.  Thanks a lot, Joe.  Appreciate your help.
>> >
>> > Will get it up and running before I assign the issue to myself.
>> >
>> > Regards,
>> > Arun
>> >
>> > On Sun, Aug 13, 2017 at 3:24 AM Joe Witt <joe.w...@gmail.com> wrote:
>> >
>> >> Arun
>> >>
>> >> Very cool that you are planning to jump in on this.  Your approach
>> >> sounds like a good start.  As far as system testing goes, you're
>> >> hitting on one of the more challenging parts of the equation here.
>> >> Your unit tests of course won't integrate with a real ES instance,
>> >> but manual testing can be done against a system as you mention.  I
>> >> think a lot of folks use things like Docker to do such testing, or
>> >> sometimes these systems offer quick start configurations.  You
>> >> could put a squid proxy instance/container in front as well.
>> >> Perhaps this one can help:
>> >> https://github.com/sameersbn/docker-squid
>> >>
>> >> Thanks
>> >> Joe
>> >>
>> >> On Sat, Aug 12, 2017 at 11:49 AM, Arun Manivannan <a...@arunma.com>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > Very Good morning.
>> >> >
>> >> > I would like to make an attempt at resolving NIFI-4198
>> >> > <https://issues.apache.org/jira/browse/NIFI-4198>.  Looking at the
>> >> > code, I would think that introducing the proxy URL and
>> >> > authentication properties and delegating them to the OkHttpClient
>> >> > would be a good way to do it.
>> >> >
>> >> > https://stackoverflow.com/questions/35554380/okhttpclient-proxy-authentication-how-to
>> >> >
>> >> > The question I have is this: besides the test cases, I would like
>> >> > to test it against an actual Elasticsearch server to be absolutely
>> >> > sure of the fix.  What's the easiest/good way to run an
>> >> > Elasticsearch instance behind a proxy and test the modified
>> >> > component?
>> >> >
>> >> > Thanks in advance.
>> >> >
>> >> > Cheers,
>> >> > Arun
>> >>
>>
>


Re: NIFI-4198 - *ElasticsearchHttp processors do not expose Proxy settings

2017-08-12 Thread Arun Manivannan
Absolutely, I will.

Is uploading the steps as a Word document to the JIRA an appropriate way? I
understand that these steps are specific to a set of JIRAs and the nifi dev
guide may or may not be a good home for this. (I can look up how to do that
if this is the case).

Thanks again, Joe.
Cheers
Arun



Re: NIFI-4198 - *ElasticsearchHttp processors do not expose Proxy settings

2017-08-12 Thread Arun Manivannan
Great.  Thanks a lot, Joe.  Appreciate your help.

Will get it up and running before I assign the issue to myself.

Regards,
Arun



NIFI-4198 - *ElasticsearchHttp processors do not expose Proxy settings

2017-08-12 Thread Arun Manivannan
Hi,

Very Good morning.

I would like to make an attempt at resolving NIFI-4198
<https://issues.apache.org/jira/browse/NIFI-4198>.  Looking at the code, I
would think that introducing the proxy URL and authentication properties and
delegating them to the OkHttpClient would be a good way to do it.

https://stackoverflow.com/questions/35554380/okhttpclient-proxy-authentication-how-to
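
As a rough sketch of what "delegating to the OkHttpClient" could involve, the
JDK-only example below builds the two pieces the client would need: a
java.net.Proxy for the configured host and port, and the Basic credential
string for the Proxy-Authorization header. The class and method names
(ProxySettingsSketch, httpProxy, basicCredentials) and the host, port, and
credentials are assumptions made up for this illustration, not code from the
processor. In OkHttp itself these would be handed to
OkHttpClient.Builder#proxy and to an Authenticator registered via
OkHttpClient.Builder#proxyAuthenticator; OkHttp is deliberately left out so
the sketch runs without extra dependencies.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch, not NiFi code: turn proxy host/port/user/password
// settings into the objects an HTTP client needs.
class ProxySettingsSketch {

    // Builds an HTTP proxy description without resolving the host name yet.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    // Builds the value of a Proxy-Authorization header for Basic auth:
    // "Basic " followed by base64("user:password"). ISO-8859-1 is assumed
    // here as the encoding of the credentials.
    static String basicCredentials(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.ISO_8859_1);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        // Made-up values standing in for the processor's property values.
        Proxy proxy = httpProxy("proxy.example.com", 3128);
        String header = basicCredentials("user", "pass");

        // With OkHttp, the proxy would go to the builder's proxy(...) call and
        // the header would be set as "Proxy-Authorization" on the request
        // returned by the proxy authenticator.
        System.out.println(proxy.type() + " " + proxy.address());
        System.out.println("Proxy-Authorization: " + header);
    }
}
```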

The question I have is this: besides the test cases, I would like to test it
against an actual Elasticsearch server to be absolutely sure of the fix.
What's the easiest/good way to run an Elasticsearch instance behind a proxy
and test the modified component?

Thanks in advance.

Cheers,
Arun


Assigning JIRAs to myself

2017-08-11 Thread Arun Manivannan
Hi,

Good morning.

I would like to try to solve some beginner level issues.

Have read the developer's guide and the contributor guide.

How can I assign an issue to myself on the JIRA?  Great work on the
"beginner" tag.  Thanks!


Cheers,
Arun