Re: Restricting Controller Service Class Definitions

2017-08-14 Thread Michael Hogue
I had thought about the interface extension, but figured I should probably
field the question first. Thanks much for the prompt feedback. I'll go
forward with the recommended solution.

Thanks!

On Mon, Aug 14, 2017 at 12:12 PM Andy LoPresto  wrote:

> I don’t think this extends to the general case, but in this instance, I
> support Bryan’s first suggestion. The RestrictedSSLContextService interface
> will extend the SSLContextService interface, and the
> StandardRestrictedSSLContextService class can implement that interface and
> extend StandardSSLContextService and just override the methods related to
> getting the supported protocols. I think this is the cleanest solution for
> the issue.
>
> Thanks for doing this, Mike.
>
> Andy LoPresto
> alopre...@apache.org
> alopresto.apa...@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Aug 14, 2017, at 12:09 PM, Bryan Bende  wrote:
>
> Hi Michael,
>
> Generally processors are supposed to only know about an interface for
> the service, and then the framework makes the implementations of that
> interface available. So the processors shouldn't really know which
> specific implementations are available.
>
> I think there are a couple of things that could be done...
>
> You can make a new interface called RestrictedSSLContextService that
> extends SSLContextService, and then have an implementation like
> StandardRestrictedSSLContextService... then make your customized
> ListenHTTP processor rely only on the new interface
> (RestrictedSSLContextService).
>
> Another option might be to do something involving custom validate (as
> you mentioned)... If the SSLContextService interface had a method to
> return a list/array of the supported protocols, then a processor could
> use those values in a custom validate to make sure that a service with
> the required protocols was selected. You are right that they would
> still see all the services in the drop-down, but the processor would
> be invalid and unable to be started until selecting the appropriate
> service, and they'd see the validation message about what was wrong.
>
> A final option, although not something that is currently possible,
> might be to leverage the security model... If you could restrict who
> could create a specific type of component, meaning something like
> "admin has create permissions on StandardSSLContext from
> nifi-standard-nar", then you could set it up so that regular users can
> create your restricted service and only admins (or maybe no one) can
> create the other kind.
>
> -Bryan
>
>


Re: Restricting Controller Service Class Definitions

2017-08-14 Thread Andy LoPresto
I don’t think this extends to the general case, but in this instance, I support 
Bryan’s first suggestion. The RestrictedSSLContextService interface will extend 
the SSLContextService interface, and the StandardRestrictedSSLContextService 
class can implement that interface and extend StandardSSLContextService and 
just override the methods related to getting the supported protocols. I think 
this is the cleanest solution for the issue.
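
A minimal sketch of that arrangement follows. It uses simplified stand-in types rather than the real NiFi interfaces: the getSupportedProtocols() method name, the protocol lists, and the accepts() helper are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class RestrictedServiceSketch {

    // Stand-in for NiFi's SSLContextService; the real interface has many more methods.
    interface SSLContextService {
        List<String> getSupportedProtocols(); // hypothetical method name
    }

    // Narrower marker interface: a processor that asks for this type
    // is only offered implementations of it.
    interface RestrictedSSLContextService extends SSLContextService {
    }

    // Stand-in for the existing general-purpose implementation.
    static class StandardSSLContextService implements SSLContextService {
        @Override
        public List<String> getSupportedProtocols() {
            // Illustrative list only.
            return Arrays.asList("SSLv3", "TLSv1", "TLSv1.1", "TLSv1.2");
        }
    }

    // Reuses the standard implementation and overrides only the protocol list.
    static class StandardRestrictedSSLContextService
            extends StandardSSLContextService
            implements RestrictedSSLContextService {
        @Override
        public List<String> getSupportedProtocols() {
            // Illustrative restricted list only.
            return Arrays.asList("TLSv1.2");
        }
    }

    // Stand-in for a processor property that identifies the restricted interface.
    static boolean accepts(SSLContextService service) {
        return service instanceof RestrictedSSLContextService;
    }

    public static void main(String[] args) {
        System.out.println(accepts(new StandardSSLContextService()));
        System.out.println(accepts(new StandardRestrictedSSLContextService()));
        System.out.println(new StandardRestrictedSSLContextService().getSupportedProtocols());
    }
}
```

The point of the design is that the processor depends only on the narrower interface, so it never needs to reference an implementation class directly.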

Thanks for doing this, Mike.

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69






Restricting Controller Service Class Definitions

2017-08-14 Thread Michael Hogue
All,

   I'm in the process of making some changes to a processor which exposes a
controller service with several implementations. I only want to allow one
particular implementation for the processor, but I've not found a clean way
to do this. The rationale behind wanting to do this can be found
in the conversation on PR #1986 [1]. In short, I've written a
RestrictedSSLContextService that allows only a specific set of SSL
algorithms to be chosen. I want to change ListenHTTP to allow only that
implementation and not the StandardSSLContextService.

   PropertyDescriptor builders have a method
identifiesControllerService(clazz) which allows you to dictate which
interface the controller service must implement. In principle this would
also let me name the explicit implementation class I'd like the processor
to accept. The problem is that doing so necessitates an
additional dependency on a non-API module, which I believe is ill-advised.
It actually results in multiple identical controller service entries when
you go to configure the controller service in the UI, due to NAR service
loading. This is probably a bad thing.

   I've looked across the code base and don't really see an example of
restricting controller service options to specific implementations, for
example when you only want to allow a subset. Adding a validator wouldn't
really work either, since the UI would still allow you to choose a
controller service you don't want to allow. My question to those more
familiar with the codebase is whether there's an obvious way to approach
this or whether significant changes are needed to allow it.

Thanks,
Mike

[1] https://github.com/apache/nifi/pull/1986


Need help in designing the solution.

2017-08-14 Thread Irfan Basha Sheik
Hi,

I have a use case where I read a bunch of rows from a database table,
apply some rules (e.g. amount > 100), add additional columns depending on
the rules that were satisfied, and then store the updated rows in
Elasticsearch.

The approach I have in mind at the moment is:

1. Use the "EvaluateJsonPath" processor to convert "content" into "attributes".
2. Use the "UpdateAttribute" processor to run all the rules (defined using
NiFi's Expression Language (EL)) and add the additional attributes.

*Drawbacks:*

1. If the rows contain data of type "blob" or "clob", the EvaluateJsonPath
processor may run out of memory, as it loads all the content onto the
heap for processing.
2. I am not sure how to write the additional attributes added by the
"UpdateAttribute" processor back to the actual content (or append them to
the original rows).

The "ExecuteScript" processor, with NiFi's EL support, could probably come
to the rescue, but I wanted to know which option would be better. Any
suggestion/help on this will be very much appreciated.
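
To make the use case concrete, here is a rough plain-Java sketch of the per-row transformation. It is not NiFi code; the column names "amount" and "high_value" and the applyRules helper are hypothetical, chosen only to illustrate the "amount > 100" rule.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowRuleSketch {

    // Returns a copy of the row with one extra column recording
    // whether the hypothetical "amount > 100" rule was satisfied.
    static Map<String, Object> applyRules(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>(row);
        Object amount = row.get("amount");
        boolean highValue = amount instanceof Number
                && ((Number) amount).doubleValue() > 100;
        out.put("high_value", highValue);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("amount", 150);
        // The original row is left untouched; the enriched copy is what would be stored.
        System.out.println(applyRules(row));
    }
}
```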

Thanks.


[NiFi-4290] Re: NiFi 1.4: PublishKafkaRecord_0_10: failed to process due to java.lang.NullPointerException

2017-08-14 Thread mayank rathi
Hello Joe,

I have added all the details to NIFI-4290. Please let me know if you need
any other information from my side.

Thanks
Mayank

On Sat, Aug 12, 2017 at 12:00 AM, Joe Witt  wrote:

> Thanks for reporting the issue.  It appears the NPE can occur if zero
> records end up getting published and it fails to look up the schema
> because the message tracker is not yet initialized.
>
> So a couple of things:
> 1) Verify the settings being used to look up the schema.  You can show
> the record reader, writer, and schema registry settings being used for
> that publisher.
> 2) You do not need to SplitAvro before sending it via publish kafka
> record.  Avoiding the split beforehand can result in vastly
> superior performance.
>
> Thanks
> Joe
>
> On Fri, Aug 11, 2017 at 8:03 PM, mayank rathi 
> wrote:
> > Hello Joe,
> >
> > JIRA logged.
> >
> > https://issues.apache.org/jira/browse/NIFI-4290
> >
> > Thanks!!
> >
> > On Fri, Aug 11, 2017 at 9:48 PM, Joe Witt  wrote:
> >
> >> Hello
> >>
> >> Can you please file a jira and attach the logs to the jira.
> >>
> >> Thanks
> >> Joe
> >>
> >> On Aug 11, 2017 6:23 PM, "mayank rathi"  wrote:
> >>
> >> > Attached are the logs after setting processors in Debug mode. Here is the
> >> > flow:
> >> >
> >> > ExecuteSQL --> SplitAvro --> PublishKafkaRecord_0_10
> >> >
> >> > Thanks!!
> >> >
> >> > On Fri, Aug 11, 2017 at 9:01 PM, mayank rathi wrote:
> >> >
> >> >> Hello All,
> >> >>
> >> >> I am moving data to Kafka using NiFi's PublishKafkaRecord processor. I am
> >> >> using the ConfluentSchemaRegistry controller service and getting the error
> >> >> below:
> >> >>
> >> >> 2017-08-11 20:54:25,937 ERROR [Timer-Driven Process Thread-4]
> >> >> o.a.n.p.k.pubsub.PublishKafkaRecord_0_10
> >> >> PublishKafkaRecord_0_10[id=b3c03961-015d-1000-0946-79ccbe2ffbbd]
> >> >> PublishKafkaRecord_0_10[id=b3c03961-015d-1000-0946-79ccbe2ffbbd]
> >> >> failed to process due to
> >> >> java.lang.NullPointerException; rolling back session: {}
> >> >> java.lang.NullPointerException: null
> >> >>
> >> >> I do not see any error on Kafka side.
> >> >>
> >> >> How can I debug and resolve this issue?
> >> >>
> >> >> Thanks!!
> >> >>
> >> >> --
> >> >> NOTICE: This email message is for the sole use of the intended
> >> >> recipient(s) and may contain confidential and privileged information. Any
> >> >> unauthorized review, use, disclosure or distribution is prohibited. If you
> >> >> are not the intended recipient, please contact the sender by reply email
> >> >> and destroy all copies of the original message.
> >> >>
> >> >
> >> >
> >> >
> >> >
> >>
> >
> >
> >
>





Re: how to execute code when processor is stopping

2017-08-14 Thread 尹文才
Thanks Koji, this is exactly what I'm looking for.
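
For anyone finding this thread later, here is a minimal sketch of the pattern as I understand it. It is plain Java with stand-ins rather than real NiFi code: the AtomicBoolean replaces the framework-managed state behind isScheduled(), and stop() and connectWithRetry() are hypothetical helpers for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class RetryUntilStoppedSketch {

    // Stand-in for the framework-managed state that isScheduled() reports in NiFi.
    private final AtomicBoolean scheduled = new AtomicBoolean(true);

    boolean isScheduled() {
        return scheduled.get();
    }

    // Stand-in for the user stopping the processor in the UI.
    void stop() {
        scheduled.set(false);
    }

    // Retries the connection attempt until it succeeds or the processor is stopped.
    // Returns true if connected, false if stopped (or interrupted) first.
    boolean connectWithRetry(BooleanSupplier tryConnect, long backoffMillis) {
        while (isScheduled()) {
            if (tryConnect.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(backoffMillis); // back off so the database is not hammered
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RetryUntilStoppedSketch processor = new RetryUntilStoppedSketch();
        int[] attempts = {0};
        // Simulated database that comes back online on the third attempt.
        boolean connected = processor.connectWithRetry(() -> ++attempts[0] >= 3, 10L);
        System.out.println("connected=" + connected + " after " + attempts[0] + " attempts");
    }
}
```

Because the loop condition is re-checked on every iteration, a stop request ends the retry loop instead of leaving the processor stuck.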

Regards,
Ben

2017-08-14 12:21 GMT+08:00 Koji Kawamura :

> Hi Ben,
>
> AbstractSessionFactoryProcessor has a protected isScheduled() method
> that a processor implementation class can use to check whether
> it is still scheduled (that is, not being stopped).
> For example, ConsumeKafka_0_10.onTrigger uses it in a while loop:
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-kafka-bundle/nifi-kafka-0-10-processors/src/main/java/org/apache/nifi/processors/kafka/pubsub/ConsumeKafka_0_10.java#L316
>
> Thanks,
> Koji
>
> On Mon, Aug 14, 2017 at 11:12 AM, 尹文才  wrote:
> > Hi guys, I have another question about my case. If I implement the retry
> > logic inside the onTrigger method and need to retry until the database
> > connection is back online, what happens when the user needs to stop the
> > processor in the NiFi UI while the database is still offline? As I
> > understand it, onTrigger will keep executing the retry logic and the
> > processor can't be stopped even if the user tries to stop it. Is there
> > any way to solve this problem? Thanks.
> >
> > Regards,
> > Ben
> >
> > 2017-08-12 6:30 GMT+08:00 尹文才 :
> >
> >> Hi Bryan and Matt, thanks for all your suggestions. I was trying to make
> >> sure that the OnUnscheduled method was not called too frequently when the
> >> connection is offline.
> >> You guys were right, this sort of logic should not be placed inside the
> >> scheduling methods. I need to refactor my code to move it into onTrigger.
> >>
> >> Regards,
> >> Ben
> >>
> >> 2017-08-12 0:53 GMT+08:00 Matt Burgess :
> >>
> >>> I'm a fan of Bryan's last suggestion. For dynamic/automatic retry
> >>> (such as database connection retries), I recommend putting the
> >>> connection logic in the onTrigger() method. If you can check
> >>> connectivity, then your onTrigger() would know whether it needs to try
> >>> to reconnect before it does any work. If it tries to reconnect and is
> >>> unsuccessful, you can yield the processor if you want, so as not to
> >>> hammer the DB with connection attempts. The CaptureChangeMySQL
> >>> processor does this: it has a retry loop for trying various nodes in a
> >>> MySQL cluster, but once it's connected, it goes on about its work, and
> >>> if a connection fails, it will retry the connection loop before it
> >>> does any more work. It only uses onTrigger and none of the scheduling
> >>> stuff.
> >>>
> >>> Regards,
> >>> Matt
> >>>
> >>> On Fri, Aug 11, 2017 at 11:06 AM, Bryan Bende wrote:
> >>> > Ben,
> >>> >
> >>> > I apologize if I am not understanding the situation, but...
> >>> >
> >>> > In the case where your OnScheduled code is in a retry loop, if someone
> >>> > stops the processor it will call your OnUnscheduled code which will
> >>> > set the flag to bounce out of the loop. This sounds like what you
> >>> > want, right?
> >>> >
> >>> > In the case where OnScheduled times out, the framework is calling
> >>> > OnUnscheduled which would call your code to set the flag, but wouldn't
> >>> > that not matter at this point because you aren't looping anymore
> >>> > anyway?
> >>> >
> >>> > If the framework calls OnScheduled again, your code should set the
> >>> > flag back to whatever it needs to be to start looping again right?
> >>> >
> >>> > An alternative that might avoid some of this would be to lazily
> >>> > initialize the connection in the onTrigger method of the processor.
> >>> >
> >>> > -Bryan
> >>> >
> >>> >
> >>> > On Fri, Aug 11, 2017 at 9:16 AM, 尹文才  wrote:
> >>> >> Thanks Pierre. My case is that I need to implement database connection
> >>> >> retry logic inside my OnScheduled method: when the database is not
> >>> >> available, I will retry until the connection is back online.
> >>> >> The problem is that when the database is offline, a timed-out
> >>> >> execution exception is thrown inside OnScheduled and OnUnscheduled is
> >>> >> then called. But when I manually stop the processor, OnUnscheduled
> >>> >> will also get called. I know my logic sounds a little weird, but I need to
> >>> >> set some flag in the OnUnscheduled method to stop the retry logic inside
> >>> >> OnScheduled in order to be able to stop the processor;
> >>> >> otherwise the processor cannot be stopped unless I restart the
> >>> >> whole NiFi instance.
> >>> >>
> >>> >> Regards,
> >>> >> Ben
> >>> >>
> >>> >> 2017-08-11 17:18 GMT+08:00 Pierre Villard <pierre.villard...@gmail.com>:
> >>> >>
> >>> >>> Oh OK, get it now!
> >>> >>>
> >>> >>> Not sure what your use case is, but I don't think you can do that unless you
> >>> >>> set some information when the processor actually executes onTrigger for the
> >>> >>> first time and you then check this value in your OnUnscheduled annotated
> >>> >>> method.
> >>> >>>
> >>>