Re: How to send back a reply from HandleHTTPRequest

2017-02-08 Thread James McMahon
This is very helpful. Thank you very much Pierre. -Jim

On Wed, Feb 8, 2017 at 2:44 AM, Pierre Villard 
wrote:

> James,
>
> If you always want to return 200 as HTTP response whatever the request is,
> then you could have:
> HandleHttpRequest directly linked to HandleHttpResponse with property
> status code to 200, and HandleHttpRequest linked to your flow to perform
> the expected tasks.
>
> Now let's say that you want to return a response based on the request,
> then you can set processors between the HandleHttpRequest and the
> HandleHttpResponse and then use the expression language to set the response
> code to the value contained by an attribute of the flow file based on the
> previous processors. Note that the content of the flow file will be used as
> content of the HTTP response.
>
> One example could be to expose a web service to retrieve files from HDFS
> (a bit like WebHDFS): the user sends a request with the path of the file to
> retrieve, then you link HandleHttpRequest to a FetchHDFS processor, then if
> the file exists (success relationship) you can link to a HandleHttpResponse
> with code 200 and the user will be able to get the file from HDFS, and if
> the file does not exist (or if there is any kind of issue, failure
> relationship), you can link the FetchHDFS to a HandleHttpResponse with an
> error code 4xx. Obviously you could add some complexity to your workflow:
> you could also allow users to send data to HDFS, etc, etc.
>
> Hope this clarifies things a bit.
>
> -Pierre
>
>
> 2017-02-07 23:51 GMT+01:00 James McMahon :
>
>> Can the response be a standard http code that is automatically returned
>> by the HandleHTTPResponse processor? Also, is it the Handle processor that
>> is determining the correct response and sending it back, or a different
>> processor? I guess I am still confused because I did not see any provision
>> in the configuration for HandleHTTPResponse that sends back an HTTP status.
>> -Jim
>>
>> On Tue, Feb 7, 2017 at 4:08 PM, Aldrin Piri  wrote:
>>
>>> Hi James,
>>>
>>> This would occur via the HandleHTTPResponse.  At a high level, the
>>> request flowfile is routed to the HandleHTTPResponse processor.  Of course,
>>> any kind of processing could occur between the two points.
>>>
>>> A simple example can be located on the sample templates page with the
>>> Hello_NiFi_Web_Service.xml [1].  In this case, it is performing a very
>>> simple replacement of text that is returned to the caller.
>>>
>>> Please let us know if you have any additional questions.
>>>
>>> --aldrin
>>>
>>> [1] https://cwiki.apache.org/confluence/display/NIFI/Example+Dataflow+Templates
>>>
>>> On Tue, Feb 7, 2017 at 4:04 PM, James McMahon 
>>> wrote:
>>>
 Good evening. I have a number of customer applications that will be
 posting content to a NiFi HandleHTTPRequest processor. These apps need an
 http reply so that they know status. I can't find in the configuration any
 way to do that. I know that I must be overlooking what must be a
 commonplace requirement. Can anyone tell me how they configured their
 workflow so that it provides a response and status to the request? Thanks
 very much. -Jim

>>>
>>>
>>
>
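The request/response routing described in this thread can be sketched outside NiFi with a plain HTTP server: one branch plays the "success" relationship (HandleHttpResponse with status 200) and the other the "failure" relationship (a 4xx status). The file map, paths, and port below are invented for illustration; this only models the routing Pierre describes, not NiFi's actual implementation.

```python
import http.server
import threading
import urllib.error
import urllib.request

# Hypothetical stand-in for the HDFS-backed store in Pierre's example.
FILES = {"/files/report.txt": b"hello from hdfs"}

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = FILES.get(self.path)
        if body is not None:
            # "success" relationship -> HandleHttpResponse with status 200
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # "failure" relationship -> HandleHttpResponse with a 4xx status
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

def fetch(path):
    """Return (status, body) for a GET against the sketch server."""
    try:
        with urllib.request.urlopen(f"http://127.0.0.1:{port}{path}") as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, b""

print(fetch("/files/report.txt"))   # (200, b'hello from hdfs')
print(fetch("/files/missing.txt"))  # (404, b'')
```

In NiFi itself, the status code would come from the HandleHttpResponse "HTTP Status Code" property, which can read an attribute via the expression language as described above.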


Re: Flowfile handling in C# is possible or not?

2017-02-08 Thread Matt Burgess
Prabhu,

There are a couple of ways I can think of for NiFi to be able to
communicate with an external application:

1) The InvokeHttp processor [1] can send the flow file content as the
payload and any number of flow file attributes as HTTP headers (you
can specify a regular expression for which attributes to send), so
your application could expose an HTTP endpoint and NiFi could point at
that.

2) NiFi instances can connect to each other (or any other application)
using the Site-to-Site protocol [2], there are both raw (socket-based)
and HTTP-based versions. This gives a more robust solution but
involves writing a Site-to-Site (S2S) client in C#/.NET for use in
your application. Koji has a great page on the wiki [3] describing
this (especially the HTTP(S) version) in detail.

If instead you are asking if you could have something like a custom
processor written in C#/.NET, that might be possible but NiFi runs on
the JVM so you'd need to cross-compile [4] or have a bridge [5]
perhaps.

Regards,
Matt

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.InvokeHTTP/index.html
[2] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site
[3] https://cwiki.apache.org/confluence/display/NIFI/Support+HTTP%28S%29+as+a+transport+mechanism+for+Site-to-Site
[4] http://xmlvm.org/clr2jvm/
[5] http://jni4net.com/
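As a rough sketch of the first option: InvokeHttp takes a regular expression that selects which flow file attributes are sent as HTTP headers. The attribute names below are made up; the point is only the full-name regex match, approximating the processor's "Attributes to Send" property.

```python
import re

def attributes_to_headers(attributes, pattern):
    # Keep only attributes whose full name matches the regex,
    # approximating InvokeHttp's "Attributes to Send" property.
    regex = re.compile(pattern)
    return {name: value for name, value in attributes.items()
            if regex.fullmatch(name)}

attrs = {
    "filename": "data.csv",
    "uuid": "0c2a-example",
    "my.app.user": "prabhu",
    "my.app.token": "abc123",
}
headers = attributes_to_headers(attrs, r"my\.app\..*")
print(headers)  # {'my.app.user': 'prabhu', 'my.app.token': 'abc123'}
```

A C#/.NET application would then read these as ordinary request headers on its HTTP endpoint, with the flow file content as the request body.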


On Tue, Feb 7, 2017 at 11:27 PM, prabhu Mahendran
 wrote:
> Since processor code is written in Java, which handles flow file reads and
> writes along with relationship transfers.
>
> Is it possible to handle flow files in a .NET application?
>
> Many thanks,
> prabhu


FINAL REMINDER: CFP for ApacheCon closes February 11th

2017-02-08 Thread Rich Bowen
Dear Apache Enthusiast,

This is your FINAL reminder that the Call for Papers (CFP) for ApacheCon
Miami is closing this weekend - February 11th. This is your final
opportunity to submit a talk for consideration at this event.

This year, we are running several mini conferences in conjunction with
the main event, so if you're submitting for one of those events, please
pay attention to the instructions below.

Apache: Big Data
* Event information:
http://events.linuxfoundation.org/events/apache-big-data-north-america
* CFP:
http://events.linuxfoundation.org/events/apache-big-data-north-america/program/cfp

Apache: IoT (Internet of Things)
* Event Information: http://us.apacheiot.org/
* CFP -
http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp
(Indicate 'IoT' in the Target Audience field)

CloudStack Collaboration Conference
* Event information: http://us.cloudstackcollab.org/
* CFP -
http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp
(Indicate 'CloudStack' in the Target Audience field)

FlexJS Summit
* Event information - http://us.apacheflexjs.org/
* CFP -
http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp
(Indicate 'Flex' in the Target Audience field)

TomcatCon
* Event information - https://tomcat.apache.org/conference.html
* CFP -
http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp
(Indicate 'Tomcat' in the Target Audience field)

All other topics and projects
* Event information -
http://events.linuxfoundation.org/events/apachecon-north-america/program/about
* CFP -
http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp

Admission to any of these events also grants you access to all of the
others.

Thanks, and we look forward to seeing you in Miami!

-- 
Rich Bowen
VP Conferences, Apache Software Foundation
rbo...@apache.org
Twitter: @apachecon



(You are receiving this email because you are subscribed to a dev@ or
users@ list of some Apache Software Foundation project. If you do not
wish to receive email from these lists any more, you must follow that
list's unsubscription procedure. View the headers of this message for
unsubscription instructions.)


RE: GetTwitter - Security/Certificate Issue

2017-02-08 Thread Dan Giannone
Hi Aldrin,

This was with the original package. Also, I’ve attached the verbose output.

Thanks,

Dan

From: Aldrin Piri [mailto:aldrinp...@gmail.com]
Sent: Tuesday, February 07, 2017 5:18 PM
To: users@nifi.apache.org
Subject: Re: GetTwitter - Security/Certificate Issue

Hi Dan,

Was this with an updated ca-certificates package or the original one listed 
when this conversation started?

Should have asked for this initially, but could you also please provide the 
output with verbose logging for the curl command?

curl -v https://stream.twitter.com/

--Aldrin

On Tue, Feb 7, 2017 at 4:39 PM, Dan Giannone wrote:
Hi Aldrin,

Here is a screenshot of the result. Looks like there is definitely an issue. 
Please let me know if this sheds any light on the issue.

Thanks,

Dan

From: Aldrin Piri [mailto:aldrinp...@gmail.com]
Sent: Monday, February 06, 2017 9:07 PM

To: users@nifi.apache.org
Subject: Re: GetTwitter - Security/Certificate Issue

Hi Dan,

Just as a quick diagnostic, are you able to curl https://stream.twitter.com/?  
This will report being unauthorized, but will at least confirm that the 
network connectivity with the associated endpoint used by the processor has 
appropriate access.  I have seen in certain environments that network 
proxies/filters can attempt to intervene in such requests causing similar 
errors to manifest.

Please let us know your results.

--aldrin

On Fri, Feb 3, 2017 at 8:32 AM, Dan Giannone wrote:
Hi Aldrin,

The machine in question is a linux server that we use as our ‘sandbox’ to try 
new things (nifi in this case), so I can definitely upgrade the yum package. As 
for your second question, the server runs on my company’s network, but other 
than that I don’t see any other considerations. Any thoughts?

-Dan

From: Aldrin Piri [mailto:aldrinp...@gmail.com]
Sent: Wednesday, February 01, 2017 5:05 PM

To: users@nifi.apache.org
Subject: Re: GetTwitter - Security/Certificate Issue

Hi Dan,

I did a bit of poking around and was not able to find that exact RPM version, 
but could not recreate the issue with the CA certs from similar RPMs.  As a quick 
check, is upgrading the mentioned yum package a possibility on the system?

Are there any intervening network considerations or is the machine in question 
directly accessing the internet?

On Wed, Feb 1, 2017 at 12:35 PM, Dan Giannone wrote:
Hi Aldrin,

The version of jdk being used is 1.8. The details of the packages are attached 
in the PNG files. Please let me know if you need any additional info to help 
diagnose the issue!

Thanks,

Dan


From: Aldrin Piri [mailto:aldrinp...@gmail.com]
Sent: Tuesday, January 31, 2017 2:20 PM
To: users@nifi.apache.org
Subject: Re: GetTwitter - Security/Certificate Issue

Hi Dan,

The GetTwitter processor does not make use of an Apache NiFi SSLContextService 
so the certificate chain issues are likely more tied to the JVM/OS 
specifically.  Did a quick check on some of the instances I am running and 
Twitter seems to be operating normally.

Could you share some more details about your environment, specifically JRE 
being used?  If you are running a Linux variant, is your ca-certificates 
package (Yum based: ca-certificates, Aptitude based: 
ca-certificates/ca-certificates-java) up to date?  If so, what version is the 
package (Yum based: yum info ca-certificates, Aptitude based: apt-cache showpkg 
ca-certificates)?

Thanks,
Aldrin


On Tue, Jan 31, 2017 at 1:28 PM, Andy LoPresto wrote:
Hi Dan,

Yes, currently your processor is saying that it receives a certificate 
identifying https://www.twitter.com (or whatever the actual URL is) but it 
cannot build a complete chain between the presented certificate and a known 
CA/trusted certificate. This is because by default, NiFi doesn’t know any 
trusted certificates.

You can configure a StandardSSLContextService in Controller Services which 
points the *truststore file* to $JRE_HOME/lib/security/cacerts (for example, on
my Mac, it is
/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/security/cacerts),
and set the *truststore type* to JKS and the *truststore password* to
“changeit”.

There is an existing Jira discussing adding this by default [1], but there are 
pros and cons to that decision.

[1] 
https://issues.apache.org/jira/browse/NIFI-1477?jql=text%20~%20%22truststore%22%20AND%20project%20%3D%20%22Apache%20NiFi%22

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

send contents of web page to a remote nifi instance

2017-02-08 Thread mohammed shambakey
Hi

I'm sorry if the question is silly, but it is giving me a hard time. We
have a web page that contains some inputs (e.g., userid and other
parameters) and I want to send these parameters to a remote NiFi instance.

I think I should use the "HandleHttpRequest" processor at the remote instance,
but I'm not sure how the web page can specify the address of the
"HandleHttpRequest" processor at the remote NiFi site (e.g., we have the IP
address of the remote NiFi instance, but how do we specify the address of the
"HandleHttpRequest" processor)?

I've seen some examples in the NiFi docs about "HandleHttpRequest", but they
don't cover sending from a web page to a NiFi instance. I wonder if there are
other examples for this case?

Regards

-- 
Mohammed


Re: send contents of web page to a remote nifi instance

2017-02-08 Thread Matt Burgess
Mohammed,

HandleHttpRequest [1] allows you to specify the listening port as well
as Allowed Paths. Using the hostname/IP of the NiFi instance, along
with the Listening Port and Allowed Paths, creates an endpoint to
which you can issue HTTP commands (GET, PUT, POST -- all can be
allowed or denied via the processor properties). I think under the
hood the processor spawns Jetty with the configured properties to
accept the request(s).

So for a hostname of "nifi.mydomain.com", with a listening port of
8989 and an Allowed Path of /sendParameters, you could POST to
http://nifi.mydomain.com:8989/sendParameters and the (running)
HandleHttpRequest processor would accept it. Check the documentation
and example Hello_NiFi_Web_Service [2] for usage patterns, such as
using a downstream HandleHttpResponse processor in order to return a
response from the request.

Regards,
Matt

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.HandleHttpRequest/index.html
[2] https://cwiki.apache.org/confluence/display/NIFI/Example+Dataflow+Templates
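Matt's hostname + Listening Port + Allowed Path recipe can be sketched as URL construction plus a form-encoded body. The host, port, path, and field names below are the hypothetical ones from his example; a real page would POST this body from a form or script.

```python
from urllib.parse import urlencode, urlunsplit

def handle_http_request_endpoint(host, port, allowed_path):
    # The endpoint is simply http://<nifi-host>:<listening-port><allowed-path>
    return urlunsplit(("http", f"{host}:{port}", allowed_path, "", ""))

url = handle_http_request_endpoint("nifi.mydomain.com", 8989, "/sendParameters")
print(url)  # http://nifi.mydomain.com:8989/sendParameters

# Form-style body a web page could POST to that endpoint (field names invented):
body = urlencode({"userid": "jdoe", "param1": "value1"}).encode()
print(body)  # b'userid=jdoe&param1=value1'
```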



Re: send contents of web page to a remote nifi instance

2017-02-08 Thread mohammed shambakey
Thanks Matt




-- 
Mohammed


ControlRate across cluster

2017-02-08 Thread Nick Carenza
I have been running a standalone instance of Nifi and am preparing a move
into a cluster configuration. One aspect I am curious about is how
ControlRate is going to operate with n nodes. I am using control rate to
satisfy rate-limit requirements for external services.

My flow looks something like:

...
> ControlRate count 5000/sec
> ControlRate data   5MB/sec
> PutKinesisFirehose batch 500, buffer 4MB


I am trying to figure out how to throttle when I add a second node which
will be running the same flow.

ControlRate might already run on the primary node only. I noticed in the code
it has the @TriggerSerially annotation, which it shares with ListS3 and
ListSFTP, isolated processors that only run on the primary node. I
don't know exactly what defines a processor as isolated though. If
ControlRate is not isolated, one option would be to make it (optionally)
so. The description doesn't explicitly say if it is or not and I couldn't
find anything related to isolated processors in the developer guide. Only
the admin-guide seems to use that terminology. Does anyone have some
insight on this?

I could divide the count and data rates on each ControlRate to
rate-limit/node-count. With batching though I think they might be able to
exceed the rate limit of a given stream unless I also divided batch sizes.
This option seems not great because I don't want to have to update
properties when adding/removing nodes.
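The division described above can be sketched as follows. This is only the naive split under the stated assumption of evenly distributed load; it does not address the batching overshoot or the need to update properties when the node count changes.

```python
def per_node_limits(count_rate, data_rate_bytes, node_count):
    # Naive split of a cluster-wide rate limit across identical nodes.
    # Assumes evenly distributed load, which NiFi does not guarantee.
    if node_count < 1:
        raise ValueError("node_count must be >= 1")
    return count_rate // node_count, data_rate_bytes // node_count

# 5000 flow files/sec and 5 MB/sec split across 2 nodes:
print(per_node_limits(5000, 5 * 1024 * 1024, 2))  # (2500, 2621440)
```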

https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#clustering
http://docs.aws.amazon.com/firehose/latest/dev/limits.html

Thanks,
Nick


ConsumeKafka processor erroring when held up by full queue

2017-02-08 Thread Nick Carenza
Hey team, I have a ConsumeKafka_0_10 running which normally operates
without problems. I had a queue back up due to a downstream processor and I
started getting these bulletins.

01:16:01 UTC WARNING a46d13dd-3231-1bff-1a99-1eaf5f37e1d2
ConsumeKafka_0_10[id=a46d13dd-3231-1bff-1a99-1eaf5f37e1d2] Duplicates are
likely as we were able to commit the process session but received an
exception from Kafka while committing offsets.

01:16:01 UTC ERROR a46d13dd-3231-1bff-1a99-1eaf5f37e1d2
ConsumeKafka_0_10[id=a46d13dd-3231-1bff-1a99-1eaf5f37e1d2] Exception while
interacting with Kafka so will close the lease
org.apache.nifi.processors.kafka.pubsub.ConsumerPool$SimpleConsumerLease@87d2ac1
due to org.apache.kafka.clients.consumer.CommitFailedException: Commit
cannot be completed since the group has already rebalanced and assigned the
partitions to another member. This means that the time between subsequent
calls to poll() was longer than the configured session.timeout.ms, which
typically implies that the poll loop is spending too much time message
processing. You can address this either by increasing the session timeout
or by reducing the maximum size of batches returned in poll() with
max.poll.records.

My max.poll.records is set to 1 on my consumer and session.timeout.ms
is the default 1 on the server.
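The condition in the error message can be modeled roughly: the time between poll() calls is the batch processing time plus any stall while blocked on a full downstream queue, and a rebalance is triggered when that exceeds session.timeout.ms. All numbers below are illustrative only, not the (truncated) values from this thread.

```python
def rebalance_risk(max_poll_records, per_record_ms, stall_ms, session_timeout_ms):
    # Rough model of the error condition: if processing one poll() batch
    # plus any stall on a full downstream queue exceeds session.timeout.ms,
    # the group coordinator rebalances the partitions away.
    gap_ms = max_poll_records * per_record_ms + stall_ms
    return gap_ms > session_timeout_ms

print(rebalance_risk(500, 2, 0, 10000))      # False: healthy flow keeps polling
print(rebalance_risk(500, 2, 60000, 10000))  # True: a backed-up queue stalls the loop
```

This matches the suspicion below: with backpressure the stall term dominates, so the timeout is exceeded even with a small max.poll.records.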

Since there is no such thing as coincidences, I believe this has to do with
it not being able to push received messages to the downstream queue.

If my flow is backed up, I would expect the ConsumeKafka processor not to throw
errors but to continue heartbeating with the Kafka server and resume consuming
once it can commit to the downstream queue.

Might I have the server or consumer misconfigured to handle this scenario
or should the consumer not be throwing this error?

Thanks,
- Nick


Custom Processor with Hive database connection pooling service not working

2017-02-08 Thread cool nijandhan
Hi

I am developing a custom processor with a Hive database connection pooling
service. I added the necessary dependency in the pom file and the necessary
folders in my custom processor folder. I created the class file and am able to
generate the nar file. I placed the nar file in the lib directory and restarted
the NiFi server. Everything looks like it is working fine, but the services are
not showing in the database connection dropdown box. Instead it shows only the
connection id, and a new service is created every time I click "create" in the
controller section. For other processors, it seems to work fine. Please find
the attached screenshots.

Any help appreciated.

Thanks


Re: Problem when using backpressure to distribute load over nodes in a cluster

2017-02-08 Thread Koji Kawamura
Hi Bas,

Sorry for the late reply.

Thanks for the clarification; I over-simplified the flow. As you
experienced, NiFi back pressure is handled per relationship, and as
long as a relationship has room to receive new flow files, the source
processor is scheduled to run.

I don't think there's an existing solution to block the source
processor as you desired.

However, I found a possible improvement to achieve that.
NiFi scheduler checks downstream relationship availability, when it's
full, the processor won't be scheduled to run.
In case a source processor has multiple outgoing relationships, and if
ANY of those is full, the processor won't be scheduled.

(This is how processor scheduling works with back-pressure, but can
alter with @TriggerWhenAnyDestinationAvailable annotation.
DistributeLoad is the only processor annotated with this)
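The scheduling rule in that parenthetical can be written as a toy model, where the flags stand in for per-relationship back-pressure state. This is only a sketch of the described behavior, not NiFi's scheduler code.

```python
def should_schedule(destination_full_flags, trigger_when_any_available=False):
    # Default rule: do not schedule the processor if ANY outgoing
    # relationship has hit back pressure. With the
    # @TriggerWhenAnyDestinationAvailable behavior (DistributeLoad),
    # it is enough that ANY destination still has room.
    if trigger_when_any_available:
        return not all(destination_full_flags)
    return not any(destination_full_flags)

full_flags = [False, True]  # second relationship is full
print(should_schedule(full_flags))        # False: one full queue blocks the source
print(should_schedule(full_flags, True))  # True: DistributeLoad-style scheduling
```

This is exactly the mechanism the GetSQS -> Wait flow below relies on: keeping one outgoing relationship full keeps the source from being scheduled.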

So, I think we could use this mechanism to keep the source processor
waiting to be scheduled, by following flow:

GetSQS
  -- success --> FetchS3Object --> Parse --> Notify
  -- success --> Wait

I propose to improve Wait so that the user can choose how a waiting FlowFile
is handled, from either:
"Route to 'wait' relationship" or "Keep in the Upstream connection".
Currently its only option is to route to 'wait'.

Use "Keep in the Upstream connection" Wait mode with the flow above,
the incoming flow file in GetSQS -> Wait connection stays there until
actual data processing finishes and Notify sends a notification
signal.

I will experiment with this idea and if it works,
I'll submit a JIRA for this and try to add this capability since I've
been working on Wait/Notify processors recently.

Thanks again for sharing your use-case!

Koji


On Tue, Feb 7, 2017 at 6:05 PM, Bas van Kortenhof
 wrote:
> Hi Koji,
>
> Thanks for the quick response. I have set the batch size to 1 indeed, and
> the flow you describe works, but my problem is a bit more complex. I'll try
> to show it with an example:
>
> [inline image omitted from the mailing list archive]
>
> In this case Node 1 is parsing a flow file (indicated by the X in the
> connection between FetchS3Object and Parse). Both connections have a
> backpressure threshold of 1, but because the object is already fetched, the
> first connection is empty and can thus be filled. This means that, if a new
> item becomes available in the queue, both of the following cases can happen
> with equal probability:
>
> [inline image omitted from the mailing list archive]
>
> I'd like to force the second case to happen, because node 2 has more
> resources available.
>
> I hope this explains the situation a bit better. So basically I want the
> backpressure to occur based on a threshold on the whole flow, not an
> individual connection. I haven't found a way to do this up to this point.
>
> Hopefully you have an idea how to achieve this.
>
> Regards,
> Bas
>
>
>
> --
> View this message in context: 
> http://apache-nifi-users-list.2361937.n4.nabble.com/Problem-when-using-backpressure-to-distribute-load-over-nodes-in-a-cluster-tp863p877.html
> Sent from the Apache NiFi Users List mailing list archive at Nabble.com.