Re: Facing Issue while connecting with HDFS

2015-12-10 Thread digvijayp
Hi Bryan,
So in the edge node approach, how is data sent over site-to-site? Does it use
a protocol such as FTP or SFTP for the transfer?
You said that if both clusters can fully talk to each other then you
don't need this edge node approach; you could just have a NiFi instance, or
cluster, that pulls from one HDFS and pushes to the other.
So my question is: we have to use the FetchHDFS/GetHDFS processor, which gets
data from HDFS to the local machine, and the PutHDFS processor, which loads
data from the local machine to HDFS. I don't want to use the local machine in
between. So how can we transfer the data without using the local machine?
Where can we configure this in NiFi?

Thanks in advance.

Digvijay P.



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/Facing-Issue-while-connecting-with-HDFS-tp5684p5712.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: Asynchronous JMS Consumer for IBM MQ

2015-12-10 Thread ianwork
Bryan/Aldrin, adding yielding to my processor prevented the number of tasks
from rapidly increasing. Thanks!

Aldrin, I would like to dig a little more into the details. My application
is basically set up to process logs like Logstash. The application reads
and parses a high volume of logs. The listener is based on ListenSyslog.
The listener forwards the logs to various processors, which run regexes on
batches of logs. Increasing the number of regex processors reduces
performance, so I'd like to determine how I can configure the system
resources.

I'm still struggling with run duration. Does setting run duration mean that
when a thread is allocated to the onTrigger method of that processor, it will
run for at most the run duration? If the method executes faster than
the run duration, will onTrigger be called again if there is work to be
done?


Is the event driven mode something to consider for my type of processor?
What use cases was it designed to satisfy, and is there any documentation
on it?




--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/Asynchronous-JMS-Consumer-for-IBM-MQ-tp3919p5715.html


Re: Regarding Nifi packaging and deployment

2015-12-10 Thread Joe Witt
Shweta,

The primary mechanism is flow templates [1].

They do have some important limitations today, though, that you'll want
to understand.  First, some properties are sensitive, like passwords,
and thus are not included in the templates, so you'll have to reenter
them when you apply the template in the new environment.  Second,
there are at times properties that you'd want to have different values
in different environments.  We need to provide a property/env variable
mapping mechanism.  We intend to address both of these, but neither is
actively being worked on at present, as far as I'm aware.
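
To make the two limitations concrete, here is a rough Python sketch. It is
not NiFi's template format or API: the dict layout, the property names, the
SENSITIVE_KEYS set, and both helper functions are made up for illustration.
It only shows that sensitive values are dropped on export and that
environment-specific values have to be supplied again when the template is
applied.

```python
# Illustrative only: a "template" is modelled as a plain dict of properties.
# This is NOT how NiFi stores or exports templates.

SENSITIVE_KEYS = {"Password"}  # hypothetical set of sensitive property names


def export_template(processor_properties):
    """Return a copy of the properties with sensitive values left out."""
    return {
        key: value
        for key, value in processor_properties.items()
        if key not in SENSITIVE_KEYS
    }


def apply_template(template, environment_values):
    """Fill a template with the values that differ per environment."""
    properties = dict(template)
    properties.update(environment_values)
    return properties


# Hypothetical processor configuration in the development environment.
dev_properties = {
    "Hostname": "dev-db.example.com",
    "Username": "nifi",
    "Password": "dev-secret",
}

template = export_template(dev_properties)

# In the test environment the password must be reentered and the
# hostname overridden by hand.
test_properties = apply_template(
    template,
    {"Hostname": "test-db.example.com", "Password": "test-secret"},
)

print(template)
print(test_properties)
```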

[1] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#templates

Thanks
Joe

On Thu, Dec 10, 2015 at 9:25 PM, shweta  wrote:
> Hi all,
>
> I'm new to NiFi. I have created some sample flows. I want to know how I can
> package and deploy them from the development environment to the testing
> environment, or whether I need to recreate the entire data flow in each
> environment.
>
> Thanks,
> Shweta


Re: Asynchronous JMS Consumer for IBM MQ

2015-12-10 Thread Joe Witt
Ian,

With run duration, the idea is that the processor will be allowed to
keep executing for that period of time and the framework will keep
giving it the same process session.  For the developer, this means they
get to keep their logic very simple and discrete to a single operation,
while the framework takes care of batching those operations
together as one for up to 'X secs' of run duration.
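
The batching behaviour described above can be sketched as a small
simulation. This is Python, not NiFi code, and it is only a model under
stated assumptions: run_batch, the simulated clock, and the fixed cost per
call are all hypothetical. It shows a single-operation on_trigger being
called repeatedly within one batch until the run duration is used up or
there is no more work.

```python
from collections import deque


def run_batch(queue, on_trigger, run_duration_ms, clock, cost_ms):
    """Call on_trigger once per queued item within a single batch.

    The batch ends when the queue is empty or when the simulated time
    spent reaches run_duration_ms. Returns the number of calls made.
    """
    start = clock["now_ms"]
    calls = 0
    while queue:
        on_trigger(queue.popleft())   # one simple, discrete operation
        clock["now_ms"] += cost_ms    # simulated time spent in on_trigger
        calls += 1
        if clock["now_ms"] - start >= run_duration_ms:
            break                     # run duration used up, end the batch
    return calls


clock = {"now_ms": 0}       # simulated clock, so the example is deterministic
queue = deque(range(10))    # ten pieces of queued work
processed = []

calls = run_batch(queue, processed.append, run_duration_ms=25,
                  clock=clock, cost_ms=10)
print(calls, len(queue))
```

In this model a run duration of 0 still performs one call per batch, which
is the non-batched case.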

For event driven, the idea is that rather than telling the framework
you want the processor to run every X units of time, as is the case in
timer driven, the framework will execute the processor (give it a
thread and call onTrigger) whenever data is placed into its queue.
It can be more efficient in some cases.
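
A toy comparison of the two scheduling styles, again as a Python simulation
rather than NiFi code. The function names, the integer time ticks, and the
assumption of exactly one invocation per arriving item are simplifications
made for this sketch. It counts how many invocations happen and how many of
them actually find data.

```python
def simulate_timer_driven(arrivals, period, horizon):
    """Return (invocations, invocations_with_data) for a fixed schedule."""
    pending = sorted(arrivals)
    invocations = 0
    with_data = 0
    for now in range(0, horizon, period):
        invocations += 1
        ready = [t for t in pending if t <= now]
        if ready:
            with_data += 1
            # Everything that has arrived so far is consumed in this run.
            pending = [t for t in pending if t > now]
    return invocations, with_data


def simulate_event_driven(arrivals, horizon):
    """Return (invocations, invocations_with_data) when each arrival
    within the horizon triggers exactly one run."""
    hits = [t for t in arrivals if t < horizon]
    return len(hits), len(hits)


arrivals = [3, 4, 17]   # ticks at which data lands in the queue
print(simulate_timer_driven(arrivals, period=5, horizon=20))
print(simulate_event_driven(arrivals, horizon=20))
```

With these inputs the timer-driven schedule runs at ticks 0, 5, 10, and 15,
and only the run at tick 5 finds data, while the event-driven model runs
only when something arrives.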

Thanks
Joe

On Thu, Dec 10, 2015 at 8:59 PM, ianwork  wrote:
> Bryan/Aldrin, adding yielding to my processor prevented the number of tasks
> from rapidly increasing. Thanks!
>
> Aldrin, I would like to dig a little more into the details. My application
> is basically set up to process logs like Logstash. The application reads
> and parses a high volume of logs. The listener is based on ListenSyslog.
> The listener forwards the logs to various processors, which run regexes on
> batches of logs. Increasing the number of regex processors reduces
> performance, so I'd like to determine how I can configure the system
> resources.
>
> I'm still struggling with run duration. Does setting run duration mean that
> when a thread is allocated to the onTrigger method of that processor, it will
> run for at most the run duration? If the method executes faster than
> the run duration, will onTrigger be called again if there is work to be
> done?
>
>
> Is the event driven mode something to consider for my type of processor?
> What use cases was it designed to satisfy, and is there any documentation
> on it?


Regarding Nifi packaging and deployment

2015-12-10 Thread shweta
Hi all,

I'm new to NiFi. I have created some sample flows. I want to know how I can
package and deploy them from the development environment to the testing
environment, or whether I need to recreate the entire data flow in each
environment.

Thanks,
Shweta



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/Regarding-Nifi-packaging-and-deployment-tp5716.html


Questions about the ordering of the FlowFile.

2015-12-10 Thread Paresh Shah
Here’s my use case.
We have an application protocol between the start and end processors in a data 
flow that expects the flow files to arrive in the order they are generated. 
For example:

Start Record Flowfile

End Record Flowfile.

The first processor does the following.

  1.  Generates and transfers the StartRecord flow file.
  2.  Generates data records and transfers them.
  3.  Generates and transfers the EndRecord flow file

The last processor in the data flow does the following.

  1.  Looks for the StartRecord flow file and does its thing.
  2.  Looks for the DataRecord flow file and does its thing.
  3.  Looks for the EndRecord flow file, then updates and cleans up the 
target state.

The first processor is doing multiple transfers on the session object before 
calling commit.

We see that they are being received in random order. As a result we are not 
able to execute the app protocol. We have tried the FirstInFirstOutPrioritizer 
and OldestFlowFilePrioritizer.
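
Why the arrival order matters can be shown with a small state machine. This
is a Python sketch of the protocol as described above, not the actual
processor code: the Receiver and ProtocolError names and the state labels
are hypothetical. In-order delivery completes normally, while a flow file
that arrives in the wrong state cannot be handled.

```python
class ProtocolError(Exception):
    """Raised when a flow file arrives in a state that does not expect it."""


class Receiver:
    """Models the last processor: StartRecord, then DataRecords, then EndRecord."""

    def __init__(self):
        self.state = "WAIT_START"
        self.records = []

    def receive(self, kind, payload=None):
        if self.state == "WAIT_START" and kind == "StartRecord":
            self.state = "IN_STREAM"          # set up the target state
        elif self.state == "IN_STREAM" and kind == "DataRecord":
            self.records.append(payload)      # handle one data record
        elif self.state == "IN_STREAM" and kind == "EndRecord":
            self.state = "DONE"               # update and clean up
        else:
            raise ProtocolError(f"unexpected {kind} in state {self.state}")


# In-order delivery works.
ordered = Receiver()
for kind, payload in [("StartRecord", None), ("DataRecord", "a"),
                      ("DataRecord", "b"), ("EndRecord", None)]:
    ordered.receive(kind, payload)
print(ordered.state, ordered.records)

# A DataRecord that overtakes the StartRecord breaks the protocol.
shuffled = Receiver()
try:
    shuffled.receive("DataRecord", "a")
except ProtocolError as error:
    print("rejected:", error)
```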

We would appreciate any insights into this, as it seems to be a 
blocking issue for us.

Thanks
Paresh




Re: Facing Issue while connecting with HDFS

2015-12-10 Thread Bryan Bende
Site-to-Site is a direct connection between NiFi instances/clusters over a
socket, so TCP based.

There will always have to be at least one local machine involved. When NiFi
pulls/receives data from somewhere, it takes that data under control and
stores it in the NiFi content repository on disk (configured in
nifi.properties). As a FlowFile moves through the flow, a pointer to this
content is being passed around until it needs to be accessed. So when
PutHDFS needs to send to the other cluster, it would read the content and
send it to the other HDFS. The data would then eventually age off from the
NiFi content repository, depending on how it is configured. So it would not
have to hold all of the data on the local machine, but it would always have
some portion of the most recent data that has been moved across.
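
The pointer-passing idea can be sketched in a few lines of Python. This is a
simplified model, not NiFi's implementation: the ContentRepository and
FlowFile classes here are in-memory stand-ins, and the function names and
the "filename" attribute are hypothetical. It shows content being written
once, only a pointer travelling through the flow, the bytes being read when
the final step needs them, and the content aging off afterwards.

```python
class ContentRepository:
    """Stand-in for the on-disk content repository (an in-memory dict here)."""

    def __init__(self):
        self._claims = {}
        self._next_id = 0
        self.reads = 0

    def write(self, data):
        claim_id = self._next_id
        self._next_id += 1
        self._claims[claim_id] = data
        return claim_id

    def read(self, claim_id):
        self.reads += 1
        return self._claims[claim_id]

    def age_off(self, claim_id):
        self._claims.pop(claim_id, None)

    def __len__(self):
        return len(self._claims)


class FlowFile:
    """Carries attributes and a pointer (claim id) to the content, not the bytes."""

    def __init__(self, claim_id, attributes=None):
        self.claim_id = claim_id
        self.attributes = dict(attributes or {})


def fetch_from_source(repo, data):
    # Pulling data in: the content lands in the repository once.
    return FlowFile(repo.write(data), {"filename": "part-0000"})


def route(flow_file):
    # Intermediate steps only touch attributes; the content is not read.
    flow_file.attributes["routed"] = "true"
    return flow_file


def put_to_target(repo, flow_file, target):
    # Only here is the content actually read and sent on.
    target.append(repo.read(flow_file.claim_id))


repo = ContentRepository()
target_hdfs = []  # stand-in for the destination cluster

flow_file = route(fetch_from_source(repo, b"hello"))
reads_before_put = repo.reads
put_to_target(repo, flow_file, target_hdfs)
repo.age_off(flow_file.claim_id)

print(reads_before_put, repo.reads, target_hdfs, len(repo))
```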

Let us know if this doesn't make sense.

-Bryan




On Thu, Dec 10, 2015 at 1:52 AM, digvijayp 
wrote:

> Hi Bryan,
> So in the edge node approach, how is data sent over site-to-site? Does it use
> a protocol such as FTP or SFTP for the transfer?
> You said that if both clusters can fully talk to each other then you
> don't need this edge node approach; you could just have a NiFi instance, or
> cluster, that pulls from one HDFS and pushes to the other.
> So my question is: we have to use the FetchHDFS/GetHDFS processor, which gets
> data from HDFS to the local machine, and the PutHDFS processor, which loads
> data from the local machine to HDFS. I don't want to use the local machine in
> between. So how can we transfer the data without using the local machine?
> Where can we configure this in NiFi?
>
> Thanks in advance.
>
> Digvijay P.


Re: GetHTTP processor not working

2015-12-10 Thread Joe Percivall
Hello Shweta,

I think there is a combination of things going on. The error you're probably 
seeing first is "Illegal character in fragment at index 239". This is due to 
the "{" and "}" in your URL. They both need to be URL encoded to %7B and %7D 
respectively. The URL you should be using is below [1].
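
One way to produce the encoded form programmatically is sketched below in
Python using the standard library's urllib.parse.quote. The helper name
encode_braces and the example.com URL are made up for illustration; the
`safe` argument lists the characters that are left untouched, and it
includes "%" so that sequences that are already encoded, such as %20, are
not encoded a second time.

```python
from urllib.parse import quote

# Characters to leave as-is: URL delimiters plus '%' so existing
# percent-escapes (for example %20) are not double-encoded.
SAFE_CHARS = ":/?#[]@!$&'()*+,;=%-._~"


def encode_braces(url):
    """Percent-encode characters such as '{' and '}' that are not in SAFE_CHARS."""
    return quote(url, safe=SAFE_CHARS)


# Hypothetical URL with the same kind of problem as the one reported.
raw = "http://example.com/start.aspx#/Shared%20Documents?id={107FFCED-34CD}"
print(encode_braces(raw))
# prints http://example.com/start.aspx#/Shared%20Documents?id=%7B107FFCED-34CD%7D
```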


Second, are you sure the website is running? I tried to reach it and my 
connection timed out, specifically with "Read timed out".

[1] 
http://unify.impetus.co.in/BigData/_layouts/15/start.aspx#/Shared%20Documents/Forms/AllItems.aspx?RootFolder=%2FBigData%2FShared%20Documents%2F3%20CU%2FSkillset%20Analyzer%2FResumes=0x012000D7E70BB8AE01E840A767ECB4D05AC5ED=%7B107FFCED-34CD-4354-B0D3-422058A26150%7D

 
Joe

- - - - - - 
Joseph Percivall
linkedin.com/in/Percivall
e: joeperciv...@yahoo.com




On Thursday, December 10, 2015 7:18 AM, shweta  wrote:



I have the following URL:

http://unify.impetus.co.in/BigData/_layouts/15/start.aspx#/Shared%20Documents/Forms/AllItems.aspx?RootFolder=%2FBigData%2FShared%20Documents%2F3%20CU%2FSkillset%20Analyzer%2FResumes=0x012000D7E70BB8AE01E840A767ECB4D05AC5ED={107FFCED-34CD-4354-B0D3-422058A26150}

I'm trying to fetch some files from the above URL using the GetHTTP
processor, but it fails. I have tried the decoded URL as well, with no luck.

Can anyone please advise how to go about this?




--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/GetHTTP-processor-not-working-tp5711.html