Hi Arvind,

You mentioned, “You may be better off using something like an Oozie action to 
trigger a job when the dataset is complete.” But how will I know that Flume has 
completed its transfer? (Moreover, we want this to happen at regular intervals.)

Thanks,
Manohar

From: Arvind Prabhakar [mailto:[email protected]]
Sent: Monday, December 8, 2014 11:42 AM
To: [email protected]
Subject: Re: Notification support from flume?

Flume is not suited for file transfers as such. With that said, please see my 
comments below:

- support for variable transaction size that could be set by the source or 
interceptor

The transactions are already variable sized. The only configuration that 
applies on top is the maximum size of a transaction. How is this different from 
what you are proposing?
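
To illustrate the point about a maximum transaction size, here is a minimal 
sketch in plain Python. It is not Flume code: the names `batches` and 
`max_size` are hypothetical, and it only shows the general idea that batches 
vary in size up to a configured cap.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batches(events: Iterable[str], max_size: int) -> Iterator[List[str]]:
    """Yield lists of at most max_size events; the last one may be smaller.

    Hypothetical helper for illustration only, not part of Flume.
    """
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    it = iter(events)
    while True:
        # Take up to max_size events; fewer if the input runs out.
        batch = list(islice(it, max_size))
        if not batch:
            return
        yield batch
```

With five events and a cap of two, this yields batches of two, two, and one 
event, so the size already varies below the cap without further configuration.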


 - SpoolDir to support creation of one transaction per file

If the file is large, you would run out of heap space quickly. Also, how do you 
recover from intermittent failures?

 - File and Memory channels to support spawning a process on successful 
transaction commit. Such a process could be a bash script, but that would be 
implemented in a pluggable class

You may be better off using something like an Oozie action to trigger a job 
when the dataset is complete.
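
One way an external job might decide that a dataset "looks complete" is a 
heuristic check on the output directory. The sketch below is an assumption, 
not something Flume or Oozie provides: it assumes the output directory is 
reachable as a local path (for example through a mount), that in-progress 
files carry the HDFS sink's default in-use suffix `.tmp`, and that a quiet 
period without modifications is an acceptable signal. The function name 
`looks_complete` and its parameters are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional


def looks_complete(directory: Path, quiet_seconds: float,
                   in_use_suffix: str = ".tmp",
                   now: Optional[float] = None) -> bool:
    """Heuristic only: True if the directory has files, none of them is
    marked in-progress, and none was modified within quiet_seconds."""
    now = time.time() if now is None else now
    files = [p for p in directory.iterdir() if p.is_file()]
    if not files:
        return False  # nothing has arrived yet
    if any(p.name.endswith(in_use_suffix) for p in files):
        return False  # at least one file is still being written
    newest = max(p.stat().st_mtime for p in files)
    return now - newest >= quiet_seconds
```

A scheduler could poll this at the regular interval and start the downstream 
job once it returns True. It remains a heuristic: a long pause in the input 
would look the same as completion.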

Regards,
Arvind







On Sun, Dec 7, 2014 at 12:55 PM, Ahmed Vila <[email protected]> wrote:
Hi group,

Manohar's requirements sound valid. I guess there are other cases where such a 
"completion notification" could come in handy.

Thus, I would propose these distinct features that would make this possible via 
configuration:
 - support for variable transaction size that could be set by the source or 
interceptor
 - SpoolDir to support creation of one transaction per file
 - File and Memory channels to support spawning a process on successful 
transaction commit. Such a process could be a bash script, but that would be 
implemented in a pluggable class

The one thing I'm not sure about, until I look at the code, is whether HDFSSink 
will flush its cache to HDFS once it encounters no more events in a 
transaction.

What do you guys think?


On Sat, Dec 6, 2014 at 7:31 AM, Manohar CS <[email protected]> wrote:

Thanks Hari for your response.



My requirement goes like this -



1) There are a bunch of files coming into my spoolDir at regular intervals 
(hourly or daily).

2) I want them to be moved into HDFS via the HDFS sink, using a path pattern 
like /target/%Y-%M%D so that each day's files go to a different HDFS destination.

3) Once Flume completes copying the files, I want to kick off my MR job.



Thanks,

Manohar

________________________________
From: Hari Shreedharan <[email protected]>
Sent: Saturday, December 6, 2014 7:16 AM
To: [email protected]
Cc: [email protected]
Subject: Re: Notification support from flume?

Looking at .COMPLETED is not an indication that the data has been written out 
to HDFS. As of now, unfortunately, there is no way to tag an event as coming 
from a specific file. I can’t think of a foolproof way to do this off the top 
of my head. What is your use case? There might be another way to do the same 
thing.
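
For reference, the check being discussed can be sketched as below. It assumes 
the spooling directory source's default suffix `.COMPLETED`; the function 
name `pending_files` is hypothetical. As noted above, an empty result only 
means the source has consumed every file, not that the data has reached HDFS.

```python
from pathlib import Path
from typing import List


def pending_files(spool_dir: Path,
                  completed_suffix: str = ".COMPLETED") -> List[str]:
    """Return the names of regular files in spool_dir that have not yet
    been renamed with completed_suffix (a necessary condition only)."""
    return sorted(
        p.name for p in spool_dir.iterdir()
        if p.is_file() and not p.name.endswith(completed_suffix)
    )
```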

Thanks,
Hari


On Fri, Dec 5, 2014 at 4:19 AM, Manohar CS <[email protected]> wrote:
Hi All,


I wanted to know if there is a notification mechanism, or some other way of 
finding out whether Flume has finished transferring a certain file from 
spoolDir to the HDFS sink. We know that by looking at the .COMPLETED files in 
spoolDir we can assume it's completed, but we wanted to know if there is a more 
reliable callback mechanism.




Thanks,
Manohar.


Please consider the environment before printing this e-mail

Disclaimer: This  communication  is  for the exclusive use of the intended 
recipient(s) and  shall  not attach any liability on the originator or ITC 
Infotech India Ltd./its  Holding company/ its Subsidiaries/ its Group 
Companies. If you are the addressee, the contents of this e-mail are intended 
for your use only and it shall  not be forwarded to any third party, without 
first obtaining written authorization from the originator or ITC Infotech India 
Ltd./ its Holding company/its  Subsidiaries/ its Group Companies. It may 
contain information which is confidential and legally privileged and the same 
shall not be used or dealt with  by any  third  party  in  any manner 
whatsoever without the specific consent  of  ITC  Infotech India Ltd./ its 
Holding company/ its Subsidiaries/ its Group Companies.








--

Best regards,
Ahmed Vila | Senior software developer
DevLogic | Sarajevo | Bosnia and Herzegovina

Office : +387 33 942 123
Mobile: +387 62 139 348

Website: http://www.devlogic.eu
E-mail   : [email protected]
---------------------------------------------------------------------
This e-mail and any attachment is for authorised use by the intended 
recipient(s) only. This email contains confidential information. It should not 
be copied, disclosed to, retained or used by, any party other than the intended 
recipient. Any unauthorised distribution, dissemination or copying of this 
E-mail or its attachments, and/or any use of any information contained in them, 
is strictly prohibited and may be illegal. If you are not an intended recipient 
then please promptly delete this e-mail and any attachment and all copies and 
inform the sender directly via email. Any emails that you send to us may be 
monitored by systems or persons other than the named communicant for the 
purposes of ascertaining whether the communication complies with the law and 
company policies.


