For bulk transfers, please also consider introducing a random delay in the range 0 to x, specifiable in seconds or minutes, so that the oozie/hadoop jobs are not all fired up at the same time. I thought this was already in Falcon, but it is nice to see it being considered. BTW, I think the term "back port" is generally used for making a feature available in older or earlier releases :-)
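
To make the random delay idea concrete, here is a rough sketch of what I have in mind. The class and method names (ReplicationJitter, randomDelayMillis) are made up for illustration and are not existing Falcon or Oozie code; how the upper bound x would be exposed in the feed definition is left open.

```java
import java.util.Random;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Falcon: picks a random start delay
// so that bulk replication jobs do not all start at the same instant.
final class ReplicationJitter {
    private ReplicationJitter() {}

    /**
     * Returns a delay in milliseconds drawn uniformly from [0, maxDelay],
     * where maxDelay is given in the supplied unit (e.g. seconds or minutes).
     */
    static long randomDelayMillis(long maxDelay, TimeUnit unit, Random random) {
        if (maxDelay < 0) {
            throw new IllegalArgumentException("maxDelay must be >= 0");
        }
        long maxMillis = unit.toMillis(maxDelay);
        if (maxMillis == 0) {
            return 0L; // no jitter requested
        }
        // nextDouble() is in [0.0, 1.0); scaling by (maxMillis + 1) makes the
        // upper bound reachable, and Math.min guards against rounding for
        // very large bounds.
        long delay = (long) (random.nextDouble() * (maxMillis + 1.0));
        return Math.min(maxMillis, delay);
    }
}
```

For example, randomDelayMillis(5, TimeUnit.MINUTES, new Random()) gives a delay of at most five minutes, which each replication instance could wait out before launching distcp.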

Cheers

On Jul 20, 2013 1:37 PM, "Srikanth Sundarrajan (JIRA)" <[email protected]> wrote:

>
>     [
> https://issues.apache.org/jira/browse/FALCON-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13714382#comment-13714382]
>
> Srikanth Sundarrajan commented on FALCON-47:
> --------------------------------------------
>
> {quote}
> Can you please open this jira so we do not lose it in the comments.
> {quote}
>
> FALCON-51 created to track this.
>
> > Falcon Replication should support configurable delays in feed, parallel,
> > timeout and bulk transfer with variable frequency
> > ------------------------------------------------------------------------
> >
> >                 Key: FALCON-47
> >                 URL: https://issues.apache.org/jira/browse/FALCON-47
> >             Project: Falcon
> >          Issue Type: New Feature
> >    Affects Versions: 0.3
> >         Environment: oozie 3.x distcp 2.x (custom)
> >            Reporter: Shaik Idris Ali
> >              Labels: Replication
> >         Attachments: Falcon-47.patch, Falcon-47-v2.patch
> >
> >
> > Falcon Replication should support configurable delays in feed.
> > By default the replication/distcp works without any delay, i.e. as soon
> > as a feed is scheduled, it will start replicating with the current
> > nominal time.
> > 1. We need to support the use case of delayed replication, for example,
> > a feed can be produced with a delay of n hours based on the process
> > generating it, however the replication kicks off immediately and oozie
> > goes into waiting state and might time out.
> > 2. We need to support parallel/concurrency for feed replication by
> > capturing it from properties, user may want to run distcp in parallel
> > to backfill data on another cluster.
> > 3. timeout can also be set as a special property.
> > All these can be back-ported from Ivory release.
> > 4. Need to support replication with a frequency different from the feed
> > frequency, ex: a feed can have hours(1) as frequency, by default a
> > scheduled feed will distcp every hour, however users should be able to
> > set a larger frequency like hours(6) which bulk replicates 6 hours of
> > data in one distcp operation.
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
