> On Oct. 28, 2015, 11:46 p.m., Sowmya Ramesh wrote:
> > common/src/main/java/org/apache/falcon/entity/FeedHelper.java, line 813
> > <https://reviews.apache.org/r/39711/diff/4/?file=1111629#file1111629line813>
> >
> >     Sorry for the multiple comments. I didn't review the Lifecycle feature, so I
> >     didn't have the complete picture.
> >
> >     Frequency in the retention stage is not mandatory, and if the frequency
> >     is not set by the user then
> >     1> If feed frequency < 6 hrs, it is set to 6 hrs
> >     2> If it is > 6 hrs, it is set to the feed frequency
> >
> >     Shouldn't it fall back to the current retention behavior, i.e.
> >     < 6 hrs set to 6 hrs and > 6 hrs set to 1 day?
> >
> >     This is required for 2 reasons:
> >     1> The current understanding of users is that if feed frequency > 6 hrs,
> >     the retention job will run every day. We shouldn't deviate from this.
> >
> >     2> I also spoke with Venkatesh about why it was set to 1 day. He
> >     mentioned that in case retention fails and the reruns fail too, we don't want
> >     to keep the data until the next run if feed frequency is used. This can cause
> >     an SEC retention violation and also cause memory issues if the feed frequency
> >     is, say, one year. If the job runs every day, it catches up for the scenario
> >     mentioned above.
> >
> >     Any specific reason to change the old behavior?
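As a rough illustration of the two defaulting rules compared in the comment above, here is a minimal sketch. It is not the actual FeedHelper code: the class and method names are hypothetical, frequencies are modelled as plain `java.time.Duration` values rather than Falcon's own frequency type, and the handling of a feed frequency of exactly 6 hours is an assumption, since the comment only covers the `<` and `>` cases.

```java
import java.time.Duration;

/** Hypothetical sketch of the defaulting rules under discussion; not part of Falcon. */
final class RetentionFrequencySketch {
    static final Duration SIX_HOURS = Duration.ofHours(6);
    static final Duration ONE_DAY = Duration.ofDays(1);

    private RetentionFrequencySketch() { }

    /**
     * Lifecycle rule as described in the review: below 6 hours use 6 hours,
     * otherwise follow the feed frequency.
     */
    static Duration lifecycleDefault(Duration feedFrequency) {
        return feedFrequency.compareTo(SIX_HOURS) < 0 ? SIX_HOURS : feedFrequency;
    }

    /**
     * Old rule as described in the review: below 6 hours use 6 hours,
     * otherwise run daily. Assumption: exactly 6 hours falls in the daily branch.
     */
    static Duration legacyDefault(Duration feedFrequency) {
        return feedFrequency.compareTo(SIX_HOURS) < 0 ? SIX_HOURS : ONE_DAY;
    }
}
```

For a monthly feed (approximated here as 30 days), `lifecycleDefault` gives 30 days while `legacyDefault` gives 1 day; for a 5-minute feed both give 6 hours.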
Sowmya and I had an offline discussion to address this. Updating the gist here.

We tried to fall back to the old behaviour as much as possible, but it fails the extra validations in lifecycle retention. The current behaviour retains the old behaviour as far as possible within the new constraints (specifically, retention shouldn't be more frequent than data availability).

Keeping retention frequency as a fallback to retries is not the best thing to do in such scenarios. If it fails all retries, there is no guarantee that it will succeed the next time either. It means the system is not able to recover on its own and needs manual intervention. The best way to deal with such scenarios is to have appropriate monitoring and alerting (e.g. users can now have email alerts on failure of the retention workflow).

That kind of setup also fails for a majority of frequencies: minutely, hourly and daily feeds (everything apart from roll-ups like monthly) do not get the above guarantee by the reasoning mentioned. So the guarantee is already broken, if it was ever the intent.

Also, the above behaviour wastes resources 99% of the time to solve for that rare 1% case. Coordinators will run and they will have nothing to do.


- Ajay


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39711/#review104372
-----------------------------------------------------------


On Oct. 28, 2015, 6:04 p.m., Ajay Yadava wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39711/
> -----------------------------------------------------------
> 
> (Updated Oct. 28, 2015, 6:04 p.m.)
> 
> 
> Review request for Falcon.
> 
> 
> Bugs: FALCON-1560
>     https://issues.apache.org/jira/browse/FALCON-1560
> 
> 
> Repository: falcon-git
> 
> 
> Description
> -------
> 
> Lifecycle does not allow feed with frequency greater than days(1)
> 
> 
> Diffs
> -----
> 
>   common/src/main/java/org/apache/falcon/entity/FeedHelper.java 5c252a8
>   common/src/test/java/org/apache/falcon/entity/FeedHelperTest.java 4020d36
>   common/src/test/java/org/apache/falcon/entity/parser/FeedEntityParserTest.java 905be68
> 
> Diff: https://reviews.apache.org/r/39711/diff/
> 
> 
> Testing
> -------
> 
> Added unit test for the scenarios.
> 
> 
> Thanks,
> 
> Ajay Yadava
> 
>
