Raghav Kumar Gautam created FALCON-321:
------------------------------------------
Summary: Feed evictor deleting more stuff than it should
Key: FALCON-321
URL: https://issues.apache.org/jira/browse/FALCON-321
Project: Falcon
Issue Type: Bug
Reporter: Raghav Kumar Gautam
In FeedEvictor.java we have:
<code: java>
private void deleteParentIfEmpty(FileSystem fs, Path parent, Path feedBasePath)
    throws IOException {
    if (feedBasePath.equals(parent)) {
        LOG.info("Not deleting feed base path:" + parent);
    } else {
        if (fs.getContentSummary(parent).getFileCount() == 0) {
            LOG.info("Parent path: " + parent + " is empty, deleting path");
            if (fs.delete(parent, true)) {
                LOG.info("Deleted empty dir: " + parent);
            } else {
                throw new IOException("Unable to delete parent path:" + parent);
            }
            deleteParentIfEmpty(fs, parent.getParent(), feedBasePath);
        }
    }
}
</code>
The check fs.getContentSummary(parent).getFileCount() == 0 counts only files.
If the parent contains no files but still contains directories, the check
passes and we delete the parent directory together with those directories,
which is incorrect.
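The difference between "no files" and "no children" can be reproduced on a
local filesystem. The sketch below is only an illustration, not a proposed
patch: it uses java.nio.file as a stand-in for the Hadoop FileSystem API, and
the class name, method names, and directory layout are all hypothetical. In
FeedEvictor itself, a comparable check could presumably be built on
fs.listStatus(parent), but that is an assumption and has not been tested here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper class; local-filesystem stand-in for the HDFS calls.
class EmptyParentCheck {

    // Rough analogue of getContentSummary(dir).getFileCount(): counts regular
    // files anywhere under dir and ignores directories entirely.
    static long recursiveFileCount(Path dir) {
        try (Stream<Path> tree = Files.walk(dir)) {
            return tree.filter(Files::isRegularFile).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stricter check: true only if dir has no children at all,
    // neither files nor subdirectories.
    static boolean hasNoChildren(Path dir) {
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            return !children.iterator().hasNext();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds an assumed layout: <tmp>/2014/01/21 (empty, as if its only
    // instance had just been evicted) and <tmp>/2014/01/22/00 (a retained
    // instance directory that happens to contain no files).
    static Path buildSampleTree() {
        try {
            Path base = Files.createTempDirectory("retention");
            Files.createDirectories(base.resolve("2014/01/21"));
            Files.createDirectories(base.resolve("2014/01/22/00"));
            return base;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path month = buildSampleTree().resolve("2014/01");
        // The file-count check says "empty" even though 21 and 22 exist.
        System.out.println("file count  = " + recursiveFileCount(month)); // 0
        // The child-based check correctly reports that month is not empty.
        System.out.println("no children = " + hasNoChildren(month));      // false
    }
}
```

With a child-based check, only 2014/01/21 would qualify for deletion in this
layout; 2014/01 and 2014 would be kept because 2014/01/22 is still present.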
Here is the log from falcon-regression's RetentionTest.testRetention
(parameters: hours, 24, true, daily):
<quote>
2014-02-24 15:09:45,034 INFO [main] org.apache.falcon.retention.FeedEvictor:
Applying retention on
DATA=hdfs://raghav5-falcon-5.cs1cloud.internal:8020/retention/testFolders/${YEAR}/${MONTH}/${DAY}/${HOUR}#META=hdfs://raghav5-falcon-5.cs1cloud.internal:8020/projects/ivory/clicksMetaData#STATS=hdfs://raghav5-falcon-5.cs1cloud.internal:8020/projects/ivory/clicksStats#TMP=/tmp
type: instance, Limit: hours(24), timezone: UTC, frequency: hours,
storageFILESYSTEM
2014-02-24 15:09:45,051 INFO [main] org.apache.falcon.retention.FeedEvictor:
Normalized path : /retention/testFolders/${YEAR}/${MONTH}/${DAY}/${HOUR}
2014-02-24 15:09:45,123 INFO [main] org.apache.falcon.retention.FeedEvictor:
Searching for /retention/testFolders/*/*/*/*
2014-02-24 15:09:45,486 INFO [main] org.apache.falcon.retention.FeedEvictor:
Deleted instance :/retention/testFolders/2014/01/21/00
2014-02-24 15:09:45,500 INFO [main] org.apache.falcon.retention.FeedEvictor:
Parent path: /retention/testFolders/2014/01/21 is empty, deleting path
2014-02-24 15:09:45,509 INFO [main] org.apache.falcon.retention.FeedEvictor:
Deleted empty dir: /retention/testFolders/2014/01/21
2014-02-24 15:09:45,511 INFO [main] org.apache.falcon.retention.FeedEvictor:
Parent path: /retention/testFolders/2014/01 is empty, deleting path
2014-02-24 15:09:45,517 INFO [main] org.apache.falcon.retention.FeedEvictor:
Deleted empty dir: /retention/testFolders/2014/01
2014-02-24 15:09:45,518 INFO [main] org.apache.falcon.retention.FeedEvictor:
Parent path: /retention/testFolders/2014 is empty, deleting path
2014-02-24 15:09:45,525 INFO [main] org.apache.falcon.retention.FeedEvictor:
Deleted empty dir: /retention/testFolders/2014
2014-02-24 15:09:45,526 INFO [main] org.apache.falcon.retention.FeedEvictor:
Not deleting feed base path:/retention/testFolders
</quote>