-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18626/
-----------------------------------------------------------
Review request for Falcon and Srikanth Sundarrajan.


Summary (updated)
-----------------

FALCON-284: Hcatalog based feed retention doesn't work when partition filter spans across multiple partition keys


Repository: falcon-git


Description (updated)
-------

When an HCatalog based feed is scheduled in Falcon, retention only looks at the first partition key that matches any one of the date patterns: yyyy | MM | dd | HH | mm. As a result, it calculates a partition filter that contains only one of these patterns. However, if the HCatalog table is defined so that the date spans multiple partition keys (year/month/day/hour/minute), feed retention doesn't delete any partitions that are more granular than the first level (year).


Diffs (updated)
-----

  common/src/main/java/org/apache/falcon/catalog/AbstractCatalogService.java fc9c3b1
  common/src/main/java/org/apache/falcon/catalog/HiveCatalogService.java 3c3660e
  common/src/main/java/org/apache/falcon/entity/common/FeedDataPath.java 4031e14
  retention/src/main/java/org/apache/falcon/retention/FeedEvictor.java 13c447c
  webapp/src/test/java/org/apache/falcon/lifecycle/TableStorageFeedEvictorIT.java 770780e

Diff: https://reviews.apache.org/r/18626/diff/


Testing (updated)
-------

- Added new integration tests in TableStorageFeedEvictorIT.java to test retention for an HCatalog feed where the date spans multiple partition columns (year/month/day).
- Verified the retention behavior on a test cluster with an HCatalog based feed partitioned by year/month/day/hour/minute/country.


Thanks,

Satish Mittal
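
For illustration only, here is a minimal sketch of how a retention filter covering several date partition keys could be assembled. It is not the code in this patch. The class name PartitionFilterSketch, the method buildFilter, and the quoted filter syntax are all hypothetical, and the sketch assumes partition values are zero-padded strings so that string comparison matches date order.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical illustration; names and filter syntax are assumptions,
// not taken from FeedEvictor.java.
final class PartitionFilterSketch {

    /**
     * Builds a filter that matches every partition strictly older than the cutoff.
     * keyToPattern maps partition key -> date pattern, ordered coarsest first
     * (for example year -> yyyy, month -> MM, day -> dd).
     */
    static String buildFilter(Map<String, String> keyToPattern, Date cutoff) {
        List<String> clauses = new ArrayList<>();
        // Accumulates "key = 'value' AND " for all coarser keys seen so far.
        StringBuilder equalsPrefix = new StringBuilder();
        for (Map.Entry<String, String> entry : keyToPattern.entrySet()) {
            SimpleDateFormat format = new SimpleDateFormat(entry.getValue());
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            String value = format.format(cutoff);
            // Coarser keys equal the cutoff, this key is strictly smaller.
            clauses.add("(" + equalsPrefix + entry.getKey() + " < '" + value + "')");
            equalsPrefix.append(entry.getKey()).append(" = '").append(value).append("' AND ");
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("year", "yyyy");
        keys.put("month", "MM");
        keys.put("day", "dd");
        // 2014-02-26 12:00:00 UTC, used here as an example retention cutoff.
        System.out.println(buildFilter(keys, new Date(1393416000000L)));
    }
}
```

With a filter built from only the first key (year), partitions from earlier months or days of the cutoff year would never match, which is the behavior the description reports. Comparing key by key, coarsest first, covers those cases.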
