[ 
https://issues.apache.org/jira/browse/FALCON-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13913400#comment-13913400
 ] 

Raghav Kumar Gautam edited comment on FALCON-321 at 2/26/14 7:47 PM:
---------------------------------------------------------------------

Feed definition:
{code:xml}
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="inPath-4b6119d5-7f99-4e19-8bcf-5da94d7afb19" description="clicks log"
      xmlns="uri:falcon:feed:0.1">
    <frequency>hours(1)</frequency>
    <timezone>UTC</timezone>
    <late-arrival cut-off="hours(6)"/>
    <clusters>
        <cluster name="corp-92289f2b-bb08-42f4-a488-806cf40c19f7" type="source">
            <validity start="2012-01-30T00:00Z" end="2099-03-31T23:59Z"/>
            <retention limit="hours(24)" action="delete"/>
        </cluster>
    </clusters>
    <locations>
        <location type="data"
                  path="/tmp/falcon-regression/RetentionTest/testFolders/${YEAR}/${MONTH}/${DAY}/${HOUR}"/>
        <location type="stats" path="/projects/ivory/clicksStats"/>
        <location type="meta" path="/projects/ivory/clicksMetaData"/>
    </locations>
    <ACL owner="testuser" group="group" permission="0x755"/>
    <schema location="/schema/clicks" provider="protobuf"/>
    <properties/>
</feed>
{code}

List of directories. None of these directories contains any files.
{noformat}
/tmp/falcon-regression/RetentionTest/testFolders/2014/01/21/00
/tmp/falcon-regression/RetentionTest/testFolders/2014/01/21/05
/tmp/falcon-regression/RetentionTest/testFolders/2014/01/21/10
/tmp/falcon-regression/RetentionTest/testFolders/2014/01/21/15
/tmp/falcon-regression/RetentionTest/testFolders/2014/01/21/20
...
/tmp/falcon-regression/RetentionTest/testFolders/2014/04/03/15
/tmp/falcon-regression/RetentionTest/testFolders/2014/04/03/20
{noformat}

The Oozie job was killed, and the job log had the following error:
{noformat}
Failing Oozie Launcher, Main class [org.apache.falcon.retention.FeedEvictor], main() threw exception, Unable to delete instance: /tmp/falcon-regression/RetentionTest/testFolders/2014/01/21/05
java.io.IOException: Unable to delete instance: /tmp/falcon-regression/RetentionTest/testFolders/2014/01/21/05
        at org.apache.falcon.retention.FeedEvictor.deleteInstance(FeedEvictor.java:321)
        at org.apache.falcon.retention.FeedEvictor.fileSystemEvictor(FeedEvictor.java:174)
        at org.apache.falcon.retention.FeedEvictor.evictFS(FeedEvictor.java:149)
        at org.apache.falcon.retention.FeedEvictor.evict(FeedEvictor.java:139)
        at org.apache.falcon.retention.FeedEvictor.run(FeedEvictor.java:121)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.falcon.retention.FeedEvictor.main(FeedEvictor.java:93)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
...
{noformat}

The log also contained the following:
{noformat}
org.apache.falcon.retention.FeedEvictor: Applying retention on DATA=hdfs://raghav5-falcon-5.cs1cloud.internal:8020/tmp/falcon-regression/RetentionTest/testFolders/${YEAR}/${MONTH}/${DAY}/${HOUR}#META=hdfs://raghav5-falcon-5.cs1cloud.internal:8020/projects/ivory/clicksMetaData#STATS=hdfs://raghav5-falcon-5.cs1cloud.internal:8020/projects/ivory/clicksStats#TMP=/tmp type: instance, Limit: hours(24), timezone: UTC, frequency: hours, storageFILESYSTEM
org.apache.falcon.retention.FeedEvictor: Normalized path : /tmp/falcon-regression/RetentionTest/testFolders/${YEAR}/${MONTH}/${DAY}/${HOUR}
org.apache.falcon.retention.FeedEvictor: Searching for /tmp/falcon-regression/RetentionTest/testFolders/*/*/*/*
org.apache.falcon.retention.FeedEvictor: Deleted instance :/tmp/falcon-regression/RetentionTest/testFolders/2014/01/21/00
org.apache.falcon.retention.FeedEvictor: Parent path: /tmp/falcon-regression/RetentionTest/testFolders/2014/01/21 is empty, deleting path
org.apache.falcon.retention.FeedEvictor: Deleted empty dir: /tmp/falcon-regression/RetentionTest/testFolders/2014/01/21
org.apache.falcon.retention.FeedEvictor: Parent path: /tmp/falcon-regression/RetentionTest/testFolders/2014/01 is empty, deleting path
org.apache.falcon.retention.FeedEvictor: Deleted empty dir: /tmp/falcon-regression/RetentionTest/testFolders/2014/01
org.apache.falcon.retention.FeedEvictor: Parent path: /tmp/falcon-regression/RetentionTest/testFolders/2014 is empty, deleting path
org.apache.falcon.retention.FeedEvictor: Deleted empty dir: /tmp/falcon-regression/RetentionTest/testFolders/2014
org.apache.falcon.retention.FeedEvictor: Not deleting feed base path:/tmp/falcon-regression/RetentionTest/testFolders
{noformat}


> Feed evictor deleting more stuff than it should
> -----------------------------------------------
>
>                 Key: FALCON-321
>                 URL: https://issues.apache.org/jira/browse/FALCON-321
>             Project: Falcon
>          Issue Type: Bug
>            Reporter: Raghav Kumar Gautam
>            Priority: Blocker
>              Labels: system-tests
>
> In FeedEvictor.java we have:
> {code:java}
> private void deleteParentIfEmpty(FileSystem fs, Path parent, Path feedBasePath) throws IOException {
>         if (feedBasePath.equals(parent)) {
>             LOG.info("Not deleting feed base path:" + parent);
>         } else {
>             if (fs.getContentSummary(parent).getFileCount() == 0) {
>                 LOG.info("Parent path: " + parent + " is empty, deleting path");
>                 if (fs.delete(parent, true)) {
>                     LOG.info("Deleted empty dir: " + parent);
>                 } else {
>                     throw new IOException("Unable to delete parent path:" + parent);
>                 }
>                 deleteParentIfEmpty(fs, parent.getParent(), feedBasePath);
>             }
>         }
>     }
> {code}
> The fs.getContentSummary(parent).getFileCount() check counts only files. If 
> the parent contains no files but still contains subdirectories, the check 
> passes and we delete the parent directory recursively, which is incorrect.
> Here is the log from falcon-regression's RetentionTest.testRetention 
> (parameters: hours, 24, true, daily):
> {noformat}
> 2014-02-24 15:09:45,034 INFO [main] org.apache.falcon.retention.FeedEvictor: Applying retention on DATA=hdfs://raghav5-falcon-5.cs1cloud.internal:8020/retention/testFolders/${YEAR}/${MONTH}/${DAY}/${HOUR}#META=hdfs://raghav5-falcon-5.cs1cloud.internal:8020/projects/ivory/clicksMetaData#STATS=hdfs://raghav5-falcon-5.cs1cloud.internal:8020/projects/ivory/clicksStats#TMP=/tmp type: instance, Limit: hours(24), timezone: UTC, frequency: hours, storageFILESYSTEM
> 2014-02-24 15:09:45,051 INFO [main] org.apache.falcon.retention.FeedEvictor: Normalized path : /retention/testFolders/${YEAR}/${MONTH}/${DAY}/${HOUR}
> 2014-02-24 15:09:45,123 INFO [main] org.apache.falcon.retention.FeedEvictor: Searching for /retention/testFolders/*/*/*/*
> 2014-02-24 15:09:45,486 INFO [main] org.apache.falcon.retention.FeedEvictor: Deleted instance :/retention/testFolders/2014/01/21/00
> 2014-02-24 15:09:45,500 INFO [main] org.apache.falcon.retention.FeedEvictor: Parent path: /retention/testFolders/2014/01/21 is empty, deleting path
> 2014-02-24 15:09:45,509 INFO [main] org.apache.falcon.retention.FeedEvictor: Deleted empty dir: /retention/testFolders/2014/01/21
> 2014-02-24 15:09:45,511 INFO [main] org.apache.falcon.retention.FeedEvictor: Parent path: /retention/testFolders/2014/01 is empty, deleting path
> 2014-02-24 15:09:45,517 INFO [main] org.apache.falcon.retention.FeedEvictor: Deleted empty dir: /retention/testFolders/2014/01
> 2014-02-24 15:09:45,518 INFO [main] org.apache.falcon.retention.FeedEvictor: Parent path: /retention/testFolders/2014 is empty, deleting path
> 2014-02-24 15:09:45,525 INFO [main] org.apache.falcon.retention.FeedEvictor: Deleted empty dir: /retention/testFolders/2014
> 2014-02-24 15:09:45,526 INFO [main] org.apache.falcon.retention.FeedEvictor: Not deleting feed base path:/retention/testFolders
> {noformat}
> Stacktrace:
> {noformat}
> Failing Oozie Launcher, Main class [org.apache.falcon.retention.FeedEvictor], main() threw exception, Unable to delete instance: /retention/testFolders/2014/01/21/03
> java.io.IOException: Unable to delete instance: /retention/testFolders/2014/01/21/03
>       at org.apache.falcon.retention.FeedEvictor.deleteInstance(FeedEvictor.java:321)
>       at org.apache.falcon.retention.FeedEvictor.fileSystemEvictor(FeedEvictor.java:174)
>       at org.apache.falcon.retention.FeedEvictor.evictFS(FeedEvictor.java:149)
>       at org.apache.falcon.retention.FeedEvictor.evict(FeedEvictor.java:139)
>       at org.apache.falcon.retention.FeedEvictor.run(FeedEvictor.java:121)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.falcon.retention.FeedEvictor.main(FeedEvictor.java:93)
> {noformat}
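
For illustration only, here is a minimal sketch of the kind of check the description above asks for: a parent is removed only when it has no children at all, neither files nor subdirectories. It uses java.nio.file on a local file system rather than Hadoop's FileSystem so that it runs standalone. The class name EmptyParentCleanup and the helper hasNoChildren are hypothetical and not part of Falcon. On HDFS a comparable test might be fs.listStatus(parent).length == 0, but that is an assumption about one possible fix, not the change that was committed.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration, not Falcon code: local-filesystem analogue of deleteParentIfEmpty.
final class EmptyParentCleanup {

    /** Returns true only if dir has no entries at all, neither files nor subdirectories. */
    static boolean hasNoChildren(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }

    /** Walks upward from parent, deleting truly empty directories and stopping at feedBasePath. */
    static void deleteParentIfEmpty(Path parent, Path feedBasePath) throws IOException {
        if (parent == null || feedBasePath.equals(parent)) {
            return; // never delete the feed base path
        }
        if (hasNoChildren(parent)) {
            Files.delete(parent); // non-recursive: throws if the directory is not empty
            deleteParentIfEmpty(parent.getParent(), feedBasePath);
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("testFolders");
        Path day = base.resolve("2014").resolve("01").resolve("21");
        Files.createDirectories(day.resolve("00"));
        Files.createDirectories(day.resolve("05"));

        Files.delete(day.resolve("00"));   // evict one instance
        deleteParentIfEmpty(day, base);    // 05 is still present, so the day directory is kept
        System.out.println("05 still exists: " + Files.exists(day.resolve("05")));
    }
}
```

The sketch deliberately uses a non-recursive delete, so even if the emptiness check were wrong, a directory that still holds sibling instances could not be removed silently.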



