[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei (Eddy) Xu updated HDFS-5546:
    Resolution: Not A Problem
        Status: Resolved  (was: Patch Available)

> race condition crashes "hadoop ls -R" when directories are moved/removed
>
>                 Key: HDFS-5546
>                 URL: https://issues.apache.org/jira/browse/HDFS-5546
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-5546.1.patch, HDFS-5546.2.000.patch,
>                      HDFS-5546.2.001.patch, HDFS-5546.2.002.patch,
>                      HDFS-5546.2.003.patch, HDFS-5546.2.004.patch
>
> This seems to be a rare race condition where we have a sequence of events like this:
> 1. org.apache.hadoop.shell.Ls calls DFS#getFileStatus on directory D.
> 2. someone deletes or moves directory D
> 3. org.apache.hadoop.shell.Ls calls PathData#getDirectoryContents(D), which calls DFS#listStatus(D). This throws FileNotFoundException.
> 4. ls command terminates with FNF

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
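For readers new to the issue, the four-step sequence in the description can be reproduced deterministically on a local filesystem. The following is only a sketch under stated assumptions: it uses java.nio.file rather than Hadoop's FileSystem API, so Files.isDirectory stands in for DFS#getFileStatus, Files.newDirectoryStream stands in for DFS#listStatus, and NoSuchFileException stands in for FileNotFoundException. The class name LsRaceDemo and the method reproduce() are hypothetical and do not come from any attached patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Local-filesystem stand-in for the HDFS sequence; not Hadoop's actual code.
class LsRaceDemo {

    // Returns true if the listing fails with "not found" after the stat succeeded.
    static boolean reproduce() {
        try {
            Path parent = Files.createTempDirectory("lsrace");
            Path d = Files.createDirectory(parent.resolve("D"));
            try {
                // Step 1: stat the directory (stands in for DFS#getFileStatus).
                boolean wasDirectory = Files.isDirectory(d);

                // Step 2: "someone" removes D between the two calls.
                Files.delete(d);

                // Step 3: list the directory (stands in for DFS#listStatus).
                try (DirectoryStream<Path> children = Files.newDirectoryStream(d)) {
                    for (Path child : children) {
                        System.out.println(child);
                    }
                    return false; // listing succeeded, so no race was observed
                } catch (NoSuchFileException e) {
                    // Step 4: without this catch, the whole listing would abort here.
                    return wasDirectory;
                }
            } finally {
                Files.deleteIfExists(parent);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("race reproduced: " + reproduce());
    }
}
```

The point of the sketch is that the stat and the listing are two separate calls, so nothing prevents the directory from disappearing in between; any fix has to decide what the second call does when that happens.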
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer updated HDFS-5546:
    Labels: BB2015-05-TBR  (was: )
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer updated HDFS-5546:
    Fix Version/s:     (was: 3.0.0)
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei (Eddy) Xu updated HDFS-5546:
    Attachment: HDFS-5546.2.004.patch

This patch catches {{IOException}} instead of {{FileNotFoundException}}, following the first patch's logic, as [~daryn] suggested.
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei (Eddy) Xu updated HDFS-5546:
    Attachment: HDFS-5546.2.003.patch

This patch absorbs the FNF exception and, at the end, prints a warning message suggesting that users re-run {{ls}}. The output looks something like the following:

{quote}
drw-rw-rw- - test test 1024 1969-12-31 16:00 mockfs:///test/dir0
drw-rw-rw- - test test 1024 1969-12-31 16:00 mockfs:///test/dir1
drw-rw-rw- - test test 1024 1969-12-31 16:00 mockfs:///test/dir2
drw-rw-rw- - test test 1024 1969-12-31 16:00 mockfs:///test/dir3
drw-rw-rw- - test test 1024 1969-12-31 16:00 mockfs:///test/dir4
drw-rw-rw- - test test 1024 1969-12-31 16:00 mockfs:///test/dir5
drw-rw-rw- - test test 1024 1969-12-31 16:00 mockfs:///test/dir6
drw-rw-rw- - test test 1024 1969-12-31 16:00 mockfs:///test/dir7
drw-rw-rw- - test test 1024 1969-12-31 16:00 mockfs:///test/dir8
drw-rw-rw- - test test 1024 1969-12-31 16:00 mockfs:///test/dir9
-rw-rw-rw- 1 test test 1024 1969-12-31 16:00 mockfs:///other/file
Warning: Files are deleted or renamed during running this command. Suggest to re-run this command.
{quote}

Actually, all {{/test/dir#}} directories have been deleted, but they are printed anyway. The reason is that {{ls}} first prints the current directory and then descends into its sub-directories recursively, and there is no cheap way to test that the current directory still exists before printing its information.

I think in this case we should not directly catch {{IOException}} and call {{displayError(e)}}. The {{ls}} command should tolerate this scenario without generating too much noise.
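The behaviour described for HDFS-5546.2.003.patch (print each entry, absorb the failure when descending, warn once at the end) can be sketched as follows. This is an illustration under assumptions, not the patch itself: it walks a local directory with java.nio.file, treats NoSuchFileException as the analogue of FileNotFoundException, and the names TolerantLs, afterRead, finish() and demo(), as well as the warning wording, are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a recursive ls; not the code from any attached patch.
class TolerantLs {
    final List<String> printed = new ArrayList<>();
    int vanished = 0;
    // Hypothetical hook, run after each successful directory read, used to inject a delete.
    Runnable afterRead = () -> {};

    void list(Path dir) {
        // Snapshot child -> isDirectory, like the statuses returned by one listing call.
        Map<Path, Boolean> children = new TreeMap<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path child : stream) {
                children.put(child, Files.isDirectory(child));
            }
        } catch (NoSuchFileException e) {
            vanished++; // the directory disappeared after its parent listed it
            return;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        afterRead.run();
        for (Map.Entry<Path, Boolean> entry : children.entrySet()) {
            // The entry is recorded as printed first, then descended into.
            printed.add(entry.getKey().getFileName().toString());
            if (entry.getValue()) {
                list(entry.getKey());
            }
        }
    }

    // Emits a single warning instead of one error per missing directory.
    void finish() {
        if (vanished > 0) {
            System.err.println("Warning: " + vanished
                + " directories were deleted or renamed during the listing; consider re-running.");
        }
    }

    // Builds three sub-directories and deletes them all right after the parent is read.
    static TolerantLs demo() {
        try {
            Path root = Files.createTempDirectory("tolerantls");
            for (int i = 0; i < 3; i++) {
                Files.createDirectory(root.resolve("dir" + i));
            }
            TolerantLs ls = new TolerantLs();
            ls.afterRead = () -> {
                for (int i = 0; i < 3; i++) {
                    try {
                        Files.deleteIfExists(root.resolve("dir" + i));
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
            };
            ls.list(root);
            ls.printed.forEach(System.out::println);
            ls.finish();
            Files.deleteIfExists(root);
            return ls;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo();
    }
}
```

As in the quoted output, the deleted directories are still printed, because their entries came from the parent's listing before the delete happened; only the descent into them fails, and that failure is counted rather than reported per directory.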
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei (Eddy) Xu updated HDFS-5546:
    Attachment: HDFS-5546.2.002.patch

Move {{globStatus()}} back to TestLS.
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei (Eddy) Xu updated HDFS-5546:
    Fix Version/s: 3.0.0
           Status: Patch Available  (was: Open)

This patch catches {{FileNotFoundException}} during {{ls}} execution and ignores it, to handle deletions in the sub-namespace. Unit tests are included.

It is a _best effort_ to finish the {{ls}} execution, so it cannot discover new changes to the directory that is currently being iterated. For example, renaming {{/foo/bar}} to {{/foo/zoo}} while running {{ls /foo}} is not handled: {{/foo/bar}} is considered _deleted_, but {{/foo/zoo}} is not visible to the current execution.
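The rename limitation mentioned above follows from iterating over a listing taken once. A minimal sketch, assuming a local filesystem and java.nio.file in place of the Hadoop API (with NoSuchFileException standing in for FileNotFoundException); the class name RenameBlindSpot and the method demo() are hypothetical:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: shows why a rename during iteration looks like a delete.
class RenameBlindSpot {

    // Returns the names of the sub-directories that were actually visited.
    static List<String> demo() {
        try {
            Path foo = Files.createTempDirectory("foo");
            Path bar = Files.createDirectory(foo.resolve("bar"));
            Path zoo = foo.resolve("zoo");

            // Snapshot of foo's children, taken once, as a single listing call would return it.
            List<Path> snapshot = new ArrayList<>();
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(foo)) {
                for (Path child : stream) {
                    snapshot.add(child);
                }
            }

            // Concurrent rename: bar becomes zoo after the snapshot was taken.
            Files.move(bar, zoo);

            List<String> visited = new ArrayList<>();
            for (Path child : snapshot) {
                try (DirectoryStream<Path> stream = Files.newDirectoryStream(child)) {
                    visited.add(child.getFileName().toString());
                } catch (NoSuchFileException e) {
                    // bar is treated as deleted; zoo is not in the snapshot, so it is never tried.
                }
            }

            Files.delete(zoo);
            Files.delete(foo);
            return visited;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("visited: " + demo());
    }
}
```

Neither name is visited: the old one fails and is skipped, and the new one was never part of the snapshot. Picking it up would require re-listing the parent, which is beyond what a best-effort skip does.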
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei (Eddy) Xu updated HDFS-5546:
    Attachment: HDFS-5546.2.001.patch

Hey [~cmccabe]

This patch includes a unit test that deletes a fraction of the sub-directories in the middle of listing the parent directory. At the end, it verifies that the rest of the directories are still listed even if one or more FNF exceptions occur in the process.
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei (Eddy) Xu updated HDFS-5546:
    Attachment: HDFS-5546.2.000.patch

[~cmccabe] This new patch just catches the FNF exception right at {{getDirectoryContents()}} for the {{ls/lsr}} commands. Could you take a look at it? Thanks!
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-5546:
    Assignee: Kousuke Saruta
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated HDFS-5546:
    Attachment: HDFS-5546.1.patch

I've tried to make a patch for this issue. What do you think of it?
[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated HDFS-5546:
    Assignee:     (was: Kousuke Saruta)