[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-985:
---

Hadoop Flags: [Incompatible change, Reviewed]  (was: [Reviewed])

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> directoryBrowse_0.20yahoo_1.patch, directoryBrowse_0.20yahoo_2.patch, 
> iterativeLS_trunk.patch, iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, 
> iterativeLS_trunk3.patch, iterativeLS_trunk3.patch, iterativeLS_trunk4.patch, 
> iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-19 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I've just committed this.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> directoryBrowse_0.20yahoo_1.patch, directoryBrowse_0.20yahoo_2.patch, 
> iterativeLS_trunk.patch, iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, 
> iterativeLS_trunk3.patch, iterativeLS_trunk3.patch, iterativeLS_trunk4.patch, 
> iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-17 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Status: Patch Available  (was: Open)

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> directoryBrowse_0.20yahoo_1.patch, directoryBrowse_0.20yahoo_2.patch, 
> iterativeLS_trunk.patch, iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, 
> iterativeLS_trunk3.patch, iterativeLS_trunk3.patch, iterativeLS_trunk4.patch, 
> iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-17 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Status: Open  (was: Patch Available)

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> directoryBrowse_0.20yahoo_1.patch, directoryBrowse_0.20yahoo_2.patch, 
> iterativeLS_trunk.patch, iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, 
> iterativeLS_trunk3.patch, iterativeLS_trunk3.patch, iterativeLS_trunk4.patch, 
> iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-17 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: iterativeLS_trunk4.patch

This fixes a bug in TestFileStatus unit test.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> directoryBrowse_0.20yahoo_1.patch, directoryBrowse_0.20yahoo_2.patch, 
> iterativeLS_trunk.patch, iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, 
> iterativeLS_trunk3.patch, iterativeLS_trunk3.patch, iterativeLS_trunk4.patch, 
> iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-17 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: directoryBrowse_0.20yahoo_2.patch

This patch synced with yahoo 0.20 security branch.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> directoryBrowse_0.20yahoo_1.patch, directoryBrowse_0.20yahoo_2.patch, 
> iterativeLS_trunk.patch, iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, 
> iterativeLS_trunk3.patch, iterativeLS_trunk3.patch, iterativeLS_yahoo.patch, 
> iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-17 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: iterativeLS_trunk3.patch

Log4j change was not intended. This patch removes it.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> directoryBrowse_0.20yahoo_1.patch, iterativeLS_trunk.patch, 
> iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, iterativeLS_trunk3.patch, 
> iterativeLS_trunk3.patch, iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, 
> testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-16 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: iterativeLS_trunk3.patch

This patch incorporated Suresh's review comments. It throws 
FileNotFoundException when the listing directory is deleted and it adds an 
aspectJ test to test this case. Comment 3 needs to be fixed in Mapreduce which 
I will do it later.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> directoryBrowse_0.20yahoo_1.patch, iterativeLS_trunk.patch, 
> iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, iterativeLS_trunk3.patch, 
> iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-16 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Status: Patch Available  (was: Open)

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> directoryBrowse_0.20yahoo_1.patch, iterativeLS_trunk.patch, 
> iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, iterativeLS_trunk3.patch, 
> iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-16 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: directoryBrowse_0.20yahoo_1.patch

This patch fixes the indention problem and returns null if the target directory 
is deleted before the full listing has been fetched.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> directoryBrowse_0.20yahoo_1.patch, iterativeLS_trunk.patch, 
> iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, iterativeLS_yahoo.patch, 
> iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-11 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: iterativeLS_trunk2.patch

iterativeLS_trunk2.patch fixed a bug in TestFileStatus.java.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> iterativeLS_trunk.patch, iterativeLS_trunk1.patch, iterativeLS_trunk2.patch, 
> iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-11 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: directoryBrowse_0.20yahoo.patch

This patch fixes the UI bug when browse directory from web in yahoo 0.20  
branch.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: directoryBrowse_0.20yahoo.patch, 
> iterativeLS_trunk.patch, iterativeLS_trunk1.patch, iterativeLS_yahoo.patch, 
> iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-11 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: iterativeLS_trunk1.patch

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: iterativeLS_trunk.patch, iterativeLS_trunk1.patch, 
> iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-11 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: (was: iterativeLS_trunk.patch)

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: iterativeLS_trunk.patch, iterativeLS_yahoo.patch, 
> iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-11 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: iterativeLS_trunk.patch

This patch is for trunk and fixed a web UI bug when display a directory 
structure.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: iterativeLS_trunk.patch, iterativeLS_trunk.patch, 
> iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-10 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: iterativeLS_trunk.patch

Here is the patch for the trunk.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: iterativeLS_trunk.patch, iterativeLS_yahoo.patch, 
> iterativeLS_yahoo1.patch, testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-01 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: testFileStatus.patch

This patch fixed a bug in TestFileStatus.java.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, 
> testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-02-25 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: iterativeLS_yahoo1.patch

This patch addressed Suresh's comments except for comments 3, 5, and 6.
1. rename lastReturnedName to be startAfter;
2. rename PathPartialListing to be DirectoryListing;

in additon, I defined the config property dfs.ls.limit and its default value to 
constants and add comments to DistributedFileSystem#listStatus that explains 
the operation is no longer atomic.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-02-25 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: iterativeLS_yahoo.patch

A patch for review. I am sorry that this patch is generated against the 0.20 
Yahoo! branch because I do not have time to work on the trunk first. But I will 
work on a patch against the trunk for sure before resolving this issue. This 
patch has a few minor changes to the proposal in the jira description.

1. The upper limit of each getListing RPC is made configurable with a default 
value of 1000. This configuration property is undocumented for now. One reason 
to have this is for easy writing unit tests.  AlsoI still need to conduct more 
experiments at a large scale to decide what's the right default number.
2. A getListing RPC returns the number of remaining entries to be listed 
instead of a flag to indicate if there are more to be listed. The number of 
remaining entries could provide a heuristic to decide the initial size of the 
status array at the client size, thus reducing unnecessary 
allocation/deallocation in most cases.

Unit tests are added to TestFileStatus to check if multiple RPCs work fine or 
not.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: iterativeLS_yahoo.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However some directories are large that contain thousands or 
> millions of items. Listing such large directories in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit NameNode in a short period of time, NameNode will be 
> significantly slowed down. Users end up noticing longer response time or lost 
> connections to NameNode.
> 2. The response message is uncontrollable big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced at HDFS-946 that may be able to cut the response by 
> 20-50%, the response size will still in the magnitude of 10 mega bytes.
> I propose to implement a directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned.  
> This limit could be configurable, but I am thinking to set it to be a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking to use the last item of the previous listing RPC as an 
> indicator. Since NameNode stores all items in a directory as a sorted array, 
> NameNode uses the last item to locate the start item of this listing even if 
> the last item is deleted in between these two consecutive calls. This has the 
> advantage of avoid duplicate entries at the client side.
> 3. The return value additionally specifies if the whole directory is done 
> listing. If the client sees a false flag, it will continue to issue another 
> RPC.
> This proposal will change the semantics of large directory listing in a sense 
> that listing is no longer an atomic operation if a directory's content is 
> changing while the listing operation is in progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.