[jira] Commented: (HDFS-1288) start-all.sh / stop-all.sh does not seem to work with HDFS

2010-07-30 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893978#action_12893978
 ] 

Tom White commented on HDFS-1288:
-

I haven't been able to reproduce this. I successfully ran the following with 
RC0 (HADOOP_HDFS_HOME was not set):

{code}
export HADOOP_HOME=...

$HADOOP_HOME/bin/hadoop namenode -format
$HADOOP_HOME/bin/start-all.sh
$HADOOP_HOME/bin/hdfs dfsadmin -safemode wait
sleep 60
$HADOOP_HOME/bin/hadoop fs -mkdir input
$HADOOP_HOME/bin/hadoop fs -put $HADOOP_HOME/LICENSE.txt input
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-*-examples-*.jar grep \
  input output Apache
$HADOOP_HOME/bin/hadoop fs -cat 'output/part-r-0' | grep Apache
$HADOOP_HOME/bin/stop-all.sh
{code}

Aaron, what did you run to see this problem?

> start-all.sh / stop-all.sh does not seem to work with HDFS
> --
>
> Key: HDFS-1288
> URL: https://issues.apache.org/jira/browse/HDFS-1288
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.21.0
>Reporter: Aaron Kimball
>Priority: Blocker
> Fix For: 0.21.0
>
>
> The start-all.sh / stop-all.sh scripts shipping with the "combined" 
> hadoop-0.21.0-rc1 do not start/stop the DFS daemons unless 
> $HADOOP_HDFS_HOME is explicitly set.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1322) DistributedFileSystem.mkdirs(dir, dirPermission) doesn't set the permissions of created dir to dirPermission

2010-07-30 Thread Ravi Gummadi (JIRA)
DistributedFileSystem.mkdirs(dir, dirPermission) doesn't set the permissions of 
created dir to dirPermission


 Key: HDFS-1322
 URL: https://issues.apache.org/jira/browse/HDFS-1322
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravi Gummadi


DistributedFileSystem.mkdirs(dir, dirPermission) calls DFSClient.mkdirs(dir, 
dirPermission) to create the directory, and the resulting permissions come out 
as (dirPermission & ~umask), i.e. with the umask applied. Is this the intended 
behaviour? I expect it to set the permissions of dir to dirPermission, without 
applying the umask, similar to what 'mkdir -m mode' does.

The javadoc of DFSClient.mkdirs() says that the permissions of the created dir 
will be set to dirPermission, which is not what happens currently. Either the 
behaviour or the javadoc needs to be fixed to match.

This is not an issue in RawLocalFileSystem.
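
As a minimal sketch of the behaviour in question (the path and permission 
values are hypothetical, and a umask of 022 is assumed):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class MkdirsUmaskDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);       // DistributedFileSystem on HDFS

    Path dir = new Path("/tmp/perm-test");      // hypothetical path
    FsPermission requested = new FsPermission((short) 0777);
    fs.mkdirs(dir, requested);

    FileStatus st = fs.getFileStatus(dir);
    // With a umask of 022 this prints 0755 rather than the requested 0777,
    // i.e. the created dir ends up with the umask applied.
    System.out.println(st.getPermission());
  }
}
{code}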

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-202) Add a bulk FileSystem.getFileBlockLocations

2010-07-30 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894070#action_12894070
 ] 

Hairong Kuang commented on HDFS-202:


As I commented in HADOOP-6890, I would prefer throwing exceptions when a 
file/directory is deleted during listing. This is because getFiles is used by 
the MapReduce job client to calculate splits, so the expectation is that the 
input directories remain unchanged during job execution. It is better to fail 
the job earlier rather than later.

> Add a bulk FileSystem.getFileBlockLocations
> ---
>
> Key: HDFS-202
> URL: https://issues.apache.org/jira/browse/HDFS-202
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Arun C Murthy
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: hdfsListFiles.patch, hdfsListFiles1.patch
>
>
> Currently map-reduce applications (specifically file-based input-formats) use 
> FileSystem.getFileBlockLocations to compute splits. However, they are forced 
> to call it once per file.
> The downsides are multiple:
># Even with a few thousand files to process, the number of RPCs quickly 
> starts getting noticeable
># The current implementation of getFileBlockLocations is too slow, since 
> each call results in a 'search' in the namesystem. Assuming a few thousand 
> input files, this results in that many RPCs and 'searches'.
> It would be nice to have a FileSystem.getFileBlockLocations which can take in 
> a directory, and return the block-locations for all files in that directory. 
> We could eliminate both the per-file RPC and also the 'search' by a 'scan'.
> When I tested this for terasort, a moderate job with 8000 input files, the 
> runtime halved from the current 8s to 4s. Clearly this is much more important 
> for latency-sensitive applications...
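
For context, this is the per-file pattern described above, against the stock 
FileSystem API (the input path is hypothetical); the proposal is essentially to 
collapse this loop into a single bulk RPC:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PerFileBlockLocations {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // One getFileBlockLocations RPC per file: this is the cost being measured.
    for (FileStatus stat : fs.listStatus(new Path("/user/test/input"))) {
      if (!stat.isDir()) {
        BlockLocation[] locs = fs.getFileBlockLocations(stat, 0, stat.getLen());
        // ...compute one split per block from locs...
      }
    }
  }
}
{code}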

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1323) Pool/share file channels for HDFS read

2010-07-30 Thread Jay Booth (JIRA)
Pool/share file channels for HDFS read
--

 Key: HDFS-1323
 URL: https://issues.apache.org/jira/browse/HDFS-1323
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Jay Booth
 Fix For: 0.20-append, 0.22.0


Currently, all reads in HDFS require opening and closing the underlying 
block/meta filechannels.  We could pool these filechannels and save some system 
calls and other work.  Since HDFS read requests can be satisfied by positioned 
reads and transferTos, we can even share these filechannels between 
concurrently executing requests.

The attached patch was benchmarked as part of work on HDFS-918 and exhibited a 
10% performance increase for small random reads.

This does not affect client logic and involves minimal change to server logic.  
Patch is based on branch 20-append. 
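
As a minimal sketch of the idea (not the attached patch): a pool of shared 
read-only FileChannels keyed by block file, relying on positioned reads so that 
concurrently executing requests do not disturb each other's file position:

{code}
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ReadChannelPool {
  private final ConcurrentMap<File, FileChannel> channels =
      new ConcurrentHashMap<File, FileChannel>();

  /** Returns a shared read-only channel, opening it on first use. */
  public FileChannel get(File blockFile) throws IOException {
    FileChannel ch = channels.get(blockFile);
    if (ch == null) {
      FileChannel opened = new RandomAccessFile(blockFile, "r").getChannel();
      ch = channels.putIfAbsent(blockFile, opened);
      if (ch == null) {
        ch = opened;            // we won the race
      } else {
        opened.close();         // another thread won; discard our channel
      }
    }
    return ch;
  }

  /** Positioned read: does not touch the channel's file position, so the
      same channel can serve concurrent requests. */
  public int read(File blockFile, ByteBuffer buf, long position)
      throws IOException {
    return get(blockFile).read(buf, position);
  }
}
{code}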

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1323) Pool/share file channels for HDFS read

2010-07-30 Thread Jay Booth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Booth updated HDFS-1323:


Attachment: hdfs-1323-20100730.patch

> Pool/share file channels for HDFS read
> --
>
> Key: HDFS-1323
> URL: https://issues.apache.org/jira/browse/HDFS-1323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Reporter: Jay Booth
> Fix For: 0.20-append, 0.22.0
>
> Attachments: hdfs-1323-20100730.patch
>
>
> Currently, all reads in HDFS require opening and closing the underlying 
> block/meta filechannels.  We could pool these filechannels and save some 
> system calls and other work.  Since HDFS read requests can be satisfied by 
> positioned reads and transferTos, we can even share these filechannels 
> between concurrently executing requests.
> The attached patch was benchmarked as part of work on HDFS-918 and exhibited 
> a 10% performance increase for small random reads.
> This does not affect client logic and involves minimal change to server 
> logic.  Patch is based on branch 20-append. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1323) Pool/share file channels for HDFS read

2010-07-30 Thread Jay Booth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894072#action_12894072
 ] 

Jay Booth commented on HDFS-1323:
-

Correction - the patch created a 10% performance increase for HBase random 
GETs. The improvement was probably a larger percentage of the read operation 
itself, excluding the other work done by HBase.

> Pool/share file channels for HDFS read
> --
>
> Key: HDFS-1323
> URL: https://issues.apache.org/jira/browse/HDFS-1323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Reporter: Jay Booth
> Fix For: 0.20-append, 0.22.0
>
> Attachments: hdfs-1323-20100730.patch
>
>
> Currently, all reads in HDFS require opening and closing the underlying 
> block/meta filechannels.  We could pool these filechannels and save some 
> system calls and other work.  Since HDFS read requests can be satisfied by 
> positioned reads and transferTos, we can even share these filechannels 
> between concurrently executing requests.
> The attached patch was benchmarked as part of work on HDFS-918 and exhibited 
> a 10% performance increase for small random reads.
> This does not affect client logic and involves minimal change to server 
> logic.  Patch is based on branch 20-append. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1324) Quantifying HDFS Client Latency to understand performance and scalability

2010-07-30 Thread Mona Chitnis (JIRA)
Quantifying HDFS Client Latency to understand performance and scalability
-

 Key: HDFS-1324
 URL: https://issues.apache.org/jira/browse/HDFS-1324
 Project: Hadoop HDFS
  Issue Type: Test
  Components: benchmarks
Affects Versions: 0.20-append
 Environment: An HDFS cluster of 9 nodes (same rack; 1 rack = 40 nodes 
to one rack switch) was deployed on Yahoo!'s R&D grid cluster. It comprised 
1 Namenode, 1 JobTracker and 7 Datanodes. The client was assigned to a separate 
compute node that was not part of this HDFS cluster. The characteristics of the 
testbed are as follows:
•   Hadoop 0.20.1xx - latest Yahoo! Hadoop Security version
•   OS - GNU/Linux x86_64 kernel version 2.6.18
•   Java 1.6
•   Processor - 2 SMP Quad-core Intel Xeon @ 2.5GHz per node
•   Memory - 16GB RAM per node

Reporter: Mona Chitnis


JAR to measure HDFS client latencies. Runs a process comprising HDFS 
operations, with helper scripts to specify the number of iterations, number of 
concurrent instances, and base directory. Compiles statistics into 2 text files 
inside the base directory, which can be read/manipulated using charting 
applications (e.g. MS Excel).
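
As a rough sketch of the kind of measurement involved (the actual tool is the 
attached watcher.jar; the class and argument names below are illustrative):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LatencyProbe {
  public static void main(String[] args) throws Exception {
    int iterations = Integer.parseInt(args[0]);   // e.g. 100
    Path base = new Path(args[1]);                // base directory
    FileSystem fs = FileSystem.get(new Configuration());
    byte[] payload = new byte[64 * 1024];

    for (int i = 0; i < iterations; i++) {
      long start = System.nanoTime();
      FSDataOutputStream out = fs.create(new Path(base, "probe-" + i));
      out.write(payload);
      out.close();
      long createMs = (System.nanoTime() - start) / 1000000L;
      System.out.println("create\t" + i + "\t" + createMs + "ms");
    }
  }
}
{code}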

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1150) Verify datanodes' identities to clients in secure clusters

2010-07-30 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1150:
--

Attachment: HDFS-1150-trunk.patch

Patch for trunk (which itself seems to be broken at the moment).  Relatively 
straightforward port of the 20 patch.  New command {start,stop}-secure-dns.sh, 
to be executed by root, to start the secure datanodes.  bin/hdfs datanode and 
bin/hadoop-daemon start datanode both work.  Starting a secure datanode on 
unprivileged ports will now throw an exception, a difference from the y20 
patch, but one that provides better protection against badly configured 
datanodes.

> Verify datanodes' identities to clients in secure clusters
> --
>
> Key: HDFS-1150
> URL: https://issues.apache.org/jira/browse/HDFS-1150
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Attachments: commons-daemon-1.0.2-src.tar.gz, 
> HDFS-1150-BF-Y20-LOG-DIRS-2.patch, HDFS-1150-BF-Y20-LOG-DIRS.patch, 
> HDFS-1150-BF1-Y20.patch, hdfs-1150-bugfix-1.1.patch, 
> hdfs-1150-bugfix-1.2.patch, hdfs-1150-bugfix-1.patch, HDFS-1150-trunk.patch, 
> HDFS-1150-Y20-BetterJsvcHandling.patch, HDFS-1150-y20.build-script.patch, 
> HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, 
> HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, 
> HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, 
> HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt
>
>
> Currently we use block access tokens to allow datanodes to verify clients' 
> identities, however we don't have a way for clients to verify the 
> authenticity of the datanodes themselves.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1150) Verify datanodes' identities to clients in secure clusters

2010-07-30 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1150:
--

Status: Patch Available  (was: Open)

Submitting patch, though I think trunk has been kiboshed for the moment...

> Verify datanodes' identities to clients in secure clusters
> --
>
> Key: HDFS-1150
> URL: https://issues.apache.org/jira/browse/HDFS-1150
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Attachments: commons-daemon-1.0.2-src.tar.gz, 
> HDFS-1150-BF-Y20-LOG-DIRS-2.patch, HDFS-1150-BF-Y20-LOG-DIRS.patch, 
> HDFS-1150-BF1-Y20.patch, hdfs-1150-bugfix-1.1.patch, 
> hdfs-1150-bugfix-1.2.patch, hdfs-1150-bugfix-1.patch, HDFS-1150-trunk.patch, 
> HDFS-1150-Y20-BetterJsvcHandling.patch, HDFS-1150-y20.build-script.patch, 
> HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, 
> HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, 
> HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, 
> HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt
>
>
> Currently we use block access tokens to allow datanodes to verify clients' 
> identities, however we don't have a way for clients to verify the 
> authenticity of the datanodes themselves.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1150) Verify datanodes' identities to clients in secure clusters

2010-07-30 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1150:
--

Status: Open  (was: Patch Available)

> Verify datanodes' identities to clients in secure clusters
> --
>
> Key: HDFS-1150
> URL: https://issues.apache.org/jira/browse/HDFS-1150
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Attachments: commons-daemon-1.0.2-src.tar.gz, 
> HDFS-1150-BF-Y20-LOG-DIRS-2.patch, HDFS-1150-BF-Y20-LOG-DIRS.patch, 
> HDFS-1150-BF1-Y20.patch, hdfs-1150-bugfix-1.1.patch, 
> hdfs-1150-bugfix-1.2.patch, hdfs-1150-bugfix-1.patch, HDFS-1150-trunk.patch, 
> HDFS-1150-Y20-BetterJsvcHandling.patch, HDFS-1150-y20.build-script.patch, 
> HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, 
> HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, 
> HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, 
> HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt
>
>
> Currently we use block access tokens to allow datanodes to verify clients' 
> identities, however we don't have a way for clients to verify the 
> authenticity of the datanodes themselves.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1150) Verify datanodes' identities to clients in secure clusters

2010-07-30 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894197#action_12894197
 ] 

Todd Lipcon commented on HDFS-1150:
---

Hi Jakob. Can you please address my comment above? Given that there are ways to 
secure a high port, I don't think it should be a requirement that a secure DN 
*must* start on a low port. We have several clusters where the Hadoop team does 
not have easy root access, but SELinux policies could be used to the same 
effect. If this requirement is important for you guys, maybe we can add a 
configuration option like dfs.datanode.require.privileged.port or some such?
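
For instance, a hypothetical guard along these lines - neither the config key 
nor this exact check exists today:

{code}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class PortCheck {
  // Hypothetical sketch only: the key name is just the suggestion above.
  static void checkStreamingPort(Configuration conf, InetSocketAddress addr) {
    boolean requirePrivileged =
        conf.getBoolean("dfs.datanode.require.privileged.port", true);
    if (UserGroupInformation.isSecurityEnabled()
        && requirePrivileged && addr.getPort() >= 1024) {
      throw new RuntimeException("Cannot start secure datanode on unprivileged"
          + " port " + addr.getPort() + "; disable"
          + " dfs.datanode.require.privileged.port only if the port is secured"
          + " by external means (e.g. SELinux).");
    }
  }
}
{code}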

> Verify datanodes' identities to clients in secure clusters
> --
>
> Key: HDFS-1150
> URL: https://issues.apache.org/jira/browse/HDFS-1150
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Attachments: commons-daemon-1.0.2-src.tar.gz, 
> HDFS-1150-BF-Y20-LOG-DIRS-2.patch, HDFS-1150-BF-Y20-LOG-DIRS.patch, 
> HDFS-1150-BF1-Y20.patch, hdfs-1150-bugfix-1.1.patch, 
> hdfs-1150-bugfix-1.2.patch, hdfs-1150-bugfix-1.patch, HDFS-1150-trunk.patch, 
> HDFS-1150-Y20-BetterJsvcHandling.patch, HDFS-1150-y20.build-script.patch, 
> HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, 
> HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, 
> HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, 
> HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt
>
>
> Currently we use block access tokens to allow datanodes to verify clients' 
> identities, however we don't have a way for clients to verify the 
> authenticity of the datanodes themselves.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-202) Add a bulk FileSystem.getFileBlockLocations

2010-07-30 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-202:
---

Attachment: hdfsListFiles2.patch

This patch addressed Suresh's review comments.

> Add a bulk FileSystem.getFileBlockLocations
> ---
>
> Key: HDFS-202
> URL: https://issues.apache.org/jira/browse/HDFS-202
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Arun C Murthy
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: hdfsListFiles.patch, hdfsListFiles1.patch, 
> hdfsListFiles2.patch
>
>
> Currently map-reduce applications (specifically file-based input-formats) use 
> FileSystem.getFileBlockLocations to compute splits. However, they are forced 
> to call it once per file.
> The downsides are multiple:
># Even with a few thousand files to process, the number of RPCs quickly 
> starts getting noticeable
># The current implementation of getFileBlockLocations is too slow, since 
> each call results in a 'search' in the namesystem. Assuming a few thousand 
> input files, this results in that many RPCs and 'searches'.
> It would be nice to have a FileSystem.getFileBlockLocations which can take in 
> a directory, and return the block-locations for all files in that directory. 
> We could eliminate both the per-file RPC and also the 'search' by a 'scan'.
> When I tested this for terasort, a moderate job with 8000 input files, the 
> runtime halved from the current 8s to 4s. Clearly this is much more important 
> for latency-sensitive applications...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1111) getCorruptFiles() should give some hint that the list is not complete

2010-07-30 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894214#action_12894214
 ] 

Konstantin Shvachko commented on HDFS-1111:
---

I see there is a reference to my participation in HDFS-729, so there is nobody 
to blame but myself.

I think the lesson with listing directories taught us some things. And it has 
the same issue: we do not guarantee that we list all directory entries as a 
single snapshot, because there could be too many of them. We only guarantee to 
return the current consecutive list of N entries following the specified name. 
The rest may have changed by the time the list of N is displayed.

With Sriram's approach we actually list blocks of corrupted files and provide 
info about files they belong to. This is different from the previously 
discussed approach. 
- So I propose to rename the method and the respective fsck option to 
{{listCorruptFileBlocks}} instead of {{listCorruptFile}}.

The paging in Sriram's proposal is done by blockId. Since the blocks in the 
{{UnderReplicatedBlocks}} queues are ordered by blockId, this will provide more 
natural paging semantics than "skip K and return the next N" - one of the 
variants considered before. Paging by blockId is in a sense the same as in 
listing dirs. Fsck guarantees to return a consecutive list of N corrupt blocks 
greater than the given id.
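
In code form, that paging contract might look like the following sketch (all 
type and method names are illustrative, not the actual patch):

{code}
import java.util.List;

/** Hypothetical sketch of cursor-style paging over corrupt blocks. */
public class CorruptBlockPagingDemo {
  static class CorruptBlock {
    final long blockId;
    final String file;
    CorruptBlock(long blockId, String file) {
      this.blockId = blockId;
      this.file = file;
    }
  }

  /** Stand-in for the proposed namenode call: up to max blocks with
      blockId > afterId, ordered by id. */
  interface Lister {
    List<CorruptBlock> listAfter(long afterId, int max);
  }

  static void drain(Lister nn) {
    long cursor = Long.MIN_VALUE;
    while (true) {
      List<CorruptBlock> page = nn.listAfter(cursor, 500);
      if (page.isEmpty()) {
        break;                                     // no more corrupt blocks
      }
      for (CorruptBlock b : page) {
        System.out.println(b.blockId + "\t" + b.file);
      }
      cursor = page.get(page.size() - 1).blockId;  // resume after last id seen
    }
  }
}
{code}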

ClientProtocol changes. My point is that any new features included in the code 
need to be supported, which is not free. And supporting a feature which is not 
used by anybody is particularly inefficient and even frustrating, not that we 
don't have some of those already. 
RAID may be a good use case for this API, but I agree with Rodrigo that it's a 
topic for a different discussion and we should take it out of this issue. I 
surely do not have enough context, but maybe RAID can query the NN for corrupt 
blocks the same way fsck does.


> getCorruptFiles() should give some hint that the list is not complete
> -
>
> Key: HDFS-1111
> URL: https://issues.apache.org/jira/browse/HDFS-1111
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Attachments: HADFS-1111.0.patch
>
>
> The list of corrupt files returned by the namenode doesn't say anything if 
> the number of corrupted files is larger than the call output limit (which 
> means the list is not complete). There should be a way to hint incompleteness 
> to clients.
> A simple hack would be to add an extra entry to the array returned with the 
> value null. Clients could interpret this as a sign that there are other 
> corrupt files in the system.
> We should also do some rephrasing of the fsck output to make it more 
> confident when the list is complete and less confident when the list is 
> known to be incomplete.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1111) getCorruptFiles() should give some hint that the list is not complete

2010-07-30 Thread Rodrigo Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894223#action_12894223
 ] 

Rodrigo Schmidt commented on HDFS-1111:
---

One other thought I have about paging is that it might introduce unnecessary 
complexity. The number of corrupted files is usually low. If it's too high, it 
might be better to run a full fsck. But if there is a strong case for paging, 
I'm fine with it.

ClientProtocol vs. jsp: As I mentioned before, I'm opposed to the fsck strategy 
because it increases the load on the namenode. I've seen an entire cluster 
with thousands of nodes almost go down because there were parallel executions 
of fsck running internally to the namenode and they couldn't be stopped. 
Besides that, using HTTP to get data from the namenode is just another way to 
implement an RPC. The advantage of JSP is that it allows for longer or more 
dynamic outputs, which is not the case here. I'm fine with moving this specific 
topic to another JIRA or discussion list.

> getCorruptFiles() should give some hint that the list is not complete
> -
>
> Key: HDFS-1111
> URL: https://issues.apache.org/jira/browse/HDFS-1111
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Attachments: HADFS-1111.0.patch
>
>
> The list of corrupt files returned by the namenode doesn't say anything if 
> the number of corrupted files is larger than the call output limit (which 
> means the list is not complete). There should be a way to hint incompleteness 
> to clients.
> A simple hack would be to add an extra entry to the array returned with the 
> value null. Clients could interpret this as a sign that there are other 
> corrupt files in the system.
> We should also do some rephrasing of the fsck output to make it more 
> confident when the list is complete and less confident when the list is 
> known to be incomplete.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1324) Quantifying HDFS Client Latency to understand performance and scalability

2010-07-30 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated HDFS-1324:
---

Attachment: Paper1.pdf
Client_Latency_Observation_v4.0.ppt
watcher.jar

Attachments:
1. Design Document - Whitepaper describing the motivation, test methodology, 
setup and results
2. Information on the advantages of this study and the important findings
3. Watcher JAR comprising the benchmarking module
4. Helper scripts needed to execute the JAR
5. README describing how to execute the unit test


> Quantifying HDFS Client Latency to understand performance and scalability
> -
>
> Key: HDFS-1324
> URL: https://issues.apache.org/jira/browse/HDFS-1324
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: benchmarks
>Affects Versions: 0.20-append
> Environment: An HDFS cluster of 9 nodes (same rack; 1 rack = 40 nodes 
> to one rack switch) was deployed on Yahoo!'s R&D grid cluster. It comprised 
> 1 Namenode, 1 JobTracker and 7 Datanodes. The client was assigned to a 
> separate compute node that was not part of this HDFS cluster. The 
> characteristics of the testbed are as follows:
> • Hadoop 0.20.1xx - latest Yahoo! Hadoop Security version
> • OS - GNU/Linux x86_64 kernel version 2.6.18
> • Java 1.6
> • Processor - 2 SMP Quad-core Intel Xeon @ 2.5GHz per node
> • Memory - 16GB RAM per node
>Reporter: Mona Chitnis
> Attachments: Client_Latency_Observation_v4.0.ppt, Paper1.pdf, 
> watcher.jar
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> JAR to measure HDFS client latencies. Runs a process comprising HDFS 
> operations, with helper scripts to specify the number of iterations, number 
> of concurrent instances, and base directory. Compiles statistics into 2 text 
> files inside the base directory, which can be read/manipulated using charting 
> applications (e.g. MS Excel).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1324) Quantifying HDFS Client Latency to understand performance and scalability

2010-07-30 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated HDFS-1324:
---

Attachment: (was: Client_Latency_Observation_v4.0.ppt)

> Quantifying HDFS Client Latency to understand performance and scalability
> -
>
> Key: HDFS-1324
> URL: https://issues.apache.org/jira/browse/HDFS-1324
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: benchmarks
>Affects Versions: 0.20-append
> Environment: An HDFS cluster of 9 nodes (same rack; 1 rack = 40 nodes 
> to one rack switch) was deployed on Yahoo!'s R&D grid cluster. It comprised 
> 1 Namenode, 1 JobTracker and 7 Datanodes. The client was assigned to a 
> separate compute node that was not part of this HDFS cluster. The 
> characteristics of the testbed are as follows:
> • Hadoop 0.20.1xx - latest Yahoo! Hadoop Security version
> • OS - GNU/Linux x86_64 kernel version 2.6.18
> • Java 1.6
> • Processor - 2 SMP Quad-core Intel Xeon @ 2.5GHz per node
> • Memory - 16GB RAM per node
>Reporter: Mona Chitnis
> Attachments: Paper1.pdf, watcher.jar
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> JAR to measure HDFS client latencies. Runs a process comprising HDFS 
> operations, with helper scripts to specify the number of iterations, number 
> of concurrent instances, and base directory. Compiles statistics into 2 text 
> files inside the base directory, which can be read/manipulated using charting 
> applications (e.g. MS Excel).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1324) Quantifying HDFS Client Latency to understand performance and scalability

2010-07-30 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated HDFS-1324:
---

Attachment: README.txt
executor.sh
run_watchers.sh

Helper scripts to execute the code.

> Quantifying HDFS Client Latency to understand performance and scalability
> -
>
> Key: HDFS-1324
> URL: https://issues.apache.org/jira/browse/HDFS-1324
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: benchmarks
>Affects Versions: 0.20-append
> Environment: An HDFS cluster of 9 nodes (same rack; 1 rack = 40 nodes 
> to one rack switch) was deployed on Yahoo!'s R&D grid cluster. It comprised 
> 1 Namenode, 1 JobTracker and 7 Datanodes. The client was assigned to a 
> separate compute node that was not part of this HDFS cluster. The 
> characteristics of the testbed are as follows:
> • Hadoop 0.20.1xx - latest Yahoo! Hadoop Security version
> • OS - GNU/Linux x86_64 kernel version 2.6.18
> • Java 1.6
> • Processor - 2 SMP Quad-core Intel Xeon @ 2.5GHz per node
> • Memory - 16GB RAM per node
>Reporter: Mona Chitnis
> Attachments: catstat.sh, Client_Latency_Observation_v4.0.ppt, 
> executor.sh, Paper1.pdf, README.txt, run_watchers.sh, watcher.jar
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> JAR to measure HDFS client latencies. Runs a process comprising HDFS 
> operations, with helper scripts to specify the number of iterations, number 
> of concurrent instances, and base directory. Compiles statistics into 2 text 
> files inside the base directory, which can be read/manipulated using charting 
> applications (e.g. MS Excel).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1324) Quantifying HDFS Client Latency to understand performance and scalability

2010-07-30 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated HDFS-1324:
---

Attachment: catstat.sh

> Quantifying HDFS Client Latency to understand performance and scalability
> -
>
> Key: HDFS-1324
> URL: https://issues.apache.org/jira/browse/HDFS-1324
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: benchmarks
>Affects Versions: 0.20-append
> Environment: An HDFS cluster of 9 nodes (same rack; 1 rack = 40 nodes 
> to one rack switch) was deployed on Yahoo!'s R&D grid cluster. It comprised 
> 1 Namenode, 1 JobTracker and 7 Datanodes. The client was assigned to a 
> separate compute node that was not part of this HDFS cluster. The 
> characteristics of the testbed are as follows:
> • Hadoop 0.20.1xx - latest Yahoo! Hadoop Security version
> • OS - GNU/Linux x86_64 kernel version 2.6.18
> • Java 1.6
> • Processor - 2 SMP Quad-core Intel Xeon @ 2.5GHz per node
> • Memory - 16GB RAM per node
>Reporter: Mona Chitnis
> Attachments: catstat.sh, Client_Latency_Observation_v4.0.ppt, 
> executor.sh, Paper1.pdf, README.txt, run_watchers.sh, watcher.jar
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> JAR to measure HDFS client latencies. Runs a process comprising HDFS 
> operations, with helper scripts to specify the number of iterations, number 
> of concurrent instances, and base directory. Compiles statistics into 2 text 
> files inside the base directory, which can be read/manipulated using charting 
> applications (e.g. MS Excel).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1324) Quantifying HDFS Client Latency to understand performance and scalability

2010-07-30 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated HDFS-1324:
---

Attachment: Client_Latency_Observation_v4.0.ppt

Important features and results

> Quantifying HDFS Client Latency to understand performance and scalability
> -
>
> Key: HDFS-1324
> URL: https://issues.apache.org/jira/browse/HDFS-1324
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: benchmarks
>Affects Versions: 0.20-append
> Environment: An HDFS cluster of 9 nodes (same rack; 1 rack = 40 nodes 
> to one rack switch) was deployed on Yahoo!'s R&D grid cluster. It comprised 
> 1 Namenode, 1 JobTracker and 7 Datanodes. The client was assigned to a 
> separate compute node that was not part of this HDFS cluster. The 
> characteristics of the testbed are as follows:
> • Hadoop 0.20.1xx - latest Yahoo! Hadoop Security version
> • OS - GNU/Linux x86_64 kernel version 2.6.18
> • Java 1.6
> • Processor - 2 SMP Quad-core Intel Xeon @ 2.5GHz per node
> • Memory - 16GB RAM per node
>Reporter: Mona Chitnis
> Attachments: catstat.sh, Client_Latency_Observation_v4.0.ppt, 
> executor.sh, Paper1.pdf, README.txt, run_watchers.sh, watcher.jar
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> JAR to measure HDFS client latencies. Runs a process comprising HDFS 
> operations, with helper scripts to specify the number of iterations, number 
> of concurrent instances, and base directory. Compiles statistics into 2 text 
> files inside the base directory, which can be read/manipulated using charting 
> applications (e.g. MS Excel).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1150) Verify datanodes' identities to clients in secure clusters

2010-07-30 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894236#action_12894236
 ] 

Jakob Homan commented on HDFS-1150:
---

Sure. -1 on allowing unsecured datanodes to join a secure cluster, and at the 
moment Hadoop doesn't have a non-jsvc way of securing/verifying datanodes' 
ports.

Currently, we secure the datanodes via jsvc, and the reasons for doing so were 
discussed extensively on this JIRA.  Were we to allow the behavior requested, a 
mis-configured cluster could end up partially unsecured with no warning that it 
is in such a state, which is not acceptable.

What you're asking for is essentially to make securing the datanodes' non-RPC 
ports pluggable, which we fully expect and plan to do.  I'll open a JIRA to 
make datanode-port security pluggable once 1150 has been finished off.  jsvc 
was a reliable solution to a problem discovered very late in security's 
development, which has worked very well on our production clusters, but 
certainly still has the odor of a hack about it.  All that's needed is a way of 
auditing and verifying that the ports we're running on are secure by Ops' 
estimation; jsvc, SELinux, and AppArmor would all be reasonable ways of 
fulfilling such a contract. 

But until we actually have a plan to implement this in a reliable, verifiable 
and documented way, it's best to err on the side of caution and security and 
provide as much guarantee as possible that the datanodes are indeed secure in a 
secure cluster.  Until we support non-jsvc methods of doing this, it's not 
going to work to have a non-jsvc verified datanode.

As for a config like the one mentioned above, it would essentially be 
my.cluster.is.secure.except.for.this.one.attack.vector, which is not a good 
idea for the same reasons as above - it's a huge configuration mistake waiting 
to happen - and moreover will be unnecessary once a fully pluggable system is 
in place.  The one place it would be very useful and justifiable would be for 
developer testing, since it is a serious pain to start up these secure nodes 
while doing development now.

> Verify datanodes' identities to clients in secure clusters
> --
>
> Key: HDFS-1150
> URL: https://issues.apache.org/jira/browse/HDFS-1150
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Attachments: commons-daemon-1.0.2-src.tar.gz, 
> HDFS-1150-BF-Y20-LOG-DIRS-2.patch, HDFS-1150-BF-Y20-LOG-DIRS.patch, 
> HDFS-1150-BF1-Y20.patch, hdfs-1150-bugfix-1.1.patch, 
> hdfs-1150-bugfix-1.2.patch, hdfs-1150-bugfix-1.patch, HDFS-1150-trunk.patch, 
> HDFS-1150-Y20-BetterJsvcHandling.patch, HDFS-1150-y20.build-script.patch, 
> HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, 
> HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, 
> HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, 
> HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt
>
>
> Currently we use block access tokens to allow datanodes to verify clients' 
> identities, however we don't have a way for clients to verify the 
> authenticity of the datanodes themselves.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1150) Verify datanodes' identities to clients in secure clusters

2010-07-30 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894237#action_12894237
 ] 

Todd Lipcon commented on HDFS-1150:
---

bq. Until we support non-jsvc methods of doing this, it's not going to work to 
have a non-jsvc verified datanode

The point is that we already "support" the SELinux way - you configure the 
datanode to use a high port, and then set up the SELinux policy to allow only 
the HDFS user to bind that high port. No Hadoop-side support is necessary, but 
the current implementation prohibits this mechanism, which I don't think is 
right.

bq. it would essentially be 
my.cluster.is.secure.except.for.this.one.attack.vector, which is not a good 
idea for the same reasons as above

Think of it like 
i.already.secured.my.datanode.port.with.some.external.mechanism :) I'm OK with 
this config defaulting to false (i.e. refuse to start non-secure), but it 
needs to be configurable. The developer testing case you mentioned is another 
good example.

> Verify datanodes' identities to clients in secure clusters
> --
>
> Key: HDFS-1150
> URL: https://issues.apache.org/jira/browse/HDFS-1150
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Attachments: commons-daemon-1.0.2-src.tar.gz, 
> HDFS-1150-BF-Y20-LOG-DIRS-2.patch, HDFS-1150-BF-Y20-LOG-DIRS.patch, 
> HDFS-1150-BF1-Y20.patch, hdfs-1150-bugfix-1.1.patch, 
> hdfs-1150-bugfix-1.2.patch, hdfs-1150-bugfix-1.patch, HDFS-1150-trunk.patch, 
> HDFS-1150-Y20-BetterJsvcHandling.patch, HDFS-1150-y20.build-script.patch, 
> HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, 
> HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, 
> HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, 
> HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt
>
>
> Currently we use block access tokens to allow datanodes to verify clients' 
> identities, however we don't have a way for clients to verify the 
> authenticity of the datanodes themselves.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1150) Verify datanodes' identities to clients in secure clusters

2010-07-30 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894238#action_12894238
 ] 

Jakob Homan commented on HDFS-1150:
---

bq. The point is that we already "support" the SELinux way
If your "support" for security comes with quotation marks, you've got a 
problem.  

bq. Think of it like 
i.already.secured.my.datanode.port.with.some.external.mechanism
This is a more reasonable way of arguing your request.  Essentially it's the 
beginning of a pluggable system where the verification is Ops' word that 
they've taken care of this security hole. My concern remains that this provides 
no way for a running cluster to realize it's been misconfigured, although I've 
been thinking we need a security info page on the NN/JT (along with HADOOP-6823 
and HADOOP-6822) and this could be displayed there (although the danger of 
non-updated, erroneous configs scattered around the cluster would still 
remain).  
Administrators would need to affirmatively decline this type of protection, 
perhaps with a value to the key of "No, thanks."

> Verify datanodes' identities to clients in secure clusters
> --
>
> Key: HDFS-1150
> URL: https://issues.apache.org/jira/browse/HDFS-1150
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Attachments: commons-daemon-1.0.2-src.tar.gz, 
> HDFS-1150-BF-Y20-LOG-DIRS-2.patch, HDFS-1150-BF-Y20-LOG-DIRS.patch, 
> HDFS-1150-BF1-Y20.patch, hdfs-1150-bugfix-1.1.patch, 
> hdfs-1150-bugfix-1.2.patch, hdfs-1150-bugfix-1.patch, HDFS-1150-trunk.patch, 
> HDFS-1150-Y20-BetterJsvcHandling.patch, HDFS-1150-y20.build-script.patch, 
> HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, 
> HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, 
> HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, 
> HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt
>
>
> Currently we use block access tokens to allow datanodes to verify clients' 
> identities, however we don't have a way for clients to verify the 
> authenticity of the datanodes themselves.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1150) Verify datanodes' identities to clients in secure clusters

2010-07-30 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894240#action_12894240
 ] 

Todd Lipcon commented on HDFS-1150:
---

bq. If your "support" for security comes with quotation marks, you've got a 
problem.

The quotes were to say that we don't need to explicitly add support in order to 
use an external mechanism. Securing a high port with SELinux is just as good 
as the jsvc solution that uses a low port. On Solaris you can use user-based 
privileges to grant the hdfs user access to bind to a low port. This is also at 
least as good as the jsvc solution and doesn't require any code changes to 
Hadoop. I would in fact argue that both of these solutions are *more* secure, 
since the Hadoop administrators don't need root on the system except for the 
initial host configuration.

bq. Administrators would need to affirmatively decline this type of protection, 
perhaps with a value to the key of "No, thanks."

Hence my above point that it should be a configuration with the default as you 
suggested -- something that advanced users (and developers) can override.

I also disagree with your general point that Hadoop should make it impossible 
to misconfigure it in such a way that there are security holes. Already with 
your solution you're relying on ops to provide some of the security - for 
example, if users have root on any machine in the same subnet, they can take 
over the IP of one of the datanodes by spamming ARPs. So long as we're taking 
the shortcut instead of actually putting SASL on the xceiver protocol, we need 
external security. I agree completely with the decision to take the workaround 
in the short term, but making arguments about "security" vs real security seems 
strange given the context.

> Verify datanodes' identities to clients in secure clusters
> --
>
> Key: HDFS-1150
> URL: https://issues.apache.org/jira/browse/HDFS-1150
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Attachments: commons-daemon-1.0.2-src.tar.gz, 
> HDFS-1150-BF-Y20-LOG-DIRS-2.patch, HDFS-1150-BF-Y20-LOG-DIRS.patch, 
> HDFS-1150-BF1-Y20.patch, hdfs-1150-bugfix-1.1.patch, 
> hdfs-1150-bugfix-1.2.patch, hdfs-1150-bugfix-1.patch, HDFS-1150-trunk.patch, 
> HDFS-1150-Y20-BetterJsvcHandling.patch, HDFS-1150-y20.build-script.patch, 
> HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, 
> HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, 
> HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, 
> HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt
>
>
> Currently we use block access tokens to allow datanodes to verify clients' 
> identities, however we don't have a way for clients to verify the 
> authenticity of the datanodes themselves.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1111) getCorruptFiles() should give some hint that the list is not complete

2010-07-30 Thread Sriram Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894252#action_12894252
 ] 

Sriram Rao commented on HDFS-1111:
--

The case for paging was made by you (?) in one of the JIRAs on this issue.  You 
went looking for the list of files in an important dir and found that the 500 
limit was getting in the way.  

The patch that you have done has the namenode doing the filtering (and this has 
caused problems).

What we are proposing instead is to have the namenode return a list of corrupt 
files to the client and then let the client do the filtering.  The way we 
envision using this feature is via an iterative approach to fixing corruption 
(sketched in code after this list):
1. get a list of corrupt files for a certain path 
2. fix up the corrupt files in that path
3. iterate; stop if the list of corrupt files is empty

By being iterative, this proposal also addresses one of the issues you had 
brought up: namely, the list of corrupt files can change between successive 
paging calls.  
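
A sketch of that loop (the listing call stands in for the proposed API; it is 
not an existing one):

{code}
import java.util.List;

/** Illustrative sketch of the iterative fix-up loop described above. */
public class IterativeRepair {
  /** Stand-in for the proposed client-side call; not an existing API. */
  interface CorruptFileLister {
    List<String> listCorruptFiles(String path);
  }

  static void repairAll(CorruptFileLister nn, String path) {
    while (true) {
      List<String> corrupt = nn.listCorruptFiles(path);   // step 1
      if (corrupt.isEmpty()) {
        return;                                           // step 3: done
      }
      for (String file : corrupt) {
        repair(file);                                     // step 2
      }
      // Iterate: the set may have changed while we were fixing files,
      // which also sidesteps the paging-consistency concern.
    }
  }

  static void repair(String file) {
    // e.g. restore from a parity or backup copy (application-specific).
  }
}
{code}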

Fsck is a fall-back.  With the PBs of data that we have in our clusters, a full 
fsck does take a few hours to finish.


> getCorruptFiles() should give some hint that the list is not complete
> -
>
> Key: HDFS-1111
> URL: https://issues.apache.org/jira/browse/HDFS-1111
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Attachments: HADFS-1111.0.patch
>
>
> The list of corrupt files returned by the namenode doesn't say anything if 
> the number of corrupted files is larger than the call output limit (which 
> means the list is not complete). There should be a way to hint incompleteness 
> to clients.
> A simple hack would be to add an extra entry to the array returned with the 
> value null. Clients could interpret this as a sign that there are other 
> corrupt files in the system.
> We should also do some rephrasing of the fsck output to make it more 
> confident when the list is complete and less confident when the list is 
> known to be incomplete.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.