[jira] Updated: (HDFS-1292) Allow artifacts to be published to the staging Apache Nexus Maven Repository
[ https://issues.apache.org/jira/browse/HDFS-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated HDFS-1292: - Status: Patch Available (was: Open) Allow artifacts to be published to the staging Apache Nexus Maven Repository Key: HDFS-1292 URL: https://issues.apache.org/jira/browse/HDFS-1292 Project: Hadoop HDFS Issue Type: Bug Components: build Reporter: Tom White Assignee: Giridharan Kesavan Priority: Blocker Fix For: 0.21.0 Attachments: hdfs-1292.patch HDFS companion issue to HADOOP-6847. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1292) Allow artifacts to be published to the staging Apache Nexus Maven Repository
[ https://issues.apache.org/jira/browse/HDFS-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated HDFS-1292: - Status: Open (was: Patch Available) Allow artifacts to be published to the staging Apache Nexus Maven Repository Key: HDFS-1292 URL: https://issues.apache.org/jira/browse/HDFS-1292 Project: Hadoop HDFS Issue Type: Bug Components: build Reporter: Tom White Assignee: Giridharan Kesavan Priority: Blocker Fix For: 0.21.0 Attachments: hdfs-1292.patch HDFS companion issue to HADOOP-6847. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1383) Better error messages on hftp
[ https://issues.apache.org/jira/browse/HDFS-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1383: - Attachment: h1383_20100915_y20.patch h1383_20100915_y20.patch: just checking s != null and added a few unit tests. Better error messages on hftp -- Key: HDFS-1383 URL: https://issues.apache.org/jira/browse/HDFS-1383 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h1383_20100913_y20.patch, h1383_20100915_y20.patch If the file is not accessible, HftpFileSystem returns only a HTTP response code. {noformat} 2010-08-27 20:57:48,091 INFO org.apache.hadoop.tools.DistCp: FAIL README.txt : java.io.IOException: Server returned HTTP response code: 400 for URL: http:/namenode:50070/data/user/tsz/README.txt?ugi=tsz,users at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1290) at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:143) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356) ... {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1383) Better error messages on hftp
[ https://issues.apache.org/jira/browse/HDFS-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909843#action_12909843 ] Tsz Wo (Nicholas), SZE commented on HDFS-1383: -- Below is the output after the patch. {noformat} [r...@host yahoo-hadoop-0.20.1xx]# ./bin/hadoop fs -cat hftp://host.xx.yy:50070/user/root/foo/s.txt cat: user=root, access=EXECUTE, inode=foo:root:supergroup:- [r...@host yahoo-hadoop-0.20.1xx]# ./bin/hadoop fs -cat hftp://host.xx.yy:50070/user/root/foo cat: user=root, access=READ_EXECUTE, inode=foo:root:supergroup:- [r...@host yahoo-hadoop-0.20.1xx]# ./bin/hadoop fs -cat hftp://host.xx.yy:50070/user/root/bar cat: /user/root/bar is a directory (error code=400) {noformat} Better error messages on hftp -- Key: HDFS-1383 URL: https://issues.apache.org/jira/browse/HDFS-1383 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h1383_20100913_y20.patch, h1383_20100915_y20.patch If the file is not accessible, HftpFileSystem returns only a HTTP response code. {noformat} 2010-08-27 20:57:48,091 INFO org.apache.hadoop.tools.DistCp: FAIL README.txt : java.io.IOException: Server returned HTTP response code: 400 for URL: http:/namenode:50070/data/user/tsz/README.txt?ugi=tsz,users at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1290) at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:143) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356) ... {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
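The improved messages in the transcript above pass the server-side reason through instead of the bare HTTP status code. A minimal sketch of that idea, with a hypothetical helper (this is illustrative, not the actual HftpFileSystem code from the patch):

```java
// Hypothetical sketch: prefer a server-supplied reason over the bare HTTP
// status code, in the spirit of the improved hftp error messages above.
public class ErrorMessageDemo {
    // describeFailure is an illustrative helper, not a Hadoop API.
    static String describeFailure(int code, String serverMessage) {
        if (serverMessage != null && !serverMessage.isEmpty()) {
            // e.g. "/user/root/bar is a directory (error code=400)"
            return serverMessage + " (error code=" + code + ")";
        }
        // Fallback mirrors the old, unhelpful message.
        return "Server returned HTTP response code: " + code;
    }

    public static void main(String[] args) {
        System.out.println(describeFailure(400, "/user/root/bar is a directory"));
        System.out.println(describeFailure(400, null));
    }
}
```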
[jira] Commented: (HDFS-779) Automatic move to safe-mode when cluster size drops
[ https://issues.apache.org/jira/browse/HDFS-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909844#action_12909844 ] Robert Chansler commented on HDFS-779: -- I think Dhruba's comment #2 from the 13th supports my contention that counting missing replicas is the key idea. The loss of empty nodes has not been a problem. Too many missing replicas--regardless of the number of missing nodes--has been a problem. And the issue is whether there is a catastrophic circumstance _right now_ rather than whether today is (much) _worse than yesterday_. Does Dhruba's suggestion protect against things becoming exponentially bad but at a rate less than _m_? But supposing a catastrophe is declared by whatever policy, how should the system behave? Retreat to safe mode is intuitively understandable, and answers a lot of questions. I'm always reluctant to withdraw service, and so catastrophe should mean that the users are going to lose in any case. I'm even more reluctant to allow HDFS to continue in an operating mode where things seem to work, but replication has been suspended. If a zillion replicas are missing, the system requires professional attention from an administrator. Automatic move to safe-mode when cluster size drops --- Key: HDFS-779 URL: https://issues.apache.org/jira/browse/HDFS-779 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Reporter: Owen O'Malley Assignee: dhruba borthakur As part of looking at using Kerberos, we want to avoid the case where both the primary (and optional secondary) KDC go offline causing a replication storm as the DataNodes' service tickets time out and they lose the ability to connect to the NameNode. However, this is a specific case of a more general problem of losing too many nodes too quickly. I think we should have an option to go into safe mode if the cluster size goes down more than N% in terms of DataNodes. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
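One way to make "too many missing replicas, regardless of missing nodes" concrete is a threshold on the missing-replica fraction. A sketch under assumed names (this is not code from this issue, just the shape of the policy being discussed):

```java
// Hypothetical policy sketch: enter safe mode when the fraction of missing
// replicas, rather than the count of missing DataNodes, crosses a threshold.
public class SafeModePolicy {
    static boolean shouldEnterSafeMode(long expectedReplicas, long liveReplicas,
                                       double maxMissingFraction) {
        if (expectedReplicas == 0) {
            return false; // empty namespace: nothing can be missing
        }
        double missing = (double) (expectedReplicas - liveReplicas) / expectedReplicas;
        return missing > maxMissingFraction;
    }

    public static void main(String[] args) {
        // 5% of replicas missing with a 10% threshold: stay in normal operation.
        System.out.println(shouldEnterSafeMode(1000, 950, 0.10));
        // 20% missing: retreat to safe mode for administrator attention.
        System.out.println(shouldEnterSafeMode(1000, 800, 0.10));
    }
}
```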
[jira] Commented: (HDFS-1383) Better error messages on hftp
[ https://issues.apache.org/jira/browse/HDFS-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909854#action_12909854 ] Suresh Srinivas commented on HDFS-1383: --- +1 for the patch. One minor comment - There is a change in UGI with this patch. Was that intentionally introduced by this patch? Better error messages on hftp -- Key: HDFS-1383 URL: https://issues.apache.org/jira/browse/HDFS-1383 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h1383_20100913_y20.patch, h1383_20100915_y20.patch If the file is not accessible, HftpFileSystem returns only a HTTP response code. {noformat} 2010-08-27 20:57:48,091 INFO org.apache.hadoop.tools.DistCp: FAIL README.txt : java.io.IOException: Server returned HTTP response code: 400 for URL: http:/namenode:50070/data/user/tsz/README.txt?ugi=tsz,users at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1290) at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:143) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356) ... {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1383) Better error messages on hftp
[ https://issues.apache.org/jira/browse/HDFS-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909864#action_12909864 ] Tsz Wo (Nicholas), SZE commented on HDFS-1383: -- No, I don't want to change UGI. Thanks for reviewing it. Better error messages on hftp -- Key: HDFS-1383 URL: https://issues.apache.org/jira/browse/HDFS-1383 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h1383_20100913_y20.patch, h1383_20100915_y20.patch If the file is not accessible, HftpFileSystem returns only a HTTP response code. {noformat} 2010-08-27 20:57:48,091 INFO org.apache.hadoop.tools.DistCp: FAIL README.txt : java.io.IOException: Server returned HTTP response code: 400 for URL: http:/namenode:50070/data/user/tsz/README.txt?ugi=tsz,users at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1290) at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:143) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356) ... {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1383) Better error messages on hftp
[ https://issues.apache.org/jira/browse/HDFS-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1383: - Attachment: h1383_20100915b_y20.patch h1383_20100915b_y20.patch: reverted the change in UserGroupInformation Better error messages on hftp -- Key: HDFS-1383 URL: https://issues.apache.org/jira/browse/HDFS-1383 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h1383_20100913_y20.patch, h1383_20100915_y20.patch, h1383_20100915b_y20.patch If the file is not accessible, HftpFileSystem returns only a HTTP response code. {noformat} 2010-08-27 20:57:48,091 INFO org.apache.hadoop.tools.DistCp: FAIL README.txt : java.io.IOException: Server returned HTTP response code: 400 for URL: http:/namenode:50070/data/user/tsz/README.txt?ugi=tsz,users at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1290) at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:143) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356) ... {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1399) Distinct minicluster services (e.g. NN and JT) overwrite each other's service policies
[ https://issues.apache.org/jira/browse/HDFS-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1399: - Attachment: hdfs-1399.1.txt Updated patch to address Todd's comments. Distinct minicluster services (e.g. NN and JT) overwrite each other's service policies -- Key: HDFS-1399 URL: https://issues.apache.org/jira/browse/HDFS-1399 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.22.0 Attachments: hdfs-1399.1.txt, hdfs-1399.txt.0 HDFS portion of HADOOP-6951. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Moved: (HDFS-1402) Optimize input split creation
[ https://issues.apache.org/jira/browse/HDFS-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Burkhardt moved MAPREDUCE-1973 to HDFS-1402: - Project: Hadoop HDFS (was: Hadoop Map/Reduce) Key: HDFS-1402 (was: MAPREDUCE-1973) Affects Version/s: 0.22.0 (was: 0.20.1) (was: 0.20.2) Optimize input split creation - Key: HDFS-1402 URL: https://issues.apache.org/jira/browse/HDFS-1402 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.22.0 Environment: Intel Nehalem cluster running Red Hat. Reporter: Paul Burkhardt Priority: Minor Attachments: HADOOP-1973.patch The input split returns the locations that host the file blocks in the split. The locations are determined by the getBlockLocations method of the filesystem client which requires a remote connection to the filesystem (i.e. HDFS). The remote connection is made for each file in the entire input split. For jobs with many input files the network connections dominate the cost of writing the input split file. A job requests a listing of the input files from the remote filesystem and creates a FileStatus object as a handle for each file in the listing. The FileStatus object can be imbued with the necessary host information on the remote end and passed to the client-side in the bulk return of the listing request. A getHosts method of the FileStatus would then return the locations for the blocks comprising that file and eliminate the need for another trip to the remote filesystem. The INodeFile maintains the blocks for a file and is an obvious choice to be the originator for the locations of that file. It is also available to the FSDirectory which first creates the listing of FileStatus objects. We propose that the block locations be generated by the INodeFile to instantiate the FileStatus object during the getListing request. Our tests demonstrated a factor of 2000 speedup for approximately 60,000 input files. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1402) Optimize input split creation
[ https://issues.apache.org/jira/browse/HDFS-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Burkhardt updated HDFS-1402: - Attachment: HDFS-1402.patch HDFS-1402.common.patch Patched against the trunk. Optimize input split creation - Key: HDFS-1402 URL: https://issues.apache.org/jira/browse/HDFS-1402 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.22.0 Environment: Intel Nehalem cluster running Red Hat. Reporter: Paul Burkhardt Priority: Minor Attachments: HADOOP-1973.patch, HDFS-1402.common.patch, HDFS-1402.patch The input split returns the locations that host the file blocks in the split. The locations are determined by the getBlockLocations method of the filesystem client which requires a remote connection to the filesystem (i.e. HDFS). The remote connection is made for each file in the entire input split. For jobs with many input files the network connections dominate the cost of writing the input split file. A job requests a listing of the input files from the remote filesystem and creates a FileStatus object as a handle for each file in the listing. The FileStatus object can be imbued with the necessary host information on the remote end and passed to the client-side in the bulk return of the listing request. A getHosts method of the FileStatus would then return the locations for the blocks comprising that file and eliminate the need for another trip to the remote filesystem. The INodeFile maintains the blocks for a file and is an obvious choice to be the originator for the locations of that file. It is also available to the FSDirectory which first creates the listing of FileStatus objects. We propose that the block locations be generated by the INodeFile to instantiate the FileStatus object during the getListing request. Our tests demonstrated a factor of 2000 speedup for approximately 60,000 input files. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1383) Better error messages on hftp
[ https://issues.apache.org/jira/browse/HDFS-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1383: - Status: Patch Available (was: Open) Fix Version/s: 0.22.0 Better error messages on hftp -- Key: HDFS-1383 URL: https://issues.apache.org/jira/browse/HDFS-1383 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.22.0 Attachments: h1383_20100913_y20.patch, h1383_20100915_y20.patch, h1383_20100915b.patch, h1383_20100915b_y20.patch If the file is not accessible, HftpFileSystem returns only a HTTP response code. {noformat} 2010-08-27 20:57:48,091 INFO org.apache.hadoop.tools.DistCp: FAIL README.txt : java.io.IOException: Server returned HTTP response code: 400 for URL: http:/namenode:50070/data/user/tsz/README.txt?ugi=tsz,users at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1290) at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:143) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356) ... {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1383) Better error messages on hftp
[ https://issues.apache.org/jira/browse/HDFS-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1383: - Attachment: h1383_20100915b.patch h1383_20100915b.patch: for trunk Better error messages on hftp -- Key: HDFS-1383 URL: https://issues.apache.org/jira/browse/HDFS-1383 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.22.0 Attachments: h1383_20100913_y20.patch, h1383_20100915_y20.patch, h1383_20100915b.patch, h1383_20100915b_y20.patch If the file is not accessible, HftpFileSystem returns only a HTTP response code. {noformat} 2010-08-27 20:57:48,091 INFO org.apache.hadoop.tools.DistCp: FAIL README.txt : java.io.IOException: Server returned HTTP response code: 400 for URL: http:/namenode:50070/data/user/tsz/README.txt?ugi=tsz,users at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1290) at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:143) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356) ... {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1402) Optimize input split creation
[ https://issues.apache.org/jira/browse/HDFS-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909924#action_12909924 ] Paul Burkhardt commented on HDFS-1402: -- I decided to patch against the trunk. The changes span both HDFS and Common but I attached two separate patches to this ticket for now. As previously noted, this patch addresses the same core issue as HDFS-202. My concern is HDFS-202 adds a parallel set of interfaces to support file status objects with location information. My argument is the locations of a file should be a first-class attribute shared by all file types. If we force an interface, getHosts or getLocations, for any file status type we can simplify the client and server API for creating and listing file status objects. File status types from a distributed file system, i.e. HDFS, return the hosts for the file blocks whereas a file status type from a non-distributed or local file system would return a single host, all by the same interface. Optimize input split creation - Key: HDFS-1402 URL: https://issues.apache.org/jira/browse/HDFS-1402 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.22.0 Environment: Intel Nehalem cluster running Red Hat. Reporter: Paul Burkhardt Priority: Minor Attachments: HADOOP-1973.patch, HDFS-1402.common.patch, HDFS-1402.patch The input split returns the locations that host the file blocks in the split. The locations are determined by the getBlockLocations method of the filesystem client which requires a remote connection to the filesystem (i.e. HDFS). The remote connection is made for each file in the entire input split. For jobs with many input files the network connections dominate the cost of writing the input split file. A job requests a listing of the input files from the remote filesystem and creates a FileStatus object as a handle for each file in the listing. 
The FileStatus object can be imbued with the necessary host information on the remote end and passed to the client-side in the bulk return of the listing request. A getHosts method of the FileStatus would then return the locations for the blocks comprising that file and eliminate the need for another trip to the remote filesystem. The INodeFile maintains the blocks for a file and is an obvious choice to be the originator for the locations of that file. It is also available to the FSDirectory which first creates the listing of FileStatus objects. We propose that the block locations be generated by the INodeFile to instantiate the FileStatus object during the getListing request. Our tests demonstrated a factor of 2000 speedup for approximately 60,000 input files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
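The proposal above can be sketched in miniature: the listing call returns each file's block hosts along with its status, so the client avoids one getBlockLocations round trip per input file. All names below are illustrative, not the patch's actual API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the HDFS-1402 idea: attach block locations to the
// listing result itself, eliminating a per-file remote call.
public class LocatedListingDemo {
    // Stand-in for a FileStatus that already carries its block hosts.
    static class LocatedStatus {
        final String path;
        private final List<String> hosts;
        LocatedStatus(String path, List<String> hosts) { this.path = path; this.hosts = hosts; }
        // The proposed first-class accessor: no second trip to the namenode.
        List<String> getHosts() { return hosts; }
    }

    // "Server side": one bulk listing call populates locations for every
    // entry, the way an INodeFile could during getListing.
    static List<LocatedStatus> getListing(Map<String, List<String>> blocksByPath) {
        List<LocatedStatus> listing = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : blocksByPath.entrySet()) {
            listing.add(new LocatedStatus(e.getKey(), e.getValue()));
        }
        return listing;
    }

    public static void main(String[] args) {
        Map<String, List<String>> ns = new LinkedHashMap<>();
        ns.put("/input/part-00000", Arrays.asList("dn1", "dn2"));
        ns.put("/input/part-00001", Arrays.asList("dn2", "dn3"));
        // One bulk call; locations come back with the listing.
        for (LocatedStatus s : getListing(ns)) {
            System.out.println(s.path + " -> " + s.getHosts());
        }
    }
}
```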
[jira] Resolved: (HDFS-375) DFSClient cpu overhead is too high
[ https://issues.apache.org/jira/browse/HDFS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HDFS-375. - Resolution: Not A Problem I believe this issue went stale. Closing. DFSClient cpu overhead is too high -- Key: HDFS-375 URL: https://issues.apache.org/jira/browse/HDFS-375 Project: Hadoop HDFS Issue Type: Improvement Reporter: Runping Qi When we do dfs throughput test using hadoop dfs -cat, we have observed that the client side cpu usage is very high, 3 to five times that of a data node serving the file. Before 0.18, the data node cpu usage was equally high, and this problem is fixed since 0.18. However, the client side problem still exists. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HDFS-198) org.apache.hadoop.dfs.LeaseExpiredException during dfs write
[ https://issues.apache.org/jira/browse/HDFS-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HDFS-198. - Resolution: Not A Problem I believe this issue went stale. Closing. org.apache.hadoop.dfs.LeaseExpiredException during dfs write Key: HDFS-198 URL: https://issues.apache.org/jira/browse/HDFS-198 Project: Hadoop HDFS Issue Type: Bug Reporter: Runping Qi Many long running cpu intensive map tasks failed due to org.apache.hadoop.dfs.LeaseExpiredException. Here is except from the log: 2008-10-26 11:54:17,282 INFO org.apache.hadoop.dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.LeaseExpiredException: No lease on /xxx/_temporary/_task_200810232126_0001_m_33_0/part-00033 File does not exist. [Lease. Holder: 44 46 53 43 6c 69 65 6e 74 5f 74 61 73 6b 5f 32 30 30 38 31 30 32 33 32 31 32 36 5f 30 30 30 31 5f 6d 5f 30 30 30 30 33 33 5f 30, heldlocks: 0, pendingcreates: 1] at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1194) at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1125) at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300) at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896) at org.apache.hadoop.ipc.Client.call(Client.java:557) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212) at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2335) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2220) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1700(DFSClient.java:1702) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1842) 2008-10-26 11:54:17,282 WARN org.apache.hadoop.dfs.DFSClient: NotReplicatedYetException sleeping /xxx/_temporary/_task_200810232126_0001_m_33_0/part-00033 retries left 2 2008-10-26 11:54:18,886 INFO org.apache.hadoop.dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.LeaseExpiredException: No lease on /xxx/_temporary/_task_200810232126_0001_m_33_0/part-00033 File does not exist. [Lease. 
Holder: 44 46 53 43 6c 69 65 6e 74 5f 74 61 73 6b 5f 32 30 30 38 31 30 32 33 32 31 32 36 5f 30 30 30 31 5f 6d 5f 30 30 30 30 33 33 5f 30, heldlocks: 0, pendingcreates: 1] at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1194) at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1125) at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300) at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896) at org.apache.hadoop.ipc.Client.call(Client.java:557) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212) at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source) at
[jira] Resolved: (HDFS-106) DataNode log message includes toString of an array
[ https://issues.apache.org/jira/browse/HDFS-106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HDFS-106. - Resolution: Not A Problem Thanks Rong-En for checking it. Closing. DataNode log message includes toString of an array -- Key: HDFS-106 URL: https://issues.apache.org/jira/browse/HDFS-106 Project: Hadoop HDFS Issue Type: Bug Reporter: Nigel Daley Priority: Minor DataNode.java line 596: LOG.info(Starting thread to transfer block + blocks[i] + to + xferTargets[i]); xferTargets is a two dimensional array, so this line calls toString on the array referenced by xferTargets[i]. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
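The bug described above is the classic array-logging pitfall: concatenating an array object into a string prints a type-and-hash string rather than the elements. A small demonstration of the usual fix (the block name and class here are made up for illustration):

```java
import java.util.Arrays;

// Logging xferTargets[i] directly yields something like
// "[Ljava.lang.String;@1b6d3586"; Arrays.toString renders the elements.
public class ArrayLogDemo {
    static String transferMessage(String block, String[] targets) {
        return "Starting thread to transfer block " + block + " to " + Arrays.toString(targets);
    }

    public static void main(String[] args) {
        String[][] xferTargets = { { "datanode1:50010", "datanode2:50010" } };
        System.out.println(transferMessage("blk_1", xferTargets[0]));
        // prints: Starting thread to transfer block blk_1 to [datanode1:50010, datanode2:50010]
    }
}
```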
[jira] Updated: (HDFS-1292) Allow artifacts to be published to the staging Apache Nexus Maven Repository
[ https://issues.apache.org/jira/browse/HDFS-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated HDFS-1292: Attachment: HDFS-1292.patch Thanks for working on this, Giri. I tried to upload some artifacts to the staging repository but got the following error when I tried to close the release: {noformat} Staging Signature Validation -Missing Signature: '/org/apache/hadoop/hadoop-hdfs-instrumented/0.22.0/hadoop-hdfs-instrumented-0.22.0-sources.jar.asc' does not exist for 'hadoop-hdfs-instrumented-0.22.0-sources.jar'. -Invalid Signature: '/org/apache/hadoop/hadoop-hdfs-instrumented/0.22.0/hadoop-hdfs-instrumented-0.22.0.jar.asc' is not a valid signature for 'hadoop-hdfs-instrumented-0.22.0.jar'. -Missing Signature: '/org/apache/hadoop/hadoop-hdfs-test/0.22.0/hadoop-hdfs-test-0.22.0-sources.jar.asc' does not exist for 'hadoop-hdfs-test-0.22.0-sources.jar'. -Invalid Signature: '/org/apache/hadoop/hadoop-hdfs-test/0.22.0/hadoop-hdfs-test-0.22.0.jar.asc' is not a valid signature for 'hadoop-hdfs-test-0.22.0.jar'. -Invalid Signature: '/org/apache/hadoop/hadoop-hdfs/0.22.0/hadoop-hdfs-0.22.0.jar.asc' is not a valid signature for 'hadoop-hdfs-0.22.0.jar'. -Missing Signature: '/org/apache/hadoop/hadoop-hdfs/0.22.0/hadoop-hdfs-0.22.0-sources.jar.asc' does not exist for 'hadoop-hdfs-0.22.0-sources.jar'. -Invalid Signature: '/org/apache/hadoop/hadoop-hdfs-instrumented-test/0.22.0/hadoop-hdfs-instrumented-test-0.22.0.jar.asc' is not a valid signature for 'hadoop-hdfs-instrumented-test-0.22.0.jar'. -Missing Signature: '/org/apache/hadoop/hadoop-hdfs-instrumented-test/0.22.0/hadoop-hdfs-instrumented-test-0.22.0-sources.jar.asc' does not exist for 'hadoop-hdfs-instrumented-test-0.22.0-sources.jar'. {noformat} I think that the sources.jar.asc is overwriting the jar.asc file. This can be fixed by adding a {{classifier=sources}} attribute to the attach element. I made this change in the attached patch, and the close was successful. 
Allow artifacts to be published to the staging Apache Nexus Maven Repository
----------------------------------------------------------------------------

                 Key: HDFS-1292
                 URL: https://issues.apache.org/jira/browse/HDFS-1292
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: build
            Reporter: Tom White
            Assignee: Giridharan Kesavan
            Priority: Blocker
             Fix For: 0.21.0
         Attachments: HDFS-1292.patch, hdfs-1292.patch

HDFS companion issue to HADOOP-6847.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
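For context, the {{classifier=sources}} fix described above can be sketched as follows. This is an illustrative fragment in the style of the Maven Ant Tasks {{artifact:deploy}}/{{attach}} elements, not the actual attached patch; the property names and POM reference are assumptions:

```xml
<!-- Illustrative sketch only, not the HDFS-1292 patch. Without
     classifier="sources", the sources jar's signature is attached under the
     same coordinates as the main jar's signature, so one .asc file
     overwrites the other in the staging repository. -->
<artifact:deploy file="${hdfs.jar}">
  <remoteRepository id="apache.staging.https"
                    url="https://repository.apache.org/service/local/staging/deploy/maven2"/>
  <pom refid="hadoop.hdfs"/>
  <!-- the sources jar and its signature must carry the sources classifier -->
  <attach file="${hdfs-sources.jar}" classifier="sources"/>
  <attach file="${hdfs-sources.jar}.asc" classifier="sources" type="jar.asc"/>
  <!-- the main jar's signature keeps the default (empty) classifier -->
  <attach file="${hdfs.jar}.asc" type="jar.asc"/>
</artifact:deploy>
```

With distinct classifiers, Nexus stores {{hadoop-hdfs-0.22.0.jar.asc}} and {{hadoop-hdfs-0.22.0-sources.jar.asc}} side by side, which is why the staging close succeeds after the change.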
[jira] Created: (HDFS-1403) add -truncate option to fsck
add -truncate option to fsck
----------------------------

                 Key: HDFS-1403
                 URL: https://issues.apache.org/jira/browse/HDFS-1403
             Project: Hadoop HDFS
          Issue Type: New Feature
          Components: hdfs client, name-node
            Reporter: sam rash

When running fsck, it would be useful to be able to tell HDFS to truncate any corrupt file to the last valid position in the last block. Then, by running hadoop fsck, an admin can clean up the filesystem.
[jira] Commented: (HDFS-1383) Better error messages on hftp
[ https://issues.apache.org/jira/browse/HDFS-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909986#action_12909986 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1383:
----------------------------------------------

Ran unit tests. TestFiHFlush failed. See HDFS-1206.

Better error messages on hftp
-----------------------------

                 Key: HDFS-1383
                 URL: https://issues.apache.org/jira/browse/HDFS-1383
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Tsz Wo (Nicholas), SZE
            Assignee: Tsz Wo (Nicholas), SZE
             Fix For: 0.22.0
         Attachments: h1383_20100913_y20.patch, h1383_20100915_y20.patch, h1383_20100915b.patch, h1383_20100915b_y20.patch

If the file is not accessible, HftpFileSystem returns only an HTTP response code.

{noformat}
2010-08-27 20:57:48,091 INFO org.apache.hadoop.tools.DistCp: FAIL README.txt : java.io.IOException: Server returned HTTP response code: 400 for URL: http:/namenode:50070/data/user/tsz/README.txt?ugi=tsz,users
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1290)
        at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:143)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
        ...
{noformat}
[jira] Commented: (HDFS-1383) Better error messages on hftp
[ https://issues.apache.org/jira/browse/HDFS-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909987#action_12909987 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1383:
----------------------------------------------

ant test-patch
{noformat}
     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 17 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec]
     [exec]     +1 system tests framework.  The patch passed system tests framework compile.
{noformat}
[jira] Commented: (HDFS-1383) Better error messages on hftp
[ https://issues.apache.org/jira/browse/HDFS-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909989#action_12909989 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1383:
----------------------------------------------

Tested manually again. It works fine.
{noformat}
[r...@yahoo-hadoop-0.20.1xx]# ./bin/hadoop fs -cat /user/tsz/r.txt
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=READ, inode=r.txt:tsz:supergroup:-
{noformat}
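The kind of improvement verified above can be sketched as a small helper: instead of surfacing only "Server returned HTTP response code: 400", fold the server's error body into the message the user sees. This is an illustrative sketch, not the actual HDFS-1383 patch; the class and method names are hypothetical:

```java
// Illustrative sketch (not the actual HDFS-1383 patch): combine the HTTP
// response code, the request URL, and the server's error body into one
// human-readable message. All names here are hypothetical.
class HftpErrorMessages {
    static String toRicherMessage(int responseCode, String serverBody, String url) {
        StringBuilder sb = new StringBuilder();
        sb.append("Server returned HTTP response code ").append(responseCode)
          .append(" for URL: ").append(url);
        // Append the server-side exception text when one was sent back,
        // e.g. an AccessControlException's "Permission denied" detail.
        if (serverBody != null && !serverBody.trim().isEmpty()) {
            sb.append("\n").append(serverBody.trim());
        }
        return sb.toString();
    }
}
```

With something like this, the DistCp failure in the issue description would report the underlying permission problem rather than a bare 400.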
[jira] Commented: (HDFS-1403) add -truncate option to fsck
[ https://issues.apache.org/jira/browse/HDFS-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910003#action_12910003 ]

dhruba borthakur commented on HDFS-1403:
----------------------------------------

This is especially needed when the system supports hflush. A client could issue an hflush, which persists block locations in the namenode. The client could then fail before writing any bytes to that block. In that case, the last block of the file will be permanently missing. It would be nice to have an option to fsck to delete the last block of a file if it is of size zero and has no valid replicas.
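The proposed behavior can be modeled in a few lines: given a file's block lengths and which blocks still have a valid replica, the file should be truncated at the end of the last fully valid prefix of blocks. This is only an illustrative model of the intent, not an fsck implementation; all names are hypothetical:

```java
// Illustrative model of the proposed fsck -truncate behavior (not actual
// HDFS code, all names hypothetical): keep blocks from the front of the
// file while they have valid replicas, and truncate at the first block
// that does not. A zero-length trailing block with no replicas (the
// hflush-then-crash case above) contributes nothing and is dropped.
class TruncatePlan {
    static long truncatedLength(long[] blockLengths, boolean[] hasValidReplica) {
        long length = 0;
        for (int i = 0; i < blockLengths.length; i++) {
            if (!hasValidReplica[i]) {
                break; // everything from the first bad block onward is discarded
            }
            length += blockLengths[i];
        }
        return length;
    }
}
```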