[jira] [Updated] (HDFS-10835) Typo in httpfs.sh hadoop_usage
[ https://issues.apache.org/jira/browse/HDFS-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10835: -- Status: Patch Available (was: Open) > Typo in httpfs.sh hadoop_usage > -- > > Key: HDFS-10835 > URL: https://issues.apache.org/jira/browse/HDFS-10835 > Project: Hadoop HDFS > Issue Type: Bug > Components: httpfs >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > Attachments: HDFS-10835.001.patch > > > Typo in method {{hadoop_usage}} of {{httpfs.sh}}. The {{kms}} words should be > replaced with {{httpfs}}: > {noformat} > function hadoop_usage > { > hadoop_add_subcommand "run" "Start kms in the current window" > hadoop_add_subcommand "run -security" "Start in the current window with > security manager" > hadoop_add_subcommand "start" "Start kms in a separate window" > hadoop_add_subcommand "start -security" "Start in a separate window with > security manager" > hadoop_add_subcommand "status" "Return the LSB compliant status" > hadoop_add_subcommand "stop" "Stop kms, waiting up to 5 seconds for the > process to end" > hadoop_add_subcommand "top n" "Stop kms, waiting up to n seconds for the > process to end" > hadoop_add_subcommand "stop -force" "Stop kms, wait up to 5 seconds and > then use kill -KILL if still running" > hadoop_add_subcommand "stop n -force" "Stop kms, wait up to n seconds and > then use kill -KILL if still running" > hadoop_generate_usage "${MYNAME}" false > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10835) Typo in httpfs.sh hadoop_usage
[ https://issues.apache.org/jira/browse/HDFS-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-10835:
--
    Attachment: HDFS-10835.001.patch

Patch 001:
* Replace the word {{kms}} with {{httpfs}}
* No unit test for the startup script

Manual test output:
{noformat}
$ sbin/httpfs.sh -h
Usage: httpfs.sh [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]

  OPTIONS is none or any of:

--config dir      Hadoop config directory
--debug           turn on shell script debug mode
--help            usage information

  SUBCOMMAND is one of:

run -security     Start in the current window with security manager
run               Start httpfs in the current window
start -security   Start in a separate window with security manager
start             Start httpfs in a separate window
status            Return the LSB compliant status
stop -force       Stop httpfs, wait up to 5 seconds and then use kill -KILL if still running
stop n -force     Stop httpfs, wait up to n seconds and then use kill -KILL if still running
stop              Stop httpfs, waiting up to 5 seconds for the process to end
top n             Stop httpfs, waiting up to n seconds for the process to end

SUBCOMMAND may print help when invoked w/o parameters or with -h.
{noformat}

> Typo in httpfs.sh hadoop_usage
> --
>
> Key: HDFS-10835
> URL: https://issues.apache.org/jira/browse/HDFS-10835
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: httpfs
> Affects Versions: 2.6.0
> Reporter: John Zhuge
> Assignee: John Zhuge
> Priority: Trivial
> Attachments: HDFS-10835.001.patch
>
>
> Typo in method {{hadoop_usage}} of {{httpfs.sh}}.
The {{kms}} words should be > replaced with {{httpfs}}: > {noformat} > function hadoop_usage > { > hadoop_add_subcommand "run" "Start kms in the current window" > hadoop_add_subcommand "run -security" "Start in the current window with > security manager" > hadoop_add_subcommand "start" "Start kms in a separate window" > hadoop_add_subcommand "start -security" "Start in a separate window with > security manager" > hadoop_add_subcommand "status" "Return the LSB compliant status" > hadoop_add_subcommand "stop" "Stop kms, waiting up to 5 seconds for the > process to end" > hadoop_add_subcommand "top n" "Stop kms, waiting up to n seconds for the > process to end" > hadoop_add_subcommand "stop -force" "Stop kms, wait up to 5 seconds and > then use kill -KILL if still running" > hadoop_add_subcommand "stop n -force" "Stop kms, wait up to n seconds and > then use kill -KILL if still running" > hadoop_generate_usage "${MYNAME}" false > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10835) Typo in httpfs.sh hadoop_usage
John Zhuge created HDFS-10835: - Summary: Typo in httpfs.sh hadoop_usage Key: HDFS-10835 URL: https://issues.apache.org/jira/browse/HDFS-10835 Project: Hadoop HDFS Issue Type: Bug Components: httpfs Affects Versions: 2.6.0 Reporter: John Zhuge Assignee: John Zhuge Priority: Trivial Typo in method {{hadoop_usage}} of {{httpfs.sh}}. The {{kms}} words should be replaced with {{httpfs}}: {noformat} function hadoop_usage { hadoop_add_subcommand "run" "Start kms in the current window" hadoop_add_subcommand "run -security" "Start in the current window with security manager" hadoop_add_subcommand "start" "Start kms in a separate window" hadoop_add_subcommand "start -security" "Start in a separate window with security manager" hadoop_add_subcommand "status" "Return the LSB compliant status" hadoop_add_subcommand "stop" "Stop kms, waiting up to 5 seconds for the process to end" hadoop_add_subcommand "top n" "Stop kms, waiting up to n seconds for the process to end" hadoop_add_subcommand "stop -force" "Stop kms, wait up to 5 seconds and then use kill -KILL if still running" hadoop_add_subcommand "stop n -force" "Stop kms, wait up to n seconds and then use kill -KILL if still running" hadoop_generate_usage "${MYNAME}" false } {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15459574#comment-15459574 ] John Zhuge commented on HDFS-10684: --- Thanks [~andrew.wang] for the comment. I am looking into unit testing to ensure compatibility. > WebHDFS DataNode calls fail without parameter createparent > -- > > Key: HDFS-10684 > URL: https://issues.apache.org/jira/browse/HDFS-10684 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.1 >Reporter: Samuel Low >Assignee: John Zhuge >Priority: Blocker > Labels: compatibility, webhdfs > Attachments: HDFS-10684.001-branch-2.patch > > > Optional boolean parameters that are not provided in the URL cause the > WebHDFS create file command to fail. > curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false"; > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false"; > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. 
> However, this is not possible when I have no control over the WebHDFS calls, > e.g. Ambari and Hue have errors due to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10822) Log DataNodes in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15459222#comment-15459222 ] John Zhuge commented on HDFS-10822: --- Thanks [~eddyxu] for the review and commit. > Log DataNodes in the write pipeline > --- > > Key: HDFS-10822 > URL: https://issues.apache.org/jira/browse/HDFS-10822 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > Labels: supportability > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: HDFS-10822.001.patch > > > Trying to diagnose a slow HDFS flush, taking longer than 10 seconds, but did > not know which DNs were involved in the write pipeline. Of course, I could > search NN log for the list of DNs, but it might not be possible sometimes or > convenient. > Propose to add a DEBUG trace to DataStreamer#setPipeline to print the list of > DNs in the pipeline. > BTW, we do print the list of DNs during pipeline recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446784#comment-15446784 ] John Zhuge edited comment on HDFS-10684 at 9/1/16 9:52 PM: --- [~loungerdork], I was able to reproduce the problem with NN running 2.7.1 and DN running 2.8. HDFS-8435 (in 2.8 but not in 2.7) introduced new parameter {{createparent}} to op {{CREATE}}. was (Author: jzhuge): [~loungerdork], I was able to reproduce the problem with NN running 2.7.1 and DN running 2.8. HDFS-8435 (in 2.8 but not in 2.7) introduced new parameter {{createparent}}. > WebHDFS DataNode calls fail without parameter createparent > -- > > Key: HDFS-10684 > URL: https://issues.apache.org/jira/browse/HDFS-10684 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.1 >Reporter: Samuel Low >Assignee: John Zhuge >Priority: Blocker > Labels: compatibility, webhdfs > Attachments: HDFS-10684.001-branch-2.patch > > > Optional boolean parameters that are not provided in the URL cause the > WebHDFS create file command to fail. 
> curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false"; > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false"; > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. > However, this is not possible when I have no control over the WebHDFS calls, > e.g. Ambari and Hue have errors due to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15456003#comment-15456003 ]

John Zhuge edited comment on HDFS-10684 at 9/1/16 7:22 PM:
---

Patch 001-branch-2:
* Start with a branch-2 patch because mixed version testing is only possible between 2.7 and branch-2.
* There is no unit testing due to the difficulty of running mixed versions of NNs and DNs.
* Pass the JIRA test case manually between a 2.7 NN and a branch-2 DN.

was (Author: jzhuge):
Patch 001-branch-2:
* Start with branch-2 patch because mixed version testing is only probably between 2.7 and branch-2. Should merge the change to 2.8 and later.
* There is no unit testing due to the difficulty of mixed versions of NNs and DNs
* Pass the JIRA test case manually between an 2.7 NN and a branch-2 DN.

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 2.7.1
> Reporter: Samuel Low
> Assignee: John Zhuge
> Labels: compatibility
> Attachments: HDFS-10684.001-branch-2.patch
>
>
> Optional boolean parameters that are not provided in the URL cause the
> WebHDFS create file command to fail.
> curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false"; > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false"; > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. > However, this is not possible when I have no control over the WebHDFS calls, > e.g. Ambari and Hue have errors due to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10822) Log DataNodes in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10822: -- Status: Patch Available (was: Open) > Log DataNodes in the write pipeline > --- > > Key: HDFS-10822 > URL: https://issues.apache.org/jira/browse/HDFS-10822 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > Labels: supportability > Attachments: HDFS-10822.001.patch > > > Trying to diagnose a slow HDFS flush, taking longer than 10 seconds, but did > not know which DNs were involved in the write pipeline. Of course, I could > search NN log for the list of DNs, but it might not be possible sometimes or > convenient. > Propose to add a DEBUG trace to DataStreamer#setPipeline to print the list of > DNs in the pipeline. > BTW, we do print the list of DNs during pipeline recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10822) Log DataNodes in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10822: -- Attachment: HDFS-10822.001.patch Patch 001: * {{initDataStreaming}} is a better place to add DEBUG msg because the log msg will have {{file}}, {{block}}, and {{pipeline}} all together. * No unit test because the patch adds DEBUG msg only. {{TestDataStream}} log that shows the new DEBUG msg: {noformat} 2016-09-01 11:54:18,470 [DataStreamer for file /file1 block BP-1484500550-172.16.1.255-1472756056162:blk_1073741825_1001] DEBUG hdfs.DataStreamer (DataStreamer.java:initDataStreaming(512)) - nodes [DatanodeInfoWithStorage[127.0.0.1:61765,DS-124a1018-6033-4f81-bc43-cba740bd9538,DISK]] storageTypes [DISK] storageIDs [DS-124a1018-6033-4f81-bc43-cba740bd9538] {noformat} > Log DataNodes in the write pipeline > --- > > Key: HDFS-10822 > URL: https://issues.apache.org/jira/browse/HDFS-10822 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > Labels: supportability > Attachments: HDFS-10822.001.patch > > > Trying to diagnose a slow HDFS flush, taking longer than 10 seconds, but did > not know which DNs were involved in the write pipeline. Of course, I could > search NN log for the list of DNs, but it might not be possible sometimes or > convenient. > Propose to add a DEBUG trace to DataStreamer#setPipeline to print the list of > DNs in the pipeline. > BTW, we do print the list of DNs during pipeline recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
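The DEBUG message described in Patch 001 above can be illustrated with a minimal, self-contained sketch. This is not the code from HDFS-10822.001.patch: the class name {{PipelineLogSketch}}, both method names, and the use of {{java.util.logging}} (instead of Hadoop's own logging setup) are assumptions made for illustration only. It shows the general pattern of building one message that lists nodes, storage types, and storage IDs together, and of guarding the call so the string is only built when debug-level logging is enabled.

```java
import java.util.Arrays;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; not the actual DataStreamer code. */
final class PipelineLogSketch {
    private static final Logger LOG =
        Logger.getLogger(PipelineLogSketch.class.getName());

    /** Builds a single line listing the pipeline members, similar in shape to the log above. */
    static String describePipeline(String[] nodes, String[] storageTypes, String[] storageIDs) {
        return "nodes " + Arrays.toString(nodes)
            + " storageTypes " + Arrays.toString(storageTypes)
            + " storageIDs " + Arrays.toString(storageIDs);
    }

    /** Logs the pipeline at FINE (debug) level; the guard avoids building the string otherwise. */
    static void logPipeline(String[] nodes, String[] storageTypes, String[] storageIDs) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(describePipeline(nodes, storageTypes, storageIDs));
        }
    }

    public static void main(String[] args) {
        String[] nodes = {"127.0.0.1:61765"};
        String[] types = {"DISK"};
        String[] ids = {"DS-124a1018-6033-4f81-bc43-cba740bd9538"};
        // Print the message directly so the sketch shows something without logger configuration.
        System.out.println(describePipeline(nodes, types, ids));
        logPipeline(nodes, types, ids);
    }
}
```

Putting the message where file, block, and pipeline are all known, as the patch notes suggest for {{initDataStreaming}}, means one log line is enough to correlate a slow flush with its DataNodes.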
[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10684: -- Status: Patch Available (was: Open) > WebHDFS DataNode calls fail without parameter createparent > -- > > Key: HDFS-10684 > URL: https://issues.apache.org/jira/browse/HDFS-10684 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.1 >Reporter: Samuel Low >Assignee: John Zhuge > Labels: compatibility > Attachments: HDFS-10684.001-branch-2.patch > > > Optional boolean parameters that are not provided in the URL cause the > WebHDFS create file command to fail. > curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false"; > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false"; > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. > However, this is not possible when I have no control over the WebHDFS calls, > e.g. Ambari and Hue have errors due to this. 
[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-10684:
--
    Attachment: HDFS-10684.001-branch-2.patch

Patch 001-branch-2:
* Start with a branch-2 patch because mixed version testing is only possible between 2.7 and branch-2. Should merge the change to 2.8 and later.
* There is no unit testing due to the difficulty of running mixed versions of NNs and DNs.
* Pass the JIRA test case manually between a 2.7 NN and a branch-2 DN.

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 2.7.1
> Reporter: Samuel Low
> Assignee: John Zhuge
> Labels: compatibility
> Attachments: HDFS-10684.001-branch-2.patch
>
>
> Optional boolean parameters that are not provided in the URL cause the
> WebHDFS create file command to fail.
> curl -i -X PUT
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location:
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
>
{"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. > However, this is not possible when I have no control over the WebHDFS calls, > e.g. Ambari and Hue have errors due to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10684: -- Labels: compatibility (was: ) > WebHDFS DataNode calls fail without parameter createparent > -- > > Key: HDFS-10684 > URL: https://issues.apache.org/jira/browse/HDFS-10684 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.1 >Reporter: Samuel Low >Assignee: John Zhuge > Labels: compatibility > > Optional boolean parameters that are not provided in the URL cause the > WebHDFS create file command to fail. > curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false"; > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false"; > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. > However, this is not possible when I have no control over the WebHDFS calls, > e.g. Ambari and Hue have errors due to this. 
[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10684: -- Target Version/s: 2.8.0 > WebHDFS DataNode calls fail without parameter createparent > -- > > Key: HDFS-10684 > URL: https://issues.apache.org/jira/browse/HDFS-10684 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.1 >Reporter: Samuel Low >Assignee: John Zhuge > Labels: compatibility > > Optional boolean parameters that are not provided in the URL cause the > WebHDFS create file command to fail. > curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false"; > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false"; > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. > However, this is not possible when I have no control over the WebHDFS calls, > e.g. Ambari and Hue have errors due to this. 
[jira] [Commented] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15455957#comment-15455957 ]

John Zhuge commented on HDFS-10684:
---

To support mixed versions of NNs and DNs in webhdfs, we have to make sure a {{null str}} is handled in the {{XParam(final String str)}} constructors, either by passing {{DEFAULT}} to the superclass or by relying on a superclass that already handles {{null str}}:
{code}
public XParam(final String str) {
  super(DOMAIN, DOMAIN.parse(str == null ? DEFAULT : str));
}
{code}

A quick survey between 2.7 and branch-2.8 yields the following new params:
* CreateFlagParam
* NoRedirectParam
* StartAfterParam
* CreateParentParam, not new, but its usage expanded to {{CREATE}}

No new param was added between branch-2.8 and trunk.

I don't know whether there is any other case like {{CreateParentParam}} where the usage of an existing param was expanded. If anybody comes across one, please let me know or file a JIRA.

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 2.7.1
> Reporter: Samuel Low
> Assignee: John Zhuge
>
> Optional boolean parameters that are not provided in the URL cause the
> WebHDFS create file command to fail.
> curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false"; > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false"; > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. > However, this is not possible when I have no control over the WebHDFS calls, > e.g. Ambari and Hue have errors due to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
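The null handling proposed in the comment above can be tried in isolation with a minimal sketch. It does not use Hadoop's classes: {{BooleanParamSketch}}, its {{parse}} method, and the assumed default of {{"false"}} are hypothetical stand-ins. The point is only that substituting the default before parsing avoids the "Failed to parse \"null\" to Boolean." error when a parameter such as {{createparent}} is missing from the URL.

```java
import java.util.Locale;

/** Hypothetical stand-in for a WebHDFS-style boolean URL parameter; not Hadoop's class. */
final class BooleanParamSketch {
    /** Assumed default, used when the parameter is absent from the URL. */
    static final String DEFAULT = "false";

    private final boolean value;

    BooleanParamSketch(final String str) {
        // Substitute the default before parsing, as in the constructor shown in the comment.
        this.value = parse(str == null ? DEFAULT : str);
    }

    /** Strict parser: rejects null and anything other than true/false (case-insensitive). */
    static boolean parse(final String str) {
        if (str == null) {
            throw new IllegalArgumentException("Failed to parse \"null\" to Boolean.");
        }
        final String s = str.trim().toLowerCase(Locale.ROOT);
        if ("true".equals(s)) {
            return true;
        }
        if ("false".equals(s)) {
            return false;
        }
        throw new IllegalArgumentException("Failed to parse \"" + str + "\" to Boolean.");
    }

    boolean getValue() {
        return value;
    }

    public static void main(String[] args) {
        // A missing parameter falls back to the default instead of throwing.
        System.out.println(new BooleanParamSketch(null).getValue());
        System.out.println(new BooleanParamSketch("true").getValue());
    }
}
```

Keeping the parser strict and applying the default in the constructor means an explicitly malformed value is still rejected, while a parameter omitted by an older NameNode is tolerated.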
[jira] [Commented] (HDFS-10822) Log DataNodes in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453441#comment-15453441 ] John Zhuge commented on HDFS-10822: --- Thanks [~kihwal] for bringing up HDFS-8791. > Log DataNodes in the write pipeline > --- > > Key: HDFS-10822 > URL: https://issues.apache.org/jira/browse/HDFS-10822 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > Labels: supportability > > Trying to diagnose a slow HDFS flush, taking longer than 10 seconds, but did > not know which DNs were involved in the write pipeline. Of course, I could > search NN log for the list of DNs, but it might not be possible sometimes or > convenient. > Propose to add a DEBUG trace to DataStreamer#setPipeline to print the list of > DNs in the pipeline. > BTW, we do print the list of DNs during pipeline recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10822) Log DataNodes in the write pipeline
John Zhuge created HDFS-10822: - Summary: Log DataNodes in the write pipeline Key: HDFS-10822 URL: https://issues.apache.org/jira/browse/HDFS-10822 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.6.0 Reporter: John Zhuge Assignee: John Zhuge Priority: Trivial Trying to diagnose a slow HDFS flush, taking longer than 10 seconds, but did not know which DNs were involved in the write pipeline. Of course, I could search NN log for the list of DNs, but it might not be possible sometimes or convenient. Propose to add a DEBUG trace to DataStreamer#setPipeline to print the list of DNs in the pipeline. BTW, we do print the list of DNs during pipeline recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-4210) Helpful exception when DNS entry for JournalNode cannot be resolved
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-4210: - Attachment: HDFS-4210.004.patch Patch 004: * Xiao's comments > Helpful exception when DNS entry for JournalNode cannot be resolved > --- > > Key: HDFS-4210 > URL: https://issues.apache.org/jira/browse/HDFS-4210 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, journal-node, namenode >Affects Versions: 2.6.0 >Reporter: Damien Hardy >Assignee: John Zhuge >Priority: Trivial > Labels: BB2015-05-TBR > Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, > HDFS-4210.003.patch, HDFS-4210.004.patch > > > Setting : > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > cdh4master01 and cdh4master02 JournalNode up and running, > cdh4worker03 not yet provisionning (no DNS entrie) > With : > `hadoop namenode -format` fails with : > 12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join > java.lang.IllegalArgumentException: Unable to construct journal, > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) > at > 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) > at java.lang.reflect.Constructor.newInstance(Constructor.java:513) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233) > ... 5 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93) > ... 10 more > I suggest that if quorum is up, format should not fail.
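The failure in this issue, an unresolvable JournalNode hostname surfacing as a NullPointerException, can be avoided by checking resolution while the address list is parsed. The following is a minimal sketch of that idea and not the attached patch: the class JournalAddressSketch and the method parseAndResolve are hypothetical names, and the input is assumed to be the "host:port;host:port" part of a qjournal URI.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; NOT the QuorumJournalManager code from the patch.
final class JournalAddressSketch {

    // Parses "host:port;host:port;..." and fails early on the first
    // hostname that cannot be resolved.
    static List<InetSocketAddress> parseAndResolve(String authority)
            throws UnknownHostException {
        List<InetSocketAddress> addrs = new ArrayList<>();
        for (String part : authority.split(";")) {
            String hostPort = part.trim();
            int colon = hostPort.lastIndexOf(':');
            if (colon <= 0) {
                throw new IllegalArgumentException(
                    "Expected host:port but got '" + hostPort + "'");
            }
            String host = hostPort.substring(0, colon);
            int port = Integer.parseInt(hostPort.substring(colon + 1));
            // This constructor attempts name resolution; on failure the
            // address is flagged as unresolved instead of throwing.
            InetSocketAddress addr = new InetSocketAddress(host, port);
            if (addr.isUnresolved()) {
                throw new UnknownHostException(hostPort);
            }
            addrs.add(addr);
        }
        return addrs;
    }
}
```

With this shape, a caller given the setting from the report would see an UnknownHostException naming cdh4worker03:8485 rather than a NullPointerException several frames deeper.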
[jira] [Updated] (HDFS-4210) Helpful exception when DNS entry for JournalNode cannot be resolved
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-4210: - Summary: Helpful exception when DNS entry for JournalNode cannot be resolved (was: NameNode should throw UnknownHostException when a DNS entry for a quorum node cannot be resolved) > Helpful exception when DNS entry for JournalNode cannot be resolved > --- > > Key: HDFS-4210 > URL: https://issues.apache.org/jira/browse/HDFS-4210 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, journal-node, namenode >Affects Versions: 2.6.0 >Reporter: Damien Hardy >Assignee: John Zhuge >Priority: Trivial > Labels: BB2015-05-TBR > Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, > HDFS-4210.003.patch > > > Setting : > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > cdh4master01 and cdh4master02 JournalNode up and running, > cdh4worker03 not yet provisionning (no DNS entrie) > With : > `hadoop namenode -format` fails with : > 12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join > java.lang.IllegalArgumentException: Unable to construct journal, > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) > at > 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) > at java.lang.reflect.Constructor.newInstance(Constructor.java:513) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233) > ... 5 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93) > ... 10 more > I suggest that if quorum is up, format should not fail.
[jira] [Updated] (HDFS-4210) NameNode should throw UnknownHostException when a DNS entry for a quorum node cannot be resolved
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-4210: - Summary: NameNode should throw UnknownHostException when a DNS entry for a quorum node cannot be resolved (was: NameNode should fail when a DNS entry for a quorum node cannot be resolved) > NameNode should throw UnknownHostException when a DNS entry for a quorum node > cannot be resolved > > > Key: HDFS-4210 > URL: https://issues.apache.org/jira/browse/HDFS-4210 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, journal-node, namenode >Affects Versions: 2.6.0 >Reporter: Damien Hardy >Assignee: John Zhuge >Priority: Trivial > Labels: BB2015-05-TBR > Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, > HDFS-4210.003.patch > > > Setting : > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > cdh4master01 and cdh4master02 JournalNode up and running, > cdh4worker03 not yet provisionning (no DNS entrie) > With : > `hadoop namenode -format` fails with : > 12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join > java.lang.IllegalArgumentException: Unable to construct journal, > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) > at java.lang.reflect.Constructor.newInstance(Constructor.java:513) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233) > ... 5 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93) > ... 10 more > I suggest that if quorum is up, format should not fail.
[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail when parameter createparent not provided
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10684: -- Summary: WebHDFS DataNode calls fail when parameter createparent not provided (was: WebHDFS DataNode calls fail when boolean parameters not provided) > WebHDFS DataNode calls fail when parameter createparent not provided > > > Key: HDFS-10684 > URL: https://issues.apache.org/jira/browse/HDFS-10684 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.1 >Reporter: Samuel Low >Assignee: John Zhuge > > Optional boolean parameters that are not provided in the URL cause the > WebHDFS create file command to fail. > curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false" > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false" > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. > However, this is not possible when I have no control over the WebHDFS calls, > e.g. 
Ambari and Hue have errors due to this.
[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10684: -- Summary: WebHDFS DataNode calls fail without parameter createparent (was: WebHDFS DataNode calls fail when parameter createparent not provided) > WebHDFS DataNode calls fail without parameter createparent > -- > > Key: HDFS-10684 > URL: https://issues.apache.org/jira/browse/HDFS-10684 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.1 >Reporter: Samuel Low >Assignee: John Zhuge > > Optional boolean parameters that are not provided in the URL cause the > WebHDFS create file command to fail. > curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false" > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false" > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. > However, this is not possible when I have no control over the WebHDFS calls, > e.g. 
Ambari and Hue have errors due to this.
[jira] [Commented] (HDFS-4210) NameNode should fail when a DNS entry for a quorum node cannot be resolved
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446829#comment-15446829 ] John Zhuge commented on HDFS-4210: -- Change the title to reflect the intended fix and be consistent with HDFS-4957. > NameNode should fail when a DNS entry for a quorum node cannot be resolved > -- > > Key: HDFS-4210 > URL: https://issues.apache.org/jira/browse/HDFS-4210 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, journal-node, namenode >Affects Versions: 2.6.0 >Reporter: Damien Hardy >Assignee: John Zhuge >Priority: Trivial > Labels: BB2015-05-TBR > Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, > HDFS-4210.003.patch > > > Setting : > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > cdh4master01 and cdh4master02 JournalNode up and running, > cdh4worker03 not yet provisionning (no DNS entrie) > With : > `hadoop namenode -format` fails with : > 12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join > java.lang.IllegalArgumentException: Unable to construct journal, > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) > at > 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) > at java.lang.reflect.Constructor.newInstance(Constructor.java:513) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233) > ... 5 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93) > ... 10 more > I suggest that if quorum is up, format should not fail.
[jira] [Updated] (HDFS-4210) NameNode should fail when a DNS entry for a quorum node cannot be resolved
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-4210: - Summary: NameNode should fail when a DNS entry for a quorum node cannot be resolved (was: NameNode should fail on JournalNode DNS resolution failure) > NameNode should fail when a DNS entry for a quorum node cannot be resolved > -- > > Key: HDFS-4210 > URL: https://issues.apache.org/jira/browse/HDFS-4210 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, journal-node, namenode >Affects Versions: 2.6.0 >Reporter: Damien Hardy >Assignee: John Zhuge >Priority: Trivial > Labels: BB2015-05-TBR > Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, > HDFS-4210.003.patch > > > Setting : > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > cdh4master01 and cdh4master02 JournalNode up and running, > cdh4worker03 not yet provisionning (no DNS entrie) > With : > `hadoop namenode -format` fails with : > 12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join > java.lang.IllegalArgumentException: Unable to construct journal, > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) > at > 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) > at java.lang.reflect.Constructor.newInstance(Constructor.java:513) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233) > ... 5 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93) > ... 10 more > I suggest that if quorum is up, format should not fail.
[jira] [Assigned] (HDFS-4957) NameNode failover should not fail because a DNS entry for a quorum node cannot be resolved
[ https://issues.apache.org/jira/browse/HDFS-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge reassigned HDFS-4957: Assignee: John Zhuge > NameNode failover should not fail because a DNS entry for a quorum node > cannot be resolved > -- > > Key: HDFS-4957 > URL: https://issues.apache.org/jira/browse/HDFS-4957 > Project: Hadoop HDFS > Issue Type: Bug > Components: qjm >Affects Versions: 2.3.0, 2.6.0 >Reporter: Colin P. McCabe >Assignee: John Zhuge > > When a StandbyNameNode is becoming active, we should not bail out because a > DNS entry for a quorum node cannot be resolved. Currently it does fail in > this scenario, with a message like this: > {code} > 2013-07-03 21:28:40,576 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services > required for active state > 2013-07-03 21:28:40,579 FATAL > org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring > NN shutdown. Shutting down immediately. > java.lang.IllegalArgumentException: Unable to construct journal, > qjournal://hadoop-mm:8485;hadoop-nn-0:8485;hadoop-nn-1:8485/hadoop > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1254) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:722) > > {code} > reported by Matt Bookman -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-4210) NameNode should fail on JournalNode DNS resolution failure
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-4210: - Summary: NameNode should fail on JournalNode DNS resolution failure (was: NameNode Format should not fail for DNS resolution on minority of JournalNode) > NameNode should fail on JournalNode DNS resolution failure > -- > > Key: HDFS-4210 > URL: https://issues.apache.org/jira/browse/HDFS-4210 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, journal-node, namenode >Affects Versions: 2.6.0 >Reporter: Damien Hardy >Assignee: John Zhuge >Priority: Trivial > Labels: BB2015-05-TBR > Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, > HDFS-4210.003.patch > > > Setting : > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > cdh4master01 and cdh4master02 JournalNode up and running, > cdh4worker03 not yet provisionning (no DNS entrie) > With : > `hadoop namenode -format` fails with : > 12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join > java.lang.IllegalArgumentException: Unable to construct journal, > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) > at > 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) > at java.lang.reflect.Constructor.newInstance(Constructor.java:513) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233) > ... 5 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93) > ... 10 more > I suggest that if quorum is up, format should not fail.
[jira] [Commented] (HDFS-10684) WebHDFS DataNode calls fail when boolean parameters not provided
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446784#comment-15446784 ] John Zhuge commented on HDFS-10684: --- [~loungerdork], I was able to reproduce the problem with NN running 2.7.1 and DN running 2.8. HDFS-8435 (in 2.8 but not in 2.7) introduced the new parameter {{createparent}}. > WebHDFS DataNode calls fail when boolean parameters not provided > > > Key: HDFS-10684 > URL: https://issues.apache.org/jira/browse/HDFS-10684 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.1 >Reporter: Samuel Low >Assignee: John Zhuge > > Optional boolean parameters that are not provided in the URL cause the > WebHDFS create file command to fail. > curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false" > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false" > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. 
> However, this is not possible when I have no control over the WebHDFS calls, > e.g. Ambari and Hue have errors due to this.
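The error message in the report, Failed to parse "null" to Boolean, is what a strict boolean parser produces when an absent query parameter is passed to it as null. The sketch below illustrates that failure mode and one way to tolerate a missing optional parameter. It is not the WebHDFS parameter code: BooleanParamSketch, strictParse, and parseWithDefault are hypothetical names, and the default value passed in is chosen by the caller for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; NOT the WebHDFS BooleanParam implementation.
final class BooleanParamSketch {

    // Accepts only "true"/"false" (case-insensitive); anything else,
    // including null, is rejected with the message seen in the report.
    static boolean strictParse(String raw) {
        if ("true".equalsIgnoreCase(raw)) {
            return true;
        }
        if ("false".equalsIgnoreCase(raw)) {
            return false;
        }
        throw new IllegalArgumentException(
            "Failed to parse \"" + raw + "\" to Boolean.");
    }

    // Falls back to the supplied default when the parameter is absent,
    // so a request that omits an optional flag still succeeds.
    static boolean parseWithDefault(Map<String, String> query,
                                    String name,
                                    boolean defaultValue) {
        String raw = query.get(name);
        return raw == null ? defaultValue : strictParse(raw);
    }

    public static void main(String[] args) {
        Map<String, String> query = new HashMap<>();
        query.put("overwrite", "false");   // present in the redirected URL
        // "createparent" is deliberately missing, as in the failing request.
        System.out.println(parseWithDefault(query, "overwrite", true));
        System.out.println(parseWithDefault(query, "createparent", false));
    }
}
```

Applying the default at the point where the parameter is read keeps older clients working when a newer server introduces an optional flag they do not yet send.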
[jira] [Updated] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-4210: - Attachment: HDFS-4210.003.patch Patch 003: * Fix checkstyle {{TestRBWBlockInvalidation}} failure for 002 passes locally. > NameNode Format should not fail for DNS resolution on minority of JournalNode > - > > Key: HDFS-4210 > URL: https://issues.apache.org/jira/browse/HDFS-4210 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, journal-node, namenode >Affects Versions: 2.6.0 >Reporter: Damien Hardy >Assignee: John Zhuge >Priority: Trivial > Labels: BB2015-05-TBR > Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, > HDFS-4210.003.patch > > > Setting : > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > cdh4master01 and cdh4master02 JournalNode up and running, > cdh4worker03 not yet provisionning (no DNS entrie) > With : > `hadoop namenode -format` fails with : > 12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join > java.lang.IllegalArgumentException: Unable to construct journal, > qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) > at > 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) > at java.lang.reflect.Constructor.newInstance(Constructor.java:513) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233) > ... 5 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161) > at > org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93) > ... 10 more > I suggest that if quorum is up, format should not fail.
[jira] [Updated] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-4210:
-----------------------------
    Attachment: HDFS-4210.002.patch

Patch 002:
* Throw {{UnknownHostException}} earlier in {{getLoggerAddresses}} when a JN hostname can not be resolved
* Use JUnit rule ExpectedException
[jira] [Comment Edited] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15444786#comment-15444786 ]

John Zhuge edited comment on HDFS-4210 at 8/29/16 4:30 AM:
-----------------------------------------------------------

Discovered that {{UnknownHostException}} is already thrown and caught in the following call path, which runs earlier than the NPE call path listed in the JIRA description:
{noformat}
  at org.apache.hadoop.net.NetUtils.createSocketAddrForHost(NetUtils.java:245)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:217)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.getLoggerAddresses(QuorumJournalManager.java:390)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:364)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
  at org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults.testUnresolvableHostNameFailsGracefully(TestQJMWithFaults.java:201)
{noformat}
{code:title='createSocketAddrForHost'}
    } catch (UnknownHostException e) {
      addr = InetSocketAddress.createUnresolved(host, port);
    }
{code}
{{createSocketAddrForHost}} swallows the UHE and creates an unresolved {{InetSocketAddress}}. Callers are supposed to check with {{isUnresolved}}.

{{getLoggerAddresses}} is the earliest opportunity to throw UHE.
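The caller-side check described in this comment can be sketched with plain {{java.net}} classes. This is only an illustration of the idea, not the code in the attached patches: the class {{ResolveCheck}} and the helper {{requireResolved}} are hypothetical names, and the host and port are taken from the qjournal URI in the issue description.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class ResolveCheck {
  /**
   * Hypothetical helper: fail fast with UnknownHostException when the
   * address was created unresolved, instead of letting a later NPE surface.
   */
  public static InetSocketAddress requireResolved(InetSocketAddress addr)
      throws UnknownHostException {
    if (addr.isUnresolved()) {
      throw new UnknownHostException(addr.getHostName());
    }
    return addr;
  }

  public static void main(String[] args) {
    // createUnresolved performs no DNS lookup, so it mimics the address
    // that is produced when the lookup has failed.
    InetSocketAddress addr =
        InetSocketAddress.createUnresolved("cdh4worker03", 8485);
    try {
      requireResolved(addr);
      System.out.println("resolved");
    } catch (UnknownHostException e) {
      System.out.println("unresolved: " + e.getMessage());
    }
  }
}
```

Running {{main}} takes the {{catch}} branch, because an address built with {{createUnresolved}} always reports {{isUnresolved() == true}}.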
[jira] [Commented] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15444786#comment-15444786 ]

John Zhuge commented on HDFS-4210:
----------------------------------

Discovered that {{UnknownHostException}} is already thrown and caught in a call path earlier than the NPE call path listed in the JIRA description:
{noformat}
  at org.apache.hadoop.net.NetUtils.createSocketAddrForHost(NetUtils.java:245)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:217)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.getLoggerAddresses(QuorumJournalManager.java:390)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:364)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
  at org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults.testUnresolvableHostNameFailsGracefully(TestQJMWithFaults.java:201)
{noformat}
{code:title='createSocketAddrForHost'}
    } catch (UnknownHostException e) {
      addr = InetSocketAddress.createUnresolved(host, port);
    }
{code}
{{createSocketAddrForHost}} swallows the UHE and creates an unresolved {{InetSocketAddress}}. Callers are supposed to check with {{isUnresolved}}.

{{getLoggerAddresses}} is the earliest opportunity to throw UHE.
[jira] [Commented] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436050#comment-15436050 ]

John Zhuge commented on HDFS-4210:
----------------------------------

[~clamb]/[~cwl], I am taking over the jira in order to push it over the finish line. Hope that is ok with you.
[jira] [Assigned] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode
[ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge reassigned HDFS-4210:
--------------------------------

    Assignee: John Zhuge  (was: Charles Lamb)
[jira] [Commented] (HDFS-10650) DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory permission
[ https://issues.apache.org/jira/browse/HDFS-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433238#comment-15433238 ]

John Zhuge commented on HDFS-10650:
-----------------------------------

Done, [~rchiang] and [~xiaochen].

> DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory permission
> -------------------------------------------------------------------------------------
>
>                 Key: HDFS-10650
>                 URL: https://issues.apache.org/jira/browse/HDFS-10650
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: John Zhuge
>            Assignee: John Zhuge
>            Priority: Minor
>             Fix For: 3.0.0-alpha2
>
>         Attachments: HDFS-10650.001.patch, HDFS-10650.002.patch
>
>
> These 2 DFSClient methods should use default directory permission to create a directory.
> {code:java}
>   public boolean mkdirs(String src, FsPermission permission,
>       boolean createParent) throws IOException {
>     if (permission == null) {
>       permission = FsPermission.getDefault();
>     }
> {code}
> {code:java}
>   public boolean primitiveMkdir(String src, FsPermission absPermission,
>     boolean createParent)
>     throws IOException {
>     checkOpen();
>     if (absPermission == null) {
>       absPermission =
>         FsPermission.getDefault().applyUMask(dfsClientConf.uMask);
>     }
> {code}
[jira] [Updated] (HDFS-10650) DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory permission
[ https://issues.apache.org/jira/browse/HDFS-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-10650:
------------------------------
    Release Note: If the caller does not supply a permission, DFSClient#mkdirs and DFSClient#primitiveMkdir will create a new directory with the default directory permission 00777 instead of 00666.
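To see why the release note's switch from 00666 to 00777 matters for directories, the umask arithmetic can be worked through in plain Java. This sketch does not use Hadoop's {{FsPermission}}; the class {{DefaultDirPermission}}, its {{applyUMask}} helper, and the umask value 022 are assumptions chosen for illustration.

```java
public class DefaultDirPermission {
  /** Applies a umask to a permission: every bit set in the umask is cleared. */
  public static int applyUMask(int permission, int umask) {
    return permission & ~umask;
  }

  public static void main(String[] args) {
    int umask = 0022; // assumed umask, for illustration only

    // Default directory permission named in the release note.
    System.out.println(Integer.toOctalString(applyUMask(0777, umask))); // 755

    // Previous default: no execute bit, so the directory cannot be traversed.
    System.out.println(Integer.toOctalString(applyUMask(0666, umask))); // 644
  }
}
```

With 00666 as the starting point, no umask can ever produce a directory with the execute (search) bit set, which is why a file-style default is unsuitable for {{mkdirs}}.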
[jira] [Commented] (HDFS-10650) DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory permission
[ https://issues.apache.org/jira/browse/HDFS-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432296#comment-15432296 ]

John Zhuge commented on HDFS-10650:
-----------------------------------

[~rchiang], thanks for commenting on this jira. Which document to update?
[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416590#comment-15416590 ]

John Zhuge commented on HDFS-8897:
----------------------------------

Thanks [~jojochuang] for the review and commit!

> Balancer should handle fs.defaultFS trailing slash in HA
> --------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>             Fix For: 2.8.0, 2.9.0, 3.0.0-alpha2
>
>         Attachments: HDFS-8897-branch-2.006.patch, HDFS-8897.001.patch, HDFS-8897.002.patch, HDFS-8897.003.patch, HDFS-8897.004.patch, HDFS-8897.005.patch, HDFS-8897.006.patch
>
>
> When the balancer is launched, it should test whether there is already a /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> java.io.IOException: Another Balancer is running.. Exiting ...
> Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc
> The error seems to be located in org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id doesn't exist before entering this function; if it exists, then it is deleted and the balancer exits with the same error.
>
>   private OutputStream checkAndMarkRunning() throws IOException {
>     try {
>       if (fs.exists(idPath)) {
>         // try appending to it so that it will fail fast if another balancer is
>         // running.
>         IOUtils.closeStream(fs.append(idPath));
>         fs.delete(idPath, true);
>       }
>       final FSDataOutputStream fsout = fs.create(idPath, false);
>       // mark balancer idPath to be deleted during filesystem closure
>       fs.deleteOnExit(idPath);
>       if (write2IdFile) {
>         fsout.writeBytes(InetAddress.getLocalHost().getHostName());
>         fsout.hflush();
>       }
>       return fsout;
>     } catch(RemoteException e) {
>       if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
>         return null;
>       } else {
>         throw e;
>       }
>     }
>   }
>
> Regards
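The first log line in the quoted description, {{namenodes = [hdfs://sandbox/, hdfs://sandbox]}}, suggests that the same nameservice is listed twice, once with and once without a trailing slash. The sketch below shows, using only {{java.net.URI}}, why the two forms are treated as distinct and how stripping the path collapses them. It is not the committed fix for HDFS-8897; the class {{NameNodeUriDedup}} and the helper {{stripPath}} are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.LinkedHashSet;
import java.util.Set;

public class NameNodeUriDedup {
  /** Hypothetical helper: keep only scheme and authority, dropping the path. */
  public static URI stripPath(URI uri) throws URISyntaxException {
    return new URI(uri.getScheme(), uri.getAuthority(), null, null, null);
  }

  public static void main(String[] args) throws URISyntaxException {
    URI withSlash = URI.create("hdfs://sandbox/");
    URI withoutSlash = URI.create("hdfs://sandbox");

    // The paths differ ("/" versus ""), so the URIs compare as unequal.
    System.out.println(withSlash.equals(withoutSlash)); // false

    // After stripping the path, both collapse into a single set entry.
    Set<URI> namenodes = new LinkedHashSet<>();
    namenodes.add(stripPath(withSlash));
    namenodes.add(stripPath(withoutSlash));
    System.out.println(namenodes); // [hdfs://sandbox]
  }
}
```

If the balancer connects once per listed URI, a duplicate entry of this kind would make the second connection find {{/system/balancer.id}} already being created by the first, which is consistent with the "Another Balancer is running" message in the log.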
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
-----------------------------
    Attachment: HDFS-8897-branch-2.006.patch

Patch branch-2.006:
* Backport to branch-2 with minor conflicts

> Balancer should handle fs.defaultFS trailing slash in HA
> ---------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>         Attachments: HDFS-8897-branch-2.006.patch, HDFS-8897.001.patch,
> HDFS-8897.002.patch, HDFS-8897.003.patch, HDFS-8897.004.patch,
> HDFS-8897.005.patch, HDFS-8897.006.patch
>
>
> When the balancer is launched, it should test whether there is already a
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> java.io.IOException: Another Balancer is running.. Exiting ...
> Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds
> Looking at the audit log while trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exit:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc
> The error seems to be located in
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it exists, then it is deleted
> and the balancer exits with the same error.
>
>   private OutputStream checkAndMarkRunning() throws IOException {
>     try {
>       if (fs.exists(idPath)) {
>         // try appending to it so that it will fail fast if another balancer is
>         // running.
>         IOUtils.closeStream(fs.append(idPath));
>         fs.delete(idPath, true);
>       }
>       final FSDataOutputStream fsout = fs.create(idPath, false);
>       // mark balancer idPath to be deleted during filesystem closure
>       fs.deleteOnExit(idPath);
>       if (write2IdFile) {
>         fsout.writeBytes(InetAddress.getLocalHost().getHostName());
>         fsout.hflush();
>       }
>       return fsout;
>     } catch(RemoteException e) {
>       if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
>         return null;
>       } else {
>         throw e;
>       }
>     }
>   }
>
> Regards

--
This message was sent by Atlassian JIRA (v6.
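The marker-file idea behind checkAndMarkRunning can be illustrated away from HDFS. The following is a minimal sketch on the local filesystem using java.nio, not the Hadoop FileSystem API; the class and method names (MarkerFileSketch, tryMark, demo) are hypothetical. It only shows that an exclusive create succeeds once and is refused the second time, which is roughly what a second pass over the same namenode would run into if, as the namenodes log line suggests, hdfs://sandbox/ and hdfs://sandbox are treated as two separate namenodes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Local-filesystem analogy only; hypothetical names, not Hadoop code. */
class MarkerFileSketch {

  /** Returns true if this call created the marker, false if it already existed. */
  static boolean tryMark(Path marker) {
    try {
      // createFile checks for existence and creates the file in one atomic step.
      Files.createFile(marker);
      return true;
    } catch (FileAlreadyExistsException e) {
      // Somebody (possibly this same process) already holds the marker.
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Tries to take the same marker twice and reports both outcomes. */
  static boolean[] demo() {
    try {
      Path dir = Files.createTempDirectory("balancer-sketch");
      Path marker = dir.resolve("balancer.id");
      boolean first = tryMark(marker);
      boolean second = tryMark(marker);
      // Clean up the temporary marker and directory.
      Files.delete(marker);
      Files.delete(dir);
      return new boolean[] {first, second};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] r = demo();
    System.out.println("first attempt: " + r[0] + ", second attempt: " + r[1]);
  }
}
```

In this analogy the second attempt fails even though no other process is involved, which matches the symptom in the report: "Another Balancer is running" with only one balancer started.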
[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416100#comment-15416100 ]

John Zhuge commented on HDFS-8897:
----------------------------------

Sure [~jojochuang].

> Balancer should handle fs.defaultFS trailing slash in HA
> ---------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>         Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch,
> HDFS-8897.003.patch, HDFS-8897.004.patch, HDFS-8897.005.patch,
> HDFS-8897.006.patch
[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode
[ https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-6962:
-----------------------------
    Attachment:     (was: run)

> ACLs inheritance conflict with umaskmode
> ----------------------------------------
>
>                 Key: HDFS-6962
>                 URL: https://issues.apache.org/jira/browse/HDFS-6962
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.4.1
>         Environment: CentOS release 6.5 (Final)
>            Reporter: LINTE
>            Assignee: John Zhuge
>            Priority: Critical
>              Labels: hadoop, security
>         Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch,
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch,
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch,
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log,
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log,
> run_compat_tests, run_unit_tests, test_plan.md
>
>
> In hdfs-site.xml:
> <property>
>   <name>dfs.umaskmode</name>
>   <value>027</value>
> </property>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and
> user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ Check ACLs on /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- matches the umaskmode defined in
> hdfs-site.xml, so everything is OK.
> default:group:readwrite:rwx grants the readwrite group rwx access through
> inheritance.
> default:user:toto:rwx grants user toto rwx access through inheritance.
> default:mask::rwx means the inheritance mask is rwx, so nothing is masked.
> 4/ Create a subdir to test inheritance of ACL
> bash# hdfs dfs -mkdir /tmp/ACLS/hdfs
> 5/ Check ACLs on /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has an rwx ACL but only r-x is
> effective, because the mask is r-x (mask::r-x) even though the default mask
> for inheritance is set to default:mask::rwx on /tmp/ACLS/
> 6/ Modify hdfs-site.xml and restart the namenode
> <property>
>   <name>dfs.umaskmode</name>
>   <value>010</value>
> </property>
> 7/ Create a subdir to test inheritance of ACL with the new umaskmode
> parameter
> bash# hdfs dfs -mkdir /tmp/ACLS/hdfs2
> 8/ Check ACLs on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx #effective:rw-
> group::r-x #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL values (user, group and other, except the POSIX
> owner) with the group bits of the dfs.umaskmode property when creating a
> directory with inherited ACLs.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
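The mask values in the report can be reproduced with plain bit arithmetic. The sketch below is illustrative only: the class and method names (AclMaskSketch, childMask, effective, render) are hypothetical and are not HDFS APIs, and the formula is simply the behaviour the reporter describes, namely that the inherited mask is the parent's default mask with the group bits of dfs.umaskmode cleared.

```java
/** Illustrative arithmetic only; hypothetical names, not HDFS APIs. */
class AclMaskSketch {

  /** Renders a 3-bit permission value (0..7) as an "rwx" style string. */
  static String render(int bits) {
    return ((bits & 4) != 0 ? "r" : "-")
        + ((bits & 2) != 0 ? "w" : "-")
        + ((bits & 1) != 0 ? "x" : "-");
  }

  /** Mask seen on the child: parent default mask minus the umask group bits. */
  static int childMask(int defaultMask, int umask) {
    int umaskGroupBits = (umask >> 3) & 7; // middle octal digit of the umask
    return defaultMask & ~umaskGroupBits & 7;
  }

  /** Effective permission of a named user, named group or group entry. */
  static int effective(int entry, int mask) {
    return entry & mask;
  }

  public static void main(String[] args) {
    int rwx = 7; // default:mask::rwx on /tmp/ACLS
    int rx = 5;  // group::r-x
    // A leading 0 makes these Java octal literals, as in dfs.umaskmode.
    for (int umask : new int[] {027, 010}) {
      int mask = childMask(rwx, umask);
      System.out.println("umask " + Integer.toOctalString(umask)
          + " mask::" + render(mask)
          + " group:readwrite:rwx effective:" + render(effective(rwx, mask))
          + " group::r-x effective:" + render(effective(rx, mask)));
    }
  }
}
```

With umask 027 this gives mask::r-x, and with umask 010 it gives mask::rw- and an effective r-- for group::r-x, the same values shown in the getfacl output for /tmp/ACLS/hdfs and /tmp/ACLS/hdfs2.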
[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode
[ https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-6962:
-----------------------------
    Attachment:     (was: unit_tests)

> ACLs inheritance conflict with umaskmode
> ----------------------------------------
>
>                 Key: HDFS-6962
>                 URL: https://issues.apache.org/jira/browse/HDFS-6962
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.4.1
>         Environment: CentOS release 6.5 (Final)
>            Reporter: LINTE
>            Assignee: John Zhuge
>            Priority: Critical
>              Labels: hadoop, security
>         Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch,
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch,
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch,
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log,
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log,
> run_compat_tests, run_unit_tests, test_plan.md
[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode
[ https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-6962:
-----------------------------
    Attachment: run_unit_tests
                run_compat_tests
                test_plan.md

Upload updated test plan and scripts.

> ACLs inheritance conflict with umaskmode
> ----------------------------------------
>
>                 Key: HDFS-6962
>                 URL: https://issues.apache.org/jira/browse/HDFS-6962
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.4.1
>         Environment: CentOS release 6.5 (Final)
>            Reporter: LINTE
>            Assignee: John Zhuge
>            Priority: Critical
>              Labels: hadoop, security
>         Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch,
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch,
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch,
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log,
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log,
> run_compat_tests, run_unit_tests, test_plan.md
[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode
[ https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-6962:
-----------------------------
    Attachment:     (was: test_plan.md)

> ACLs inheritance conflict with umaskmode
> ----------------------------------------
>
>                 Key: HDFS-6962
>                 URL: https://issues.apache.org/jira/browse/HDFS-6962
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.4.1
>         Environment: CentOS release 6.5 (Final)
>            Reporter: LINTE
>            Assignee: John Zhuge
>            Priority: Critical
>              Labels: hadoop, security
>         Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch,
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch,
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch,
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log,
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log,
> run_compat_tests, run_unit_tests, test_plan.md
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
-----------------------------
    Attachment: HDFS-8897.006.patch

Patch 006:
* Fix checkstyle

Unit test failures in 005 were unrelated.

> Balancer should handle fs.defaultFS trailing slash in HA
> ---------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>         Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch,
> HDFS-8897.003.patch, HDFS-8897.004.patch, HDFS-8897.005.patch,
> HDFS-8897.006.patch
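The issue title and the log line namenodes = [hdfs://sandbox/, hdfs://sandbox] point at the same namenode being listed twice, once with and once without a trailing slash. The Patch 004 notification mentions a {{trimUri}} helper, but its code does not appear in these messages, so the following is only a hypothetical sketch of the idea using java.net.URI; NameNodeUriSketch, trimTrailingSlash and distinctNameNodes are made-up names and not the patch's implementation.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not the code from the HDFS-8897 patches. */
class NameNodeUriSketch {

  /** Reduces "scheme://authority/" to "scheme://authority"; other URIs pass through. */
  static URI trimTrailingSlash(URI uri) {
    String path = uri.getPath();
    boolean rootOrEmpty = path == null || path.isEmpty() || path.equals("/");
    if (rootOrEmpty && uri.getScheme() != null && uri.getAuthority() != null) {
      return URI.create(uri.getScheme() + "://" + uri.getAuthority());
    }
    return uri;
  }

  /** Normalizes each URI and keeps only distinct results, in input order. */
  static Set<URI> distinctNameNodes(List<String> raw) {
    Set<URI> out = new LinkedHashSet<>();
    for (String s : raw) {
      out.add(trimTrailingSlash(URI.create(s)));
    }
    return out;
  }

  public static void main(String[] args) {
    // The two entries from the balancer log in the issue description.
    System.out.println(
        distinctNameNodes(Arrays.asList("hdfs://sandbox/", "hdfs://sandbox")));
  }
}
```

After normalization both log entries compare equal, so the set holds a single namenode and the balancer would connect to it, and take its /system/balancer.id marker, only once.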
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
-----------------------------
    Attachment: HDFS-8897.005.patch

Patch 005:
* Fix checkstyle

> Balancer should handle fs.defaultFS trailing slash in HA
> ---------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>         Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch,
> HDFS-8897.003.patch, HDFS-8897.004.patch, HDFS-8897.005.patch
[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413245#comment-15413245 ]

John Zhuge commented on HDFS-8897:
----------------------------------

Test failures are unrelated.

> Balancer should handle fs.defaultFS trailing slash in HA
> --------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>         Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, HDFS-8897.003.patch, HDFS-8897.004.patch
>
>
> When the balancer is launched, it should test whether a /system/balancer.id file already exists in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp  Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
>
> Looking at the audit log while trying to run the balancer, the balancer creates /system/balancer.id and then deletes it on exit:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc
>
> The error seems to be located in org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id doesn't exist before entering this function; if it exists, then it is deleted and the balancer exits with the same error.
>
>   private OutputStream checkAndMarkRunning() throws IOException {
>     try {
>       if (fs.exists(idPath)) {
>         // try appending to it so that it will fail fast if another balancer is
>         // running.
>         IOUtils.closeStream(fs.append(idPath));
>         fs.delete(idPath, true);
>       }
>       final FSDataOutputStream fsout = fs.create(idPath, false);
>       // mark balancer idPath to be deleted during filesystem closure
>       fs.deleteOnExit(idPath);
>       if (write2IdFile) {
>         fsout.writeBytes(InetAddress.getLocalHost().getHostName());
>         fsout.hflush();
>       }
>       return fsout;
>     } catch(RemoteException e) {
>       if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
>         return null;
>       } else {
>         throw e;
>       }
>     }
>   }
>
> Regards

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
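
The log above lists the same nameservice twice, as hdfs://sandbox/ and hdfs://sandbox. A plausible reading, consistent with the issue title, is that the two spellings are treated as different namenodes, so the second connection finds the id file created by the first. The sketch below only demonstrates the underlying java.net.URI behaviour; the class and method names are made up for illustration and are not taken from the Balancer code.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustration only: NamenodeUriDemo and collect() are hypothetical names.
class NamenodeUriDemo {
    // Deduplicate URIs the naive way, relying on URI.equals().
    static Set<URI> collect(List<String> raw) {
        Set<URI> uris = new LinkedHashSet<>();
        for (String s : raw) {
            uris.add(URI.create(s));
        }
        return uris;
    }

    public static void main(String[] args) {
        Set<URI> uris = collect(java.util.Arrays.asList("hdfs://sandbox/", "hdfs://sandbox"));
        // The path components differ ("/" versus the empty string), so
        // URI.equals() keeps both entries in the set.
        System.out.println(uris.size() + " entries: " + uris);
    }
}
```

Because the set keeps both entries, any code that iterates over it would visit the same namespace twice unless the URIs are normalized first.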
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
-----------------------------
    Attachment: HDFS-8897.004.patch

Patch 004:
* Fix checkstyle
* Extract reusable code into {{trimUri}}
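
The patch itself is not shown in this thread, so the following is only a guess at what a helper named {{trimUri}} might do: strip trailing slashes from the path so that hdfs://sandbox/ and hdfs://sandbox compare equal. The body, the guard for URIs without an authority, and the class name are all assumptions made for this sketch, not the contents of HDFS-8897.004.patch.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch; the real trimUri in the patch may differ.
class TrimUriSketch {
    // Return the URI with trailing "/" characters removed from its path.
    static URI trimUri(URI uri) {
        String path = uri.getPath();
        // Leave opaque URIs, URIs without an authority, and paths that
        // do not end in "/" untouched.
        if (path == null || uri.getAuthority() == null || !path.endsWith("/")) {
            return uri;
        }
        int end = path.length();
        while (end > 0 && path.charAt(end - 1) == '/') {
            end--;
        }
        try {
            return new URI(uri.getScheme(), uri.getAuthority(),
                path.substring(0, end), uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI a = trimUri(URI.create("hdfs://sandbox/"));
        URI b = trimUri(URI.create("hdfs://sandbox"));
        System.out.println(a + " equals " + b + ": " + a.equals(b));
    }
}
```

Normalizing before the URIs are collected into a set means each nameservice is contacted once, whichever spelling fs.defaultFS uses.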
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
-----------------------------
    Status: Patch Available  (was: In Progress)
[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories
[ https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413080#comment-15413080 ]

John Zhuge commented on HDFS-10721:
-----------------------------------

Awesome, the community is great!

> HDFS NFS Gateway - Exporting multiple Directories
> -------------------------------------------------
>
>                 Key: HDFS-10721
>                 URL: https://issues.apache.org/jira/browse/HDFS-10721
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Senthilkumar
>            Assignee: Senthilkumar
>            Priority: Minor
>
> The current HDFS NFS gateway supports exporting only one directory.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user</value>
> </property>
> This property lets us export one particular directory.
> Code block:
> public RpcProgramMountd(NfsConfiguration config,
>     DatagramSocket registrationSocket, boolean allowInsecurePorts)
>     throws IOException {
>   // Note that RPC cache is not enabled
>   super("mountd", "localhost", config.getInt(
>       NfsConfigKeys.DFS_NFS_MOUNTD_PORT_KEY,
>       NfsConfigKeys.DFS_NFS_MOUNTD_PORT_DEFAULT), PROGRAM, VERSION_1,
>       VERSION_3, registrationSocket, allowInsecurePorts);
>   exports = new ArrayList();
>   exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
>       NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
>   this.hostsMatcher = NfsExports.getInstance(config);
>   this.mounts = Collections.synchronizedList(new ArrayList());
>   UserGroupInformation.setConfiguration(config);
>   SecurityUtil.login(config, NfsConfigKeys.DFS_NFS_KEYTAB_FILE_KEY,
>       NfsConfigKeys.DFS_NFS_KERBEROS_PRINCIPAL_KEY);
>   this.dfsClient = new DFSClient(NameNode.getAddress(config), config);
> }
> Export list:
> exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
>     NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> The current code supports exposing only one directory. In our example, only /user can be exported.
> Most production environments expect several directories to be exported, so that each can be mounted by different clients.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user,/data/web_crawler,/app-logs</value>
> </property>
> Here I have three directories to expose:
> 1) /user
> 2) /data/web_crawler
> 3) /app-logs
> This would help us mount directories for a particular client (say client A wants to write data in /app-logs: the Hadoop admin can mount it and hand it over to the client).
> Please advise. Sorry if this feature is already implemented.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
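
The request above amounts to reading nfs.export.point as a comma-separated list instead of a single path. A minimal sketch of that parsing step follows, in plain Java so that it runs without Hadoop on the classpath. The class and method names are invented for illustration, and nothing here reflects how the gateway was eventually changed, if it was.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; ExportPointsSketch and parseExports are not Hadoop APIs.
class ExportPointsSketch {
    // Split a comma-separated property value into a list of export paths,
    // trimming whitespace and skipping empty or repeated entries.
    static List<String> parseExports(String value) {
        List<String> exports = new ArrayList<>();
        if (value == null) {
            return exports;
        }
        for (String part : value.split(",")) {
            String dir = part.trim();
            if (!dir.isEmpty() && !exports.contains(dir)) {
                exports.add(dir);
            }
        }
        return exports;
    }

    public static void main(String[] args) {
        System.out.println(parseExports("/user,/data/web_crawler,/app-logs"));
    }
}
```

In the constructor quoted above, the single exports.add(...) call would presumably be replaced by adding every parsed entry. Parsing is only the first step: the mount handling would also have to resolve each export, which this sketch does not cover.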
[jira] [Comment Edited] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413066#comment-15413066 ]

John Zhuge edited comment on HDFS-8897 at 8/9/16 6:59 AM:
----------------------------------------------------------

Unit test errors were unrelated. They were likely caused by test environment. The failed tests pass locally.

was (Author: jzhuge):
Unit test errors were unrelated. They were likely caused by test environment.
[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413066#comment-15413066 ]

John Zhuge commented on HDFS-8897:
----------------------------------

Unit test errors were unrelated. They were likely caused by test environment.
[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode
[ https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-6962:
-----------------------------
    Attachment:     (was: test_plan.md)

> ACLs inheritance conflict with umaskmode
> ----------------------------------------
>
>                 Key: HDFS-6962
>                 URL: https://issues.apache.org/jira/browse/HDFS-6962
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.4.1
>         Environment: CentOS release 6.5 (Final)
>            Reporter: LINTE
>            Assignee: John Zhuge
>            Priority: Critical
>              Labels: hadoop, security
>         Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, run, test_plan.md, unit_tests
>
>
> In hdfs-site.xml
> <property>
>   <name>dfs.umaskmode</name>
>   <value>027</value>
> </property>
>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir /tmp/ACLS
>
> 2/ Set default ACLs on this directory: rwx access for group readwrite and user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
>
> 3/ Check ACLs on /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
>
> user::rwx | group::r-x | other::--- matches the umaskmode defined in hdfs-site.xml, everything is OK.
> default:group:readwrite:rwx grants the readwrite group rwx access for inheritance.
> default:user:toto:rwx grants the user toto rwx access for inheritance.
> default:mask::rwx: the inheritance mask is rwx, so nothing is masked.
>
> 4/ Create a subdir to test inheritance of ACLs
> bash# hdfs dfs -mkdir /tmp/ACLS/hdfs
>
> 5/ Check ACLs on /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
>
> Here we can see that the readwrite group has an rwx ACL but only r-x is effective, because the mask is r-x (mask::r-x) even though the default mask for inheritance is set to default:mask::rwx on /tmp/ACLS/.
>
> 6/ Modify hdfs-site.xml and restart the namenode
> <property>
>   <name>dfs.umaskmode</name>
>   <value>010</value>
> </property>
>
> 7/ Create a subdir to test inheritance of ACLs with the new umaskmode parameter
> bash# hdfs dfs -mkdir /tmp/ACLS/hdfs2
>
> 8/ Check ACLs on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx #effective:rw-
> group::r-x #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
>
> So HDFS masks the ACL values (user, group and other, except the POSIX owner) with the group bits of the dfs.umaskmode property when creating a directory with inherited ACLs.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
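
The arithmetic behind the report can be checked by hand: the inherited mask equals the parent's default mask with the group bits of the umask cleared, and each named or group entry is then ANDed with that mask. The sketch below models only the behaviour described in the report. It is not the NameNode's implementation, and the class and method names are invented for the example.

```java
// Models the reported behaviour only; AclMaskSketch is a hypothetical name.
class AclMaskSketch {
    // Permissions are 3-bit values: r=4, w=2, x=1.

    // Mask of a new child: parent's default mask minus the umask group bits.
    static int inheritedMask(int defaultMask, int umask) {
        int groupUmask = (umask >> 3) & 7;
        return defaultMask & ~groupUmask & 7;
    }

    // Effective permission of a named user, named group, or group entry.
    static int effective(int entry, int mask) {
        return entry & mask;
    }

    // Render 3 bits as an rwx string.
    static String rwx(int bits) {
        return ((bits & 4) != 0 ? "r" : "-")
             + ((bits & 2) != 0 ? "w" : "-")
             + ((bits & 1) != 0 ? "x" : "-");
    }

    public static void main(String[] args) {
        int[] umasks = {027, 010}; // octal literals, as in dfs.umaskmode
        for (int umask : umasks) {
            int mask = inheritedMask(7, umask); // default:mask::rwx
            System.out.println("umask " + Integer.toOctalString(umask)
                + " -> mask::" + rwx(mask)
                + ", group:readwrite:rwx effective " + rwx(effective(7, mask))
                + ", group::r-x effective " + rwx(effective(5, mask)));
        }
    }
}
```

With umask 027 this gives mask::r-x, and with umask 010 it gives mask::rw- and an effective r-- for group::r-x, matching the getfacl output in steps 5 and 8.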
[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode
[ https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-6962:
-----------------------------
    Attachment: test_plan.md
[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode
[ https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-6962:
-----------------------------
    Attachment: test_plan.md

Upload {{test_plan.md}}.
[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode
[ https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-6962:
-----------------------------
    Attachment: unit_tests

Uploaded script {{unit_tests}} that runs regression unit tests.

> ACLs inheritance conflict with umaskmode
> ----------------------------------------
>
> Key: HDFS-6962
> URL: https://issues.apache.org/jira/browse/HDFS-6962
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: security
> Affects Versions: 2.4.1
> Environment: CentOS release 6.5 (Final)
> Reporter: LINTE
> Assignee: John Zhuge
> Priority: Critical
> Labels: hadoop, security
> Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, run, unit_tests
>
> In hdfs-site.xml:
> <property>
>   <name>dfs.umaskmode</name>
>   <value>027</value>
> </property>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ Check the ACLs on /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- matches the umaskmode defined in hdfs-site.xml, so everything is OK.
> default:group:readwrite:rwx grants the readwrite group rwx access for inheritance.
> default:user:toto:rwx grants the toto user rwx access for inheritance.
> default:mask::rwx: the inheritance mask is rwx, so nothing is masked.
> 4/ Create a subdirectory to test ACL inheritance
> bash# hdfs dfs -mkdir /tmp/ACLS/hdfs
> 5/ Check the ACLs on /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has an rwx ACL but only r-x is effective, because the mask is r-x (mask::r-x) even though the default mask for inheritance is set to default:mask::rwx on /tmp/ACLS/.
> 6/ Modify hdfs-site.xml and restart the NameNode
> <property>
>   <name>dfs.umaskmode</name>
>   <value>010</value>
> </property>
> 7/ Create a subdirectory to test ACL inheritance with the new umaskmode value
> bash# hdfs dfs -mkdir /tmp/ACLS/hdfs2
> 8/ Check the ACLs on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx #effective:rw-
> group::r-x #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL values (user, group and other, except the POSIX owner) with the group bits of the dfs.umaskmode property when creating a directory with inherited ACLs.
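The arithmetic behind the two getfacl listings in this report can be sketched in a few lines of Java. This is only an illustration of the reported behaviour, not the HDFS code path: the class `AclMaskSketch` and its methods are hypothetical, and the rule "child mask = default mask with the umask's group bits cleared" is an assumption inferred from the reporter's observations.

```java
// Sketch only: reproduces the masks observed in the report, not HDFS internals.
class AclMaskSketch {
    // Render a 3-bit permission value (0-7) as an rwx string.
    static String rwx(int bits) {
        return ((bits & 4) != 0 ? "r" : "-")
             + ((bits & 2) != 0 ? "w" : "-")
             + ((bits & 1) != 0 ? "x" : "-");
    }

    // Assumed rule: the child mask is the default mask with the umask's group bits cleared.
    static int childMask(int defaultMask, int umask) {
        int umaskGroupBits = (umask >> 3) & 7;
        return defaultMask & ~umaskGroupBits & 7;
    }

    // An ACL entry's effective permission is the entry ANDed with the mask.
    static int effective(int entry, int mask) {
        return entry & mask;
    }

    public static void main(String[] args) {
        int defaultMask = 7; // default:mask::rwx
        int namedEntry = 7;  // user:toto:rwx and group:readwrite:rwx
        int groupEntry = 5;  // group::r-x
        // Java integer literals with a leading 0 are octal, like umask values.
        for (int umask : new int[] {027, 010}) {
            int mask = childMask(defaultMask, umask);
            System.out.println("umask=" + Integer.toOctalString(umask)
                + " mask::" + rwx(mask)
                + " named=" + rwx(effective(namedEntry, mask))
                + " group=" + rwx(effective(groupEntry, mask)));
        }
    }
}
```

With umask 027 the sketch yields mask r-x, and with 010 it yields rw- (so group::r-x becomes effective r--), which matches the listings for /tmp/ACLS/hdfs and /tmp/ACLS/hdfs2.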
[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories
[ https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411428#comment-15411428 ]

John Zhuge commented on HDFS-10721:
-----------------------------------

You are probably not a contributor. Please send a request to hdfs-...@hadoop.apache.org. All Hadoop mailing lists: https://hadoop.apache.org/mailing_lists.html. Please read: https://wiki.apache.org/hadoop/HowToContribute.

> HDFS NFS Gateway - Exporting multiple Directories
> -------------------------------------------------
>
> Key: HDFS-10721
> URL: https://issues.apache.org/jira/browse/HDFS-10721
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Senthilkumar
> Priority: Minor
>
> The current HDFS NFS gateway supports exporting only one directory.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user</value>
> </property>
> This property lets us export one particular directory.
> Code block:
> public RpcProgramMountd(NfsConfiguration config,
>     DatagramSocket registrationSocket, boolean allowInsecurePorts)
>     throws IOException {
>   // Note that RPC cache is not enabled
>   super("mountd", "localhost", config.getInt(
>       NfsConfigKeys.DFS_NFS_MOUNTD_PORT_KEY,
>       NfsConfigKeys.DFS_NFS_MOUNTD_PORT_DEFAULT), PROGRAM, VERSION_1,
>       VERSION_3, registrationSocket, allowInsecurePorts);
>   exports = new ArrayList();
>   exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
>       NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
>   this.hostsMatcher = NfsExports.getInstance(config);
>   this.mounts = Collections.synchronizedList(new ArrayList());
>   UserGroupInformation.setConfiguration(config);
>   SecurityUtil.login(config, NfsConfigKeys.DFS_NFS_KEYTAB_FILE_KEY,
>       NfsConfigKeys.DFS_NFS_KERBEROS_PRINCIPAL_KEY);
>   this.dfsClient = new DFSClient(NameNode.getAddress(config), config);
> }
> Export list:
> exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
>     NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> The current code supports exposing only one directory. In our example, only /user can be exported.
> Most production environments expect several directories to be exported so that they can be mounted by different clients.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user,/data/web_crawler,/app-logs</value>
> </property>
> Here I have three directories to be exposed:
> 1) /user
> 2) /data/web_crawler
> 3) /app-logs
> This would help us mount directories for a particular client (say client A wants to write data to /app-logs: the Hadoop admin can mount it and hand it over to the client).
> Please advise. Sorry if this feature is already implemented.
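The request above amounts to treating the {{nfs.export.point}} value as a comma-separated list instead of a single path. A minimal sketch of that parsing step follows; the class `ExportPointsSketch` and the method `parseExportPoints` are hypothetical names used for illustration, and this is not how the NFS gateway is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: splits a comma-separated export value into distinct paths.
class ExportPointsSketch {
    static List<String> parseExportPoints(String value) {
        List<String> exports = new ArrayList<>();
        if (value == null) {
            return exports;
        }
        for (String part : value.split(",")) {
            String path = part.trim();
            // Skip empty entries and duplicates; keep the configured order.
            if (!path.isEmpty() && !exports.contains(path)) {
                exports.add(path);
            }
        }
        return exports;
    }

    public static void main(String[] args) {
        // The value proposed in the issue description.
        System.out.println(parseExportPoints("/user,/data/web_crawler,/app-logs"));
    }
}
```

In the quoted constructor, the single `exports.add(...)` call would be the place where such a list could be added instead, assuming the rest of the mount handling can cope with more than one export.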
[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories
[ https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411378#comment-15411378 ] John Zhuge commented on HDFS-10721: --- Can you click {{Assign to me}}?
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
-----------------------------
    Status: In Progress (was: Patch Available)

Look into checkstyle and unit test errors.

> Balancer should handle fs.defaultFS trailing slash in HA
> --------------------------------------------------------
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer & mover
> Affects Versions: 2.7.1
> Environment: Centos 6.6
> Reporter: LINTE
> Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, HDFS-8897.003.patch
>
> When the balancer is launched, it should test whether a /system/balancer.id file already exists in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp    Iteration#    Bytes Already Moved    Bytes Left To Move    Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> java.io.IOException: Another Balancer is running.. Exiting ...
> Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds
> Looking at the audit log while trying to run the balancer, the balancer creates /system/balancer.id and then deletes it on exit:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc
> The error seems to be located in org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id doesn't exist before entering this function; if it exists, then it is deleted and the balancer exits with the same error.
>
> private OutputStream checkAndMarkRunning() throws IOException {
>   try {
>     if (fs.exists(idPath)) {
>       // try appending to it so that it will fail fast if another balancer is
>       // running.
>       IOUtils.closeStream(fs.append(idPath));
>       fs.delete(idPath, true);
>     }
>     final FSDataOutputStream fsout = fs.create(idPath, false);
>     // mark balancer idPath to be deleted during filesystem closure
>     fs.deleteOnExit(idPath);
>     if (write2IdFile) {
>       fsout.writeBytes(InetAddress.getLocalHost().getHostName());
>       fsout.hflush();
>     }
>     return fsout;
>   } catch(RemoteException e) {
>     if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
>       return null;
>     } else {
>       throw e;
>     }
>   }
> }
>
> Regards
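The first log line in the report lists the same nameservice twice, once as hdfs://sandbox/ and once as hdfs://sandbox, which is consistent with the issue title about a trailing slash in fs.defaultFS. The sketch below only illustrates why two such URIs are treated as distinct and how stripping a bare "/" path collapses them; `TrailingSlashSketch`, `stripTrailingSlash` and `dedupe` are hypothetical names, and this is not the fix in the attached patches.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch only: shows that a trailing slash makes two NameNode URIs unequal.
class TrailingSlashSketch {
    // Drop a bare "/" path so hdfs://sandbox/ and hdfs://sandbox compare equal.
    static URI stripTrailingSlash(URI uri) {
        if ("/".equals(uri.getPath())) {
            return URI.create(uri.getScheme() + "://" + uri.getAuthority());
        }
        return uri;
    }

    // Normalize each URI before adding it to the set of NameNodes.
    static Set<URI> dedupe(String... uris) {
        Set<URI> result = new LinkedHashSet<>();
        for (String u : uris) {
            result.add(stripTrailingSlash(URI.create(u)));
        }
        return result;
    }

    public static void main(String[] args) {
        Set<URI> raw = new LinkedHashSet<>();
        raw.add(URI.create("hdfs://sandbox/"));
        raw.add(URI.create("hdfs://sandbox"));
        System.out.println("raw size: " + raw.size());
        System.out.println("normalized: " + dedupe("hdfs://sandbox/", "hdfs://sandbox"));
    }
}
```

If the balancer iterates over both entries, the second pass would find the /system/balancer.id file created by the first, which would explain the "Another Balancer is running" message; that reading is an inference from the log, not something the report states.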
[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories
[ https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409617#comment-15409617 ]

John Zhuge commented on HDFS-10721:
-----------------------------------

Sorry for the confusion: {{c_user}} is a new HDFS user with read-only access to {{/data}}, created specifically to provide a workaround. I should have named the user {{readonly_b_webapp}} :)

I do agree with you that an export table like that of a Unix NFSv3 or NFSv4 server gives the admin more control. The export table should probably support an allowed client list and export options per export point.
[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories
[ https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409023#comment-15409023 ] John Zhuge commented on HDFS-10721: --- Yeah, the case seems pretty rare. How about adding static ID mapping {{b_webapp => c_user}} to {{/etc/nfs.map}} on NFS Gateway host? Assume {{c_user}} only has read-only access to {{/data}} in HDFS.
[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories
[ https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408922#comment-15408922 ] John Zhuge commented on HDFS-10721: --- Hi [~senthilec566], thanks for reporting the use case and suggesting a solution. Could you try this workaround? * Use the default {{nfs.export.point}} which is {{/}} * Set up HDFS user mappings. See https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html#User_authentication_and_mapping. * Set proper permissions to the HDFS directories and files to restrict the client's access. See https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html.
[jira] [Commented] (HDFS-10683) Make class Token$PrivateToken private
[ https://issues.apache.org/jira/browse/HDFS-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404316#comment-15404316 ]

John Zhuge commented on HDFS-10683:
-----------------------------------

{{TestFileCorruption}} passes locally. The link for row {{unit}} is not valid (error 404).

> Make class Token$PrivateToken private
> -------------------------------------
>
> Key: HDFS-10683
> URL: https://issues.apache.org/jira/browse/HDFS-10683
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.9.0
> Reporter: John Zhuge
> Assignee: John Zhuge
> Priority: Minor
> Labels: fs, ha, security, security_token
> Attachments: HDFS-10683.001.patch
>
> Avoid {{instanceof}} or typecasting of {{Token.PrivateToken}} by introducing an interface method in {{Token}}. Make class {{Token.PrivateToken}} private. Use a factory method instead.
[jira] [Updated] (HDFS-10683) Make class Token$PrivateToken private
[ https://issues.apache.org/jira/browse/HDFS-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-10683:
------------------------------
    Labels: fs ha security security_token (was: )
    Affects Version/s: (was: 3.0.0-alpha2) 2.9.0
    Target Version/s: 2.9.0, 3.0.0-alpha2
    Status: Patch Available (was: Open)

> Make class Token$PrivateToken private
> -------------------------------------
>
> Key: HDFS-10683
> URL: https://issues.apache.org/jira/browse/HDFS-10683
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.9.0
> Reporter: John Zhuge
> Assignee: John Zhuge
> Priority: Minor
> Labels: fs, ha, security, security_token
> Attachments: HDFS-10683.001.patch
>
> Avoid {{instanceof}} or typecasting of {{Token.PrivateToken}} by introducing an interface method in {{Token}}. Make class {{Token.PrivateToken}} private. Use a factory method instead.
[jira] [Updated] (HDFS-10683) Make class Token$PrivateToken private
[ https://issues.apache.org/jira/browse/HDFS-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-10683:
------------------------------
    Attachment: HDFS-10683.001.patch

Patch 001:
* Make class {{Token$PrivateToken}} private
* Add method {{Token#privateClone}} to create a private clone of a public token
* Replace all {{instanceof Token.PrivateToken}} and typecasting with polymorphic methods

> Make class Token$PrivateToken private
> -------------------------------------
>
> Key: HDFS-10683
> URL: https://issues.apache.org/jira/browse/HDFS-10683
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.0.0-alpha2
> Reporter: John Zhuge
> Assignee: John Zhuge
> Priority: Minor
> Attachments: HDFS-10683.001.patch
>
> Avoid {{instanceof}} or typecasting of {{Token.PrivateToken}} by introducing an interface method in {{Token}}. Make class {{Token.PrivateToken}} private. Use a factory method instead.
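The refactoring pattern described in the patch notes (a private nested subclass, a factory method, and a polymorphic method in place of {{instanceof}}) can be sketched as below. This is a simplified stand-in, not the Hadoop {{Token}} class: only the name {{privateClone}} comes from the patch description, while `TokenSketch`, `isPrivate`, the `service` field and the host names are assumptions made for illustration.

```java
// Sketch only: a toy Token showing the private-subclass-plus-factory pattern.
class TokenSketch {
    static class Token {
        private final String service;

        Token(String service) {
            this.service = service;
        }

        String getService() {
            return service;
        }

        // Polymorphic replacement for "token instanceof PrivateToken" (assumed name).
        boolean isPrivate() {
            return false;
        }

        // Factory method: callers never name the private subclass directly.
        Token privateClone(String newService) {
            return new PrivateToken(this, newService);
        }

        // Private nested subclass, invisible outside Token.
        private static final class PrivateToken extends Token {
            private final Token publicToken;

            PrivateToken(Token publicToken, String newService) {
                super(newService);
                this.publicToken = publicToken;
            }

            @Override
            boolean isPrivate() {
                return true;
            }
        }
    }

    public static void main(String[] args) {
        Token publicToken = new Token("ha-hdfs:sandbox");
        Token clone = publicToken.privateClone("nn1.example.com:8020");
        System.out.println(publicToken.isPrivate() + " " + clone.isPrivate());
    }
}
```

Callers that previously checked the concrete type would ask the token itself, so the subclass can stay private.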
[jira] [Updated] (HDFS-10683) Make class Token$PrivateToken private
[ https://issues.apache.org/jira/browse/HDFS-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-10683:
------------------------------
    Summary: Make class Token$PrivateToken private (was: Refactor class Token$PrivateToken)

> Make class Token$PrivateToken private
> -------------------------------------
>
> Key: HDFS-10683
> URL: https://issues.apache.org/jira/browse/HDFS-10683
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.0.0-alpha2
> Reporter: John Zhuge
> Assignee: John Zhuge
> Priority: Minor
>
> Avoid {{instanceof}} or typecasting of {{Token.PrivateToken}} by introducing an interface method in {{Token}}. Make class {{Token.PrivateToken}} private. Use a factory method instead.
[jira] [Updated] (HDFS-10683) Refactor class Token$PrivateToken
[ https://issues.apache.org/jira/browse/HDFS-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-10683:
------------------------------
    Summary: Refactor class Token$PrivateToken (was: Refactor Token.PrivateToken)

> Refactor class Token$PrivateToken
> ---------------------------------
>
> Key: HDFS-10683
> URL: https://issues.apache.org/jira/browse/HDFS-10683
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.0.0-alpha2
> Reporter: John Zhuge
> Assignee: John Zhuge
> Priority: Minor
>
> Avoid {{instanceof}} or typecasting of {{Token.PrivateToken}} by introducing an interface method in {{Token}}. Make class {{Token.PrivateToken}} private. Use a factory method instead.
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-8897: - Status: Patch Available (was: In Progress)
[jira] [Commented] (HDFS-10703) HA NameNode Web UI should show last checkpoint time
[ https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400795#comment-15400795 ]

John Zhuge commented on HDFS-10703:
-----------------------------------

Thanks [~yzhangal] for the review and commit.

> HA NameNode Web UI should show last checkpoint time
> ---------------------------------------------------
>
> Key: HDFS-10703
> URL: https://issues.apache.org/jira/browse/HDFS-10703
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ui
> Affects Versions: 2.6.0
> Reporter: John Zhuge
> Assignee: John Zhuge
> Priority: Minor
> Labels: supportability
> Fix For: 2.8.0, 3.0.0-alpha1, 3.0
> Attachments: HDFS-10703.001.patch, NN Web UI with Patch 001.png
>
> After enabling HA, the HA NameNode should show the last checkpoint time in the Web UI, as the Secondary NameNode Web UI does.
[jira] [Commented] (HDFS-10703) HA NameNode Web UI should show last checkpoint time
[ https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400414#comment-15400414 ]

John Zhuge commented on HDFS-10703:
-----------------------------------

Tested the new field on non-HA pseudo cluster and HA cluster.

> HA NameNode Web UI should show last checkpoint time
> ---------------------------------------------------
>
> Key: HDFS-10703
> URL: https://issues.apache.org/jira/browse/HDFS-10703
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ui
> Affects Versions: 2.6.0
> Reporter: John Zhuge
> Assignee: John Zhuge
> Priority: Minor
> Labels: supportability
> Attachments: HDFS-10703.001.patch, NN Web UI with Patch 001.png
>
> After enabling HA, NameNode HA should show last checkpoint time in the Web UI as the Secondary NameNode Web UI does.
[jira] [Commented] (HDFS-10703) HA NameNode Web UI should show last checkpoint time
[ https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399776#comment-15399776 ] John Zhuge commented on HDFS-10703: --- Please note the new field on NN Web UI is somewhat similar to the same field on SNN if HA is not enabled. > HA NameNode Web UI should show last checkpoint time > --- > > Key: HDFS-10703 > URL: https://issues.apache.org/jira/browse/HDFS-10703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Attachments: HDFS-10703.001.patch, NN Web UI with Patch 001.png > > > After enabling HA, NameNode HA should show last checkpoint time in the Web UI > as the Secondary NameNode Web UI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10703) HA NameNode Web UI should show last checkpoint time
[ https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399731#comment-15399731 ] John Zhuge commented on HDFS-10703: --- Since it is a UI change, I attached a screen shot after the fix. > HA NameNode Web UI should show last checkpoint time > --- > > Key: HDFS-10703 > URL: https://issues.apache.org/jira/browse/HDFS-10703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Attachments: HDFS-10703.001.patch, NN Web UI with Patch 001.png > > > After enabling HA, NameNode HA should show last checkpoint time in the Web UI > as the Secondary NameNode Web UI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10703) HA NameNode Web UI should show last checkpoint time
[ https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10703: -- Attachment: NN Web UI with Patch 001.png > HA NameNode Web UI should show last checkpoint time > --- > > Key: HDFS-10703 > URL: https://issues.apache.org/jira/browse/HDFS-10703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Attachments: HDFS-10703.001.patch, NN Web UI with Patch 001.png > > > After enabling HA, NameNode HA should show last checkpoint time in the Web UI > as the Secondary NameNode Web UI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10703) HA NameNode Web UI should show last checkpoint time
[ https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10703: -- Status: Patch Available (was: In Progress) > HA NameNode Web UI should show last checkpoint time > --- > > Key: HDFS-10703 > URL: https://issues.apache.org/jira/browse/HDFS-10703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Attachments: HDFS-10703.001.patch > > > After enabling HA, NameNode HA should show last checkpoint time in the Web UI > as the Secondary NameNode Web UI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10703) HA NameNode Web UI should show last checkpoint time
[ https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10703: -- Attachment: HDFS-10703.001.patch Patch 001: * Add a new field {{Last Checkpoint Time}} to NN Web UI * Test it in a pseudo cluster > HA NameNode Web UI should show last checkpoint time > --- > > Key: HDFS-10703 > URL: https://issues.apache.org/jira/browse/HDFS-10703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Attachments: HDFS-10703.001.patch > > > After enabling HA, NameNode HA should show last checkpoint time in the Web UI > as the Secondary NameNode Web UI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work started] (HDFS-10703) HA NameNode Web UI should show last checkpoint time
[ https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-10703 started by John Zhuge. - > HA NameNode Web UI should show last checkpoint time > --- > > Key: HDFS-10703 > URL: https://issues.apache.org/jira/browse/HDFS-10703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ui >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > > After enabling HA, NameNode HA should show last checkpoint time in the Web UI > as the Secondary NameNode Web UI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10703) HA NameNode Web UI should show last checkpoint time
John Zhuge created HDFS-10703: - Summary: HA NameNode Web UI should show last checkpoint time Key: HDFS-10703 URL: https://issues.apache.org/jira/browse/HDFS-10703 Project: Hadoop HDFS Issue Type: Improvement Components: ui Affects Versions: 2.6.0 Reporter: John Zhuge Assignee: John Zhuge Priority: Minor After enabling HA, NameNode HA should show last checkpoint time in the Web UI as the Secondary NameNode Web UI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode
[ https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398487#comment-15398487 ] John Zhuge commented on HDFS-9276: -- Thanks [~xiaochen]! > Failed to Update HDFS Delegation Token for long running application in HA mode > -- > > Key: HDFS-9276 > URL: https://issues.apache.org/jira/browse/HDFS-9276 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs, ha, security >Affects Versions: 2.7.1 >Reporter: Liangliang Gu >Assignee: Liangliang Gu > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, > HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, > HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, > HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, > HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch, > HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch, > HDFS-9276.18.patch, HDFS-9276.19.patch, HDFS-9276.20.patch, > HDFSReadLoop.scala, debug1.PNG, debug2.PNG > > > The scenario is as follows: > 1. NameNode HA is enabled. > 2. Kerberos is enabled. > 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with > NameNode. > 4. We want to update the HDFS Delegation Token for long running applications. > HDFS Client will generate private tokens for each NameNode. When we update > the HDFS Delegation Token, these private tokens will not be updated, which > will cause the tokens to expire. 
> This bug can be reproduced by the following program: > {code} > import java.security.PrivilegedExceptionAction > import org.apache.hadoop.conf.Configuration > import org.apache.hadoop.fs.{FileSystem, Path} > import org.apache.hadoop.security.UserGroupInformation > object HadoopKerberosTest { > def main(args: Array[String]): Unit = { > val keytab = "/path/to/keytab/xxx.keytab" > val principal = "x...@abc.com" > val creds1 = new org.apache.hadoop.security.Credentials() > val ugi1 = > UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab) > ugi1.doAs(new PrivilegedExceptionAction[Void] { > // Get a copy of the credentials > override def run(): Void = { > val fs = FileSystem.get(new Configuration()) > fs.addDelegationTokens("test", creds1) > null > } > }) > val ugi = UserGroupInformation.createRemoteUser("test") > ugi.addCredentials(creds1) > ugi.doAs(new PrivilegedExceptionAction[Void] { > // Get a copy of the credentials > override def run(): Void = { > var i = 0 > while (true) { > val creds1 = new org.apache.hadoop.security.Credentials() > val ugi1 = > UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab) > ugi1.doAs(new PrivilegedExceptionAction[Void] { > // Get a copy of the credentials > override def run(): Void = { > val fs = FileSystem.get(new Configuration()) > fs.addDelegationTokens("test", creds1) > null > } > }) > UserGroupInformation.getCurrentUser.addCredentials(creds1) > val fs = FileSystem.get( new Configuration()) > i += 1 > println() > println(i) > println(fs.listFiles(new Path("/user"), false)) > Thread.sleep(60 * 1000) > } > null > } > }) > } > } > {code} > To reproduce the bug, please set the following configuration on the NameNode: > {code} > dfs.namenode.delegation.token.max-lifetime = 10min > dfs.namenode.delegation.key.update-interval = 3min > dfs.namenode.delegation.token.renew-interval = 3min > {code} > The bug will occur after 3 minutes. 
> The stacktrace is: > {code} > Exception in thread "main" > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): > token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired > at org.apache.hadoop.ipc.Client.call(Client.java:1347) > at org.apache.hadoop.ipc.Client.call(Client.java:1300) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationH
[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail when boolean parameters not provided
[ https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10684: -- Summary: WebHDFS DataNode calls fail when boolean parameters not provided (was: WebHDFS calls fail when boolean parameters not provided) > WebHDFS DataNode calls fail when boolean parameters not provided > > > Key: HDFS-10684 > URL: https://issues.apache.org/jira/browse/HDFS-10684 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.1 >Reporter: Samuel Low >Assignee: John Zhuge > > Optional boolean parameters that are not provided in the URL cause the > WebHDFS create file command to fail. > curl -i -X PUT > "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false" > Response: > HTTP/1.1 307 TEMPORARY_REDIRECT > Cache-Control: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Expires: Fri, 15 Jul 2016 04:10:13 GMT > Date: Fri, 15 Jul 2016 04:10:13 GMT > Pragma: no-cache > Content-Type: application/octet-stream > Location: > http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false > Content-Length: 0 > Server: Jetty(6.1.26) > Following the redirect: > curl -i -X PUT -T MYFILE > "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false" > Response: > HTTP/1.1 100 Continue > HTTP/1.1 400 Bad Request > Content-Type: application/json; charset=utf-8 > Content-Length: 162 > Connection: close > > {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed > to parse \"null\" to Boolean."}} > The problem can be circumvented by providing both "createparent" and > "overwrite" parameters. > However, this is not possible when I have no control over the WebHDFS calls, > e.g. Ambari and Hue have errors due to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
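The failure reported in HDFS-10684 ("Failed to parse \"null\" to Boolean.") is what one would expect when an absent query parameter is handed straight to a strict boolean parser. The following is only a minimal sketch of the usual remedy, falling back to a default when the parameter is missing. It is not the WebHDFS DataNode code: the class {{OptionalBooleanParam}}, the method {{parse}}, and the default values passed in are hypothetical names and assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: shows how an optional boolean
// query parameter can fall back to a default instead of failing on null.
public class OptionalBooleanParam {

    // Returns the parsed value of the named parameter, or defaultValue
    // when the parameter is absent from the query map.
    static boolean parse(Map<String, String> query, String name, boolean defaultValue) {
        String raw = query.get(name);
        if (raw == null) {
            return defaultValue; // absent parameter: use the default
        }
        if (raw.equalsIgnoreCase("true")) {
            return true;
        }
        if (raw.equalsIgnoreCase("false")) {
            return false;
        }
        // A value that is present but malformed is still rejected.
        throw new IllegalArgumentException("Failed to parse \"" + raw + "\" to Boolean.");
    }

    public static void main(String[] args) {
        Map<String, String> query = new HashMap<>();
        query.put("overwrite", "false"); // "createparent" deliberately omitted
        // The default for "createparent" here is an assumption for the demo.
        System.out.println(parse(query, "overwrite", false));
        System.out.println(parse(query, "createparent", true));
    }
}
```

With this shape, a request that supplies only {{overwrite}} no longer depends on the caller also sending {{createparent}}, which is the situation described for clients such as Ambari and Hue.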
[jira] [Commented] (HDFS-10650) DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory permission
[ https://issues.apache.org/jira/browse/HDFS-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398140#comment-15398140 ] John Zhuge commented on HDFS-10650: --- Thanks @xiao for review and commit. Thanks [~cnauroth] for review. > DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory > permission > - > > Key: HDFS-10650 > URL: https://issues.apache.org/jira/browse/HDFS-10650 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10650.001.patch, HDFS-10650.002.patch > > > These 2 DFSClient methods should use default directory permission to create a > directory. > {code:java} > public boolean mkdirs(String src, FsPermission permission, > boolean createParent) throws IOException { > if (permission == null) { > permission = FsPermission.getDefault(); > } > {code} > {code:java} > public boolean primitiveMkdir(String src, FsPermission absPermission, > boolean createParent) > throws IOException { > checkOpen(); > if (absPermission == null) { > absPermission = > FsPermission.getDefault().applyUMask(dfsClientConf.uMask); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
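The two snippets quoted in HDFS-10650 differ in one detail: {{primitiveMkdir}} applies the client umask to the default when no absolute permission is given. The arithmetic behind that step can be sketched with plain integers. This is an illustration only and does not use the Hadoop {{FsPermission}} class; the class name {{DirPermissionSketch}}, the constant 0777 as the directory default, and the umask 022 in the demo are assumptions made for the example.

```java
// Hypothetical sketch, not Hadoop code: permission bits are modelled as ints.
public class DirPermissionSketch {

    // Assumed default directory permission (rwxrwxrwx) for this illustration.
    static final int DIR_DEFAULT = 0777;

    // Clears every bit that is set in the umask.
    static int applyUMask(int permission, int umask) {
        return permission & ~umask & 0777;
    }

    // Mirrors the shape of the quoted primitiveMkdir logic: a null permission
    // means "directory default with umask applied"; an explicit absolute
    // permission is used unchanged.
    static int resolveAbsolute(Integer absPermission, int umask) {
        if (absPermission == null) {
            return applyUMask(DIR_DEFAULT, umask);
        }
        return absPermission;
    }

    public static void main(String[] args) {
        // Print as octal so the result reads like a familiar mode string.
        System.out.println(Integer.toOctalString(resolveAbsolute(null, 022)));
        System.out.println(Integer.toOctalString(resolveAbsolute(0700, 022)));
    }
}
```

The point of the sketch is that the choice of default matters: starting from a directory default keeps the execute bits that a directory needs to be traversable after the umask is applied.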
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-8897: - Attachment: HDFS-8897.003.patch Patch 003: * Fix checkstyle * Only sanitize defaultUri for hdfs scheme * Move new test code into a separate method because the original method is too long > Balancer should handle fs.defaultFS trailing slash in HA > > > Key: HDFS-8897 > URL: https://issues.apache.org/jira/browse/HDFS-8897 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Affects Versions: 2.7.1 > Environment: Centos 6.6 >Reporter: LINTE >Assignee: John Zhuge > Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, > HDFS-8897.003.patch > > > When balancer is launched, it should test if there is already a > /system/balancer.id file in HDFS. > When the file doesn't exist, the balancer don't want to run : > 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, > hdfs://sandbox] > 15/08/14 16:35:12 INFO balancer.Balancer: parameters = > Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration > = 5, number of nodes to be excluded = 0, number of nodes to be included = 0] > Time Stamp Iteration# Bytes Already Moved Bytes Left To Move > Bytes Being Moved > 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > java.io.IOException: Another 
Balancer is running.. Exiting ... > Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds > Looking at the audit log file when trying to run the balancer, the balancer > create the /system/balancer.id and then delete it on exiting ... > 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create > src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r- > proto=rpc > 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete > src=/system/balancer.id dst=nullperm=null proto=rpc > The error seems to be located in > org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java > The function checkAndMarkRunning return null even if the /system/balancer.id > doesn't exist before entering this function; if it exists, then it is deleted > and the balancer exit with the same error. > > private OutputStream checkAndMarkRunning() throws IOException { > try { > if (fs.exists(idPath)) { > // try appending to it so that it will fail fast if another balancer > is > // running. 
> IOUtils.closeStream(fs.append(idPath)); > fs.delete(idPath, true); > } > final FSDataOutputStream fsout = fs.create(idPath, false); > // mark balancer idPath to be deleted during filesystem closure > fs.deleteOnExit(idPath); > if (write2IdFile) { > fsout.writeBytes(InetAddress.getLocalHost().getHostName()); > fsout.hflush(); > } > return fsout; > } catch(RemoteException e) { > > if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){ > return null; > } else { > throw e; > } > } > } > > Regards -- This message was sent by Atlassian JIRA (v6.3.4#6332) --
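The log line {{namenodes = [hdfs://sandbox/, hdfs://sandbox]}} in HDFS-8897 suggests that the same nameservice is counted twice because the two URIs differ only by a trailing slash. The following is a minimal sketch of the idea described in Patch 002 and Patch 003 (sanitize the default URI, and only for the hdfs scheme). It uses only {{java.net.URI}} and is not the actual {{DFSUtil#getNameServiceUris}} change; the class {{DefaultFsUriSketch}} and its methods are hypothetical.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch, not Hadoop code: normalizes hdfs URIs so that
// "hdfs://sandbox/" and "hdfs://sandbox" collapse into one entry.
public class DefaultFsUriSketch {

    // Reduces an hdfs URI to scheme plus authority, dropping any path such as
    // a trailing slash. URIs with another scheme, or without an authority,
    // are returned unchanged.
    static URI sanitize(URI uri) {
        if (!"hdfs".equalsIgnoreCase(uri.getScheme()) || uri.getAuthority() == null) {
            return uri;
        }
        return URI.create(uri.getScheme() + "://" + uri.getAuthority());
    }

    // Collects sanitized URIs in insertion order, removing duplicates.
    static Set<URI> dedupe(String... uris) {
        Set<URI> result = new LinkedHashSet<>();
        for (String s : uris) {
            result.add(sanitize(URI.create(s)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(dedupe("hdfs://sandbox/", "hdfs://sandbox"));
    }
}
```

If the balancer connected to each distinct URI in such a set, the second connection to the same nameservice (the one that appears to trip over the first connection's {{/system/balancer.id}}) would not be made; that reading of the failure is an inference from the log above, not something the messages state outright.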
[jira] [Comment Edited] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15397791#comment-15397791 ] John Zhuge edited comment on HDFS-8897 at 7/28/16 4:43 PM: --- Fixing checkstyle and unit test failures. was (Author: jzhuge): Fixing checkstyle and TestBalancerWithMultipleNameNodes failure. > Balancer should handle fs.defaultFS trailing slash in HA > > > Key: HDFS-8897 > URL: https://issues.apache.org/jira/browse/HDFS-8897 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Affects Versions: 2.7.1 > Environment: Centos 6.6 >Reporter: LINTE >Assignee: John Zhuge > Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch > > > When balancer is launched, it should test if there is already a > /system/balancer.id file in HDFS. > When the file doesn't exist, the balancer don't want to run : > 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, > hdfs://sandbox] > 15/08/14 16:35:12 INFO balancer.Balancer: parameters = > Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration > = 5, number of nodes to be excluded = 0, number of nodes to be included = 0] > Time Stamp Iteration# Bytes Already Moved Bytes Left To Move > Bytes Being Moved > 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > java.io.IOException: Another Balancer 
is running.. Exiting ... > Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds > Looking at the audit log file when trying to run the balancer, the balancer > create the /system/balancer.id and then delete it on exiting ... > 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create > src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r- > proto=rpc > 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete > src=/system/balancer.id dst=nullperm=null proto=rpc > The error seems to be located in > org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java > The function checkAndMarkRunning return null even if the /system/balancer.id > doesn't exist before entering this function; if it exists, then it is deleted > and the balancer exit with the same error. > > private OutputStream checkAndMarkRunning() throws IOException { > try { > if (fs.exists(idPath)) { > // try appending to it so that it will fail fast if another balancer > is > // running. 
> IOUtils.closeStream(fs.append(idPath)); > fs.delete(idPath, true); > } > final FSDataOutputStream fsout = fs.create(idPath, false); > // mark balancer idPath to be deleted during filesystem closure > fs.deleteOnExit(idPath); > if (write2IdFile) { > fsout.writeBytes(InetAddress.getLocalHost().getHostName()); > fsout.hflush(); > } > return fsout; > } catch(RemoteException e) { > > if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){ > return null; > } else { > throw e; > } > } > } > > Regards -- This message was sent by Atlassian JI
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-8897: - Status: In Progress (was: Patch Available) Fixing checkstyle and TestBalancerWithMultipleNameNodes failure. > Balancer should handle fs.defaultFS trailing slash in HA > > > Key: HDFS-8897 > URL: https://issues.apache.org/jira/browse/HDFS-8897 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Affects Versions: 2.7.1 > Environment: Centos 6.6 >Reporter: LINTE >Assignee: John Zhuge > Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch > > > When balancer is launched, it should test if there is already a > /system/balancer.id file in HDFS. > When the file doesn't exist, the balancer don't want to run : > 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, > hdfs://sandbox] > 15/08/14 16:35:12 INFO balancer.Balancer: parameters = > Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration > = 5, number of nodes to be excluded = 0, number of nodes to be included = 0] > Time Stamp Iteration# Bytes Already Moved Bytes Left To Move > Bytes Being Moved > 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > java.io.IOException: Another Balancer is running.. Exiting ... 
> Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds > Looking at the audit log file when trying to run the balancer, the balancer > create the /system/balancer.id and then delete it on exiting ... > 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create > src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r- > proto=rpc > 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=nullperm=null proto=rpc > 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete > src=/system/balancer.id dst=nullperm=null proto=rpc > The error seems to be located in > org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java > The function checkAndMarkRunning return null even if the /system/balancer.id > doesn't exist before entering this function; if it exists, then it is deleted > and the balancer exit with the same error. > > private OutputStream checkAndMarkRunning() throws IOException { > try { > if (fs.exists(idPath)) { > // try appending to it so that it will fail fast if another balancer > is > // running. 
> IOUtils.closeStream(fs.append(idPath)); > fs.delete(idPath, true); > } > final FSDataOutputStream fsout = fs.create(idPath, false); > // mark balancer idPath to be deleted during filesystem closure > fs.deleteOnExit(idPath); > if (write2IdFile) { > fsout.writeBytes(InetAddress.getLocalHost().getHostName()); > fsout.hflush(); > } > return fsout; > } catch(RemoteException e) { > > if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){ > return null; > } else { > throw e; > } > } > } > > Regards -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
-----------------------------
    Attachment: HDFS-8897.002.patch

Patch 002:
* Fix {{DFSUtil#getNameServiceUris}} to sanitize {{fs.defaultFS}} property
* Update a unit test in {{TestDFSUtil}}

> Balancer should handle fs.defaultFS trailing slash in HA
> --------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>         Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch
>
>
> When the balancer is launched, it should test whether there is already a
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/,
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters =
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp Iteration# Bytes Already Moved Bytes Left To Move
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs,
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs,
> 30mins, 0sec
> java.io.IOException: Another Balancer is running.. Exiting ...
> Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo
> src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create
> src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r-
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo
> src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo
> src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo
> src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete
> src=/system/balancer.id dst=null perm=null proto=rpc
> The error seems to be located in
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it exists, then it is deleted
> and the balancer exits with the same error.
>
> private OutputStream checkAndMarkRunning() throws IOException {
>   try {
>     if (fs.exists(idPath)) {
>       // try appending to it so that it will fail fast if another balancer is
>       // running.
>       IOUtils.closeStream(fs.append(idPath));
>       fs.delete(idPath, true);
>     }
>     final FSDataOutputStream fsout = fs.create(idPath, false);
>     // mark balancer idPath to be deleted during filesystem closure
>     fs.deleteOnExit(idPath);
>     if (write2IdFile) {
>       fsout.writeBytes(InetAddress.getLocalHost().getHostName());
>       fsout.hflush();
>     }
>     return fsout;
>   } catch(RemoteException e) {
>     if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
>       return null;
>     } else {
>       throw e;
>     }
>   }
> }
>
> Regards
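
The patch notes for this issue say only that {{fs.defaultFS}} is sanitized; the patch itself is not reproduced in this thread. As a rough illustration of why a trailing slash matters, the standalone Java sketch below shows that java.net.URI treats hdfs://sandbox/ and hdfs://sandbox as different values, so a set of namenode URIs ends up with two entries unless a slash-only path is removed first. This is not the HDFS code: TrailingSlashDemo, stripTrailingSlashes and dedupe are hypothetical names chosen for the example.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustration only; not taken from DFSUtil or from the attached patches.
final class TrailingSlashDemo {

    /** Hypothetical helper: drops a path that consists only of slashes. */
    static URI stripTrailingSlashes(URI uri) {
        String path = uri.getPath();
        if (uri.getAuthority() == null || path == null || !path.matches("/+")) {
            return uri; // nothing to sanitize, or a real path such as /abc
        }
        return URI.create(uri.getScheme() + "://" + uri.getAuthority());
    }

    /** Collects the URIs into a set, optionally sanitizing each one first. */
    static Set<URI> dedupe(List<String> raw, boolean sanitize) {
        Set<URI> out = new LinkedHashSet<>();
        for (String s : raw) {
            URI u = URI.create(s);
            out.add(sanitize ? stripTrailingSlashes(u) : u);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("hdfs://sandbox/", "hdfs://sandbox");
        System.out.println(dedupe(raw, false)); // [hdfs://sandbox/, hdfs://sandbox]
        System.out.println(dedupe(raw, true));  // [hdfs://sandbox]
    }
}
```

A URI with a real path, such as hdfs://sandbox/abc, is deliberately left untouched by this sketch.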
[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15397282#comment-15397282 ]

John Zhuge commented on HDFS-8897:
----------------------------------

Another criterion for hitting this issue is that the property {{dfs.namenode.rpc-address}} must not be set.

> Balancer should handle fs.defaultFS trailing slash in HA
> --------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>         Attachments: HDFS-8897.001.patch
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
-----------------------------
    Summary: Balancer should handle fs.defaultFS trailing slash in HA (was: Balancer should handle fs.defaultFS with trailing slashes in HA)

> Balancer should handle fs.defaultFS trailing slash in HA
> --------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>         Attachments: HDFS-8897.001.patch
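
One plausible reading of the balancer log quoted in this issue is that both spellings of the nameservice lead to the same /system/balancer.id, so the second attempt to create the file fails and is reported as another running balancer. The Java sketch below imitates that with a local temporary directory instead of HDFS. It is an analogy under that assumption, not the NameNodeConnector code: BalancerIdDemo, markRunning and markAll are hypothetical names, and the exists/append/delete step of the real method is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Local-filesystem analogy only; HDFS create semantics are not modelled.
final class BalancerIdDemo {

    /** Stand-in for checkAndMarkRunning: null means the id file already exists. */
    static Path markRunning(Path idPath) {
        try {
            return Files.createFile(idPath); // fails if the file is already there
        } catch (FileAlreadyExistsException e) {
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Tries to mark once per namenode entry and returns how many attempts succeeded. */
    static int markAll(Path dir, String... namenodes) {
        int marked = 0;
        for (String nn : namenodes) {
            // Assumption: every entry names the same nameservice, hence one shared id file.
            Path idPath = dir.resolve("balancer.id");
            if (markRunning(idPath) != null) {
                marked++;
            } else {
                System.out.println("Another Balancer is running (entry " + nn + ")");
            }
        }
        return marked;
    }

    /** Creates a scratch directory for the demo. */
    static Path newTempDir() {
        try {
            return Files.createTempDirectory("balancer-demo");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int marked = markAll(newTempDir(), "hdfs://sandbox/", "hdfs://sandbox");
        System.out.println("marked " + marked + " of 2 entries");
    }
}
```

With a single entry in the list, the same call marks the directory once and reports no conflict, which is the situation a deduplicated namenode list would produce.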
[jira] [Commented] (HDFS-6962) ACLs inheritance conflict with umaskmode
[ https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15396977#comment-15396977 ]

John Zhuge commented on HDFS-6962:
----------------------------------

[~cnauroth] and [~eddyxu], have you had a chance to look at patch 009? I believe all major review issues are resolved.

I plan to run all Hadoop unit tests twice: once with the flag off and once with the flag on.

> ACLs inheritance conflict with umaskmode
> ----------------------------------------
>
>                 Key: HDFS-6962
>                 URL: https://issues.apache.org/jira/browse/HDFS-6962
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.4.1
>         Environment: CentOS release 6.5 (Final)
>            Reporter: LINTE
>            Assignee: John Zhuge
>            Priority: Critical
>              Labels: hadoop, security
>         Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch,
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch,
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch,
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log,
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, run
>
>
> In hdfs-site.xml
> <property>
> <name>dfs.umaskmode</name>
> <value>027</value>
> </property>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and user
> toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ Check ACLs on /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- matches the umaskmode defined in
> hdfs-site.xml, so everything is OK.
> default:group:readwrite:rwx gives the readwrite group rwx access for
> inheritance.
> default:user:toto:rwx gives the toto user rwx access for inheritance.
> default:mask::rwx means the inheritance mask is rwx, so nothing is masked.
> 4/ Create a subdir to test inheritance of ACLs
> bash# hdfs dfs -mkdir /tmp/ACLS/hdfs
> 5/ Check ACLs on /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has an rwx ACL but only r-x is
> effective, because the mask is r-x (mask::r-x) even though the default mask
> for inheritance is set to default:mask::rwx on /tmp/ACLS/
> 6/ Modify hdfs-site.xml and restart the namenode
> <property>
> <name>dfs.umaskmode</name>
> <value>010</value>
> </property>
> 7/ Create a subdir to test inheritance of ACLs with the new umaskmode value
> bash# hdfs dfs -mkdir /tmp/ACLS/hdfs2
> 8/ Check ACLs on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx #effective:rw-
> group::r-x #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL values (user, group and other, except the POSIX
> owner) with the group bits of the dfs.umaskmode property when creating a
> directory with inherited ACLs.
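
The masks reported in the issue are consistent with the child directory's mask being the parent's default mask with the group bits of dfs.umaskmode cleared, and with each named entry's effective permission being the entry ANDed with that mask. The Java sketch below reproduces only that arithmetic for the two umask values used in the report. It is not HDFS code: AclMaskDemo and its methods are hypothetical names, and the formula is inferred from the getfacl output rather than taken from the NameNode source.

```java
// Arithmetic illustration only; inferred from the getfacl output in the report.
final class AclMaskDemo {

    /** Renders three permission bits (0..7) as an rwx string. */
    static String rwx(int bits) {
        return ((bits & 4) != 0 ? "r" : "-")
             + ((bits & 2) != 0 ? "w" : "-")
             + ((bits & 1) != 0 ? "x" : "-");
    }

    /** Group digit of a umask, e.g. octal 027 gives 2. */
    static int groupBits(int umask) {
        return (umask >> 3) & 7;
    }

    /** Mask of the new directory: inherited default mask minus the umask group bits. */
    static int childMask(int defaultMask, int umask) {
        return defaultMask & ~groupBits(umask) & 7;
    }

    /** Effective permission of a named user, named group or group entry. */
    static int effective(int entry, int mask) {
        return entry & mask;
    }

    public static void main(String[] args) {
        int defaultMask = 7; // default:mask::rwx on /tmp/ACLS
        for (int umask : new int[] {027, 010}) {
            int mask = childMask(defaultMask, umask);
            System.out.println("umask " + String.format("%03o", umask)
                + ": mask::" + rwx(mask)
                + ", user:toto:rwx effective " + rwx(effective(7, mask))
                + ", group::r-x effective " + rwx(effective(5, mask)));
        }
        // umask 027: mask::r-x, user:toto:rwx effective r-x, group::r-x effective r-x
        // umask 010: mask::rw-, user:toto:rwx effective rw-, group::r-x effective r--
    }
}
```

Both printed lines match the mask and #effective values shown for /tmp/ACLS/hdfs and /tmp/ACLS/hdfs2 in the report.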
[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode
[ https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395750#comment-15395750 ]

John Zhuge commented on HDFS-9276:
----------------------------------

Partially related. The patch fixes an HDFS delegation token renewal bug in HA mode.

> Failed to Update HDFS Delegation Token for long running application in HA mode
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-9276
>                 URL: https://issues.apache.org/jira/browse/HDFS-9276
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fs, ha, security
>    Affects Versions: 2.7.1
>            Reporter: Liangliang Gu
>            Assignee: Liangliang Gu
>         Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch,
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch,
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch,
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch,
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch,
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch,
> HDFS-9276.18.patch, HDFS-9276.19.patch, HDFS-9276.20.patch,
> HDFSReadLoop.scala, debug1.PNG, debug2.PNG
>
>
> The scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications.
> The HDFS client will generate private tokens for each NameNode. When we update
> the HDFS Delegation Token, these private tokens will not be updated, which
> will cause the tokens to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
>     val keytab = "/path/to/keytab/xxx.keytab"
>     val principal = "x...@abc.com"
>     val creds1 = new org.apache.hadoop.security.Credentials()
>     val ugi1 = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>     ugi1.doAs(new PrivilegedExceptionAction[Void] {
>       // Get a copy of the credentials
>       override def run(): Void = {
>         val fs = FileSystem.get(new Configuration())
>         fs.addDelegationTokens("test", creds1)
>         null
>       }
>     })
>     val ugi = UserGroupInformation.createRemoteUser("test")
>     ugi.addCredentials(creds1)
>     ugi.doAs(new PrivilegedExceptionAction[Void] {
>       // Get a copy of the credentials
>       override def run(): Void = {
>         var i = 0
>         while (true) {
>           val creds1 = new org.apache.hadoop.security.Credentials()
>           val ugi1 = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>           ugi1.doAs(new PrivilegedExceptionAction[Void] {
>             // Get a copy of the credentials
>             override def run(): Void = {
>               val fs = FileSystem.get(new Configuration())
>               fs.addDelegationTokens("test", creds1)
>               null
>             }
>           })
>           UserGroupInformation.getCurrentUser.addCredentials(creds1)
>           val fs = FileSystem.get(new Configuration())
>           i += 1
>           println()
>           println(i)
>           println(fs.listFiles(new Path("/user"), false))
>           Thread.sleep(60 * 1000)
>         }
>         null
>       }
>     })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration on the NameNode:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main"
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
> token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS with trailing slashes in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
-----------------------------
    Summary: Balancer should handle fs.defaultFS with trailing slashes in HA (was: Balancer should handle fs.defaultFS with trailing slashes)

> Balancer should handle fs.defaultFS with trailing slashes in HA
> ---------------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>         Attachments: HDFS-8897.001.patch
[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS with trailing slashes
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394912#comment-15394912 ]

John Zhuge commented on HDFS-8897:
----------------------------------

[~jojochuang] Thanks for the review.

bq. I played with the test case a bit, but it did not reproduce the exact symptom. (i.e. showing two namenodes in the log like

The added section in the test case only makes sure that the {{namenodes}} collection passed to {{Balancer.run}} has the same number of entries as in the normal test before it. If we add code to start the balancer with that {{namenodes}} collection, you will see {{Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]}} without the fix. If you think that is necessary, we can add it.

bq. if fs.defaultFS is hdfs://sandbox/abc

This case is tricky because {{fs.defaultFS}} is valid but has a trailing slash, and everything except the Balancer works fine; otherwise users would have detected the problem much earlier in other ways.

> Balancer should handle fs.defaultFS with trailing slashes
> ---------------------------------------------------------
>
>                 Key: HDFS-8897
>                 URL: https://issues.apache.org/jira/browse/HDFS-8897
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.1
>         Environment: Centos 6.6
>            Reporter: LINTE
>            Assignee: John Zhuge
>         Attachments: HDFS-8897.001.patch
[jira] [Updated] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode
[ https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-9276:
-----------------------------
    Attachment: HDFS-9276.20.patch

Patch 20:
* Small change in {{Credentials.addToken}}

Thanks [~xiaochen].

> Failed to Update HDFS Delegation Token for long running application in HA mode
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-9276
>                 URL: https://issues.apache.org/jira/browse/HDFS-9276
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fs, ha, security
>    Affects Versions: 2.7.1
>            Reporter: Liangliang Gu
>            Assignee: Liangliang Gu
>         Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch,
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch,
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch,
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch,
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch,
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch,
> HDFS-9276.18.patch, HDFS-9276.19.patch, HDFS-9276.20.patch,
> HDFSReadLoop.scala, debug1.PNG, debug2.PNG
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS with trailing slashes
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
Attachment: HDFS-8897.001.patch

Patch 001:
* Remove the trailing slashes from defaultUri

> Balancer should handle fs.defaultFS with trailing slashes
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer & mover
> Affects Versions: 2.7.1
> Environment: Centos 6.6
> Reporter: LINTE
> Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch
>
> When the balancer is launched, it should test whether there is already a
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> java.io.IOException: Another Balancer is running.. Exiting ...
> Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc
> The error seems to be located in
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it exists, then it is deleted
> and the balancer exits with the same error.
>
> private OutputStream checkAndMarkRunning() throws IOException {
>   try {
>     if (fs.exists(idPath)) {
>       // try appending to it so that it will fail fast if another balancer is
>       // running.
>       IOUtils.closeStream(fs.append(idPath));
>       fs.delete(idPath, true);
>     }
>     final FSDataOutputStream fsout = fs.create(idPath, false);
>     // mark balancer idPath to be deleted during filesystem closure
>     fs.deleteOnExit(idPath);
>     if (write2IdFile) {
>       fsout.writeBytes(InetAddress.getLocalHost().getHostName());
>       fsout.hflush();
>     }
>     return fsout;
>   } catch(RemoteException e) {
>     if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
>       return null;
>     } else {
>       throw e;
>     }
>   }
> }
>
> Regards
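The balancer log in this report lists the same NameNode twice, as hdfs://sandbox/ and hdfs://sandbox. One plausible reading is that the two spellings are treated as distinct URIs. The following is a minimal Java sketch of that effect and of the kind of normalization the patch note describes ("Remove the trailing slashes from defaultUri"). It uses only java.net.URI; the class name TrailingSlashSketch and the helper stripTrailingSlashes are hypothetical and are not taken from HDFS-8897.001.patch.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TrailingSlashSketch {

    // Hypothetical helper (not from the patch): drop trailing slashes
    // from the path component of a hierarchical URI.
    static URI stripTrailingSlashes(URI uri) {
        String path = uri.getPath();
        if (path == null) {
            return uri; // opaque URI, nothing to strip
        }
        int end = path.length();
        while (end > 0 && path.charAt(end - 1) == '/') {
            end--;
        }
        if (end == path.length()) {
            return uri; // no trailing slash present
        }
        try {
            return new URI(uri.getScheme(), uri.getAuthority(),
                    path.substring(0, end), uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The two spellings seen in the "namenodes = [...]" log line.
        List<String> configured = Arrays.asList("hdfs://sandbox/", "hdfs://sandbox");

        Set<URI> raw = new LinkedHashSet<>();
        Set<URI> normalized = new LinkedHashSet<>();
        for (String s : configured) {
            URI u = URI.create(s);
            raw.add(u);                               // path "/" and path "" compare as different
            normalized.add(stripTrailingSlashes(u));  // both collapse to hdfs://sandbox
        }

        System.out.println("raw        = " + raw);
        System.out.println("normalized = " + normalized);
    }
}
```

Run as written, the raw set should keep both entries while the normalized set should keep one. Whether the real patch normalizes at this point, or through a Hadoop utility instead of java.net.URI, is not stated in this thread.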
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS with trailing slashes
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
Target Version/s: 3.0.0-alpha2
Status: Patch Available (was: In Progress)

> Balancer should handle fs.defaultFS with trailing slashes
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer & mover
> Affects Versions: 2.7.1
> Environment: Centos 6.6
> Reporter: LINTE
> Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS with trailing slashes
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
Summary: Balancer should handle fs.defaultFS with trailing slashes (was: Balancer should remove fs.defaultFS trailing slashes)

> Balancer should handle fs.defaultFS with trailing slashes
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer & mover
> Affects Versions: 2.7.1
> Environment: Centos 6.6
> Reporter: LINTE
> Assignee: John Zhuge
[jira] [Updated] (HDFS-8897) Balancer should remove fs.defaultFS trailing slashes
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
Summary: Balancer should remove fs.defaultFS trailing slashes (was: Balancer should remove trailing slashes in fs.defaultFS)

> Balancer should remove fs.defaultFS trailing slashes
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer & mover
> Affects Versions: 2.7.1
> Environment: Centos 6.6
> Reporter: LINTE
> Assignee: John Zhuge
[jira] [Updated] (HDFS-8897) Balancer should remove trailing slashes in fs.defaultFS
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-8897:
Summary: Balancer should remove trailing slashes in fs.defaultFS (was: Loadbalancer always exits with : java.io.IOException: Another Balancer is running.. Exiting ...)

> Balancer should remove trailing slashes in fs.defaultFS
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer & mover
> Affects Versions: 2.7.1
> Environment: Centos 6.6
> Reporter: LINTE
> Assignee: John Zhuge
[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode
[ https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394159#comment-15394159 ]

John Zhuge commented on HDFS-9276:

TestBalancer and TestHttpServerLifecycle failures are unrelated and they pass on my local host.

> Failed to Update HDFS Delegation Token for long running application in HA mode
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: fs, ha, security
> Affects Versions: 2.7.1
> Reporter: Liangliang Gu
> Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch,
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch,
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch,
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch,
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch,
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch,
> HDFS-9276.18.patch, HDFS-9276.19.patch, HDFSReadLoop.scala, debug1.PNG,
> debug2.PNG
>
> The scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications.
> HDFS Client will generate private tokens for each NameNode. When we update
> the HDFS Delegation Token, these private tokens will not be updated, which
> will cause the token to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
>     val keytab = "/path/to/keytab/xxx.keytab"
>     val principal = "x...@abc.com"
>     val creds1 = new org.apache.hadoop.security.Credentials()
>     val ugi1 =
>       UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>     ugi1.doAs(new PrivilegedExceptionAction[Void] {
>       // Get a copy of the credentials
>       override def run(): Void = {
>         val fs = FileSystem.get(new Configuration())
>         fs.addDelegationTokens("test", creds1)
>         null
>       }
>     })
>     val ugi = UserGroupInformation.createRemoteUser("test")
>     ugi.addCredentials(creds1)
>     ugi.doAs(new PrivilegedExceptionAction[Void] {
>       // Get a copy of the credentials
>       override def run(): Void = {
>         var i = 0
>         while (true) {
>           val creds1 = new org.apache.hadoop.security.Credentials()
>           val ugi1 =
>             UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>           ugi1.doAs(new PrivilegedExceptionAction[Void] {
>             // Get a copy of the credentials
>             override def run(): Void = {
>               val fs = FileSystem.get(new Configuration())
>               fs.addDelegationTokens("test", creds1)
>               null
>             }
>           })
>           UserGroupInformation.getCurrentUser.addCredentials(creds1)
>           val fs = FileSystem.get( new Configuration())
>           i += 1
>           println()
>           println(i)
>           println(fs.listFiles(new Path("/user"), false))
>           Thread.sleep(60 * 1000)
>         }
>         null
>       }
>     })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration on the NameNode:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
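The three settings in the reproduction steps are written in minutes for readability. As a rough sketch of the arithmetic behind "after 3 minutes", the Java snippet below converts them to milliseconds and computes when a token that is never renewed would first be rejected. It assumes, as in stock Hadoop as far as I know, that these properties take millisecond values and that an unrenewed token stops being valid once its renew interval elapses, capped by the maximum lifetime. The class TokenLifetimeSketch and its methods are hypothetical names used only for illustration.

```java
import java.util.concurrent.TimeUnit;

public class TokenLifetimeSketch {

    // Convert the minute values quoted in the report to milliseconds.
    static long minutesToMillis(long minutes) {
        return TimeUnit.MINUTES.toMillis(minutes);
    }

    // Assumption: a token that is never renewed is rejected after the renew
    // interval, and can never outlive the maximum lifetime.
    static long firstExpiryMillis(long renewIntervalMs, long maxLifetimeMs) {
        return Math.min(renewIntervalMs, maxLifetimeMs);
    }

    public static void main(String[] args) {
        long maxLifetime = minutesToMillis(10);
        long keyUpdateInterval = minutesToMillis(3);
        long renewInterval = minutesToMillis(3);

        System.out.println("dfs.namenode.delegation.token.max-lifetime = " + maxLifetime);
        System.out.println("dfs.namenode.delegation.key.update-interval = " + keyUpdateInterval);
        System.out.println("dfs.namenode.delegation.token.renew-interval = " + renewInterval);

        long expiry = firstExpiryMillis(renewInterval, maxLifetime);
        System.out.println("first expiry without renewal (minutes) = "
                + TimeUnit.MILLISECONDS.toMinutes(expiry));
    }
}
```

Under those assumptions the first expiry lands at 3 minutes, which is consistent with the report: the loop fetches a fresh delegation token every minute, but the per-NameNode private tokens keep the original one, and nothing renews it.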
> The stacktrace is:
> {code}
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.io.retry.Retry
[jira] [Updated] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode
[ https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HDFS-9276:
Attachment: HDFS-9276.19.patch

Patch 19:
* Fix the unit test bug
* Bring back {{publicService}} related code

Passed units. Passed {{HDFSReadLoop}} spark test.

> Failed to Update HDFS Delegation Token for long running application in HA mode
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: fs, ha, security
> Affects Versions: 2.7.1
> Reporter: Liangliang Gu
> Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch,
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch,
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch,
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch,
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch,
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch,
> HDFS-9276.18.patch, HDFS-9276.19.patch, HDFSReadLoop.scala, debug1.PNG,
> debug2.PNG
[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode
[ https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393181#comment-15393181 ]

John Zhuge commented on HDFS-9276:

And because of the wrong unit test, changes in 17 and 18 were not properly validated. It looks like {{publicService}} is needed for {{PrivateToken}} because it does need 2 services: 1 for lookup and 1 for refreshing private tokens.

Learning from the mistake, I will test the next patch against my Spark program {{HDFSReadLoop.scala}} in addition to the unit test.

> Failed to Update HDFS Delegation Token for long running application in HA mode
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: fs, ha, security
> Affects Versions: 2.7.1
> Reporter: Liangliang Gu
> Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch,
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch,
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch,
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch,
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch,
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch,
> HDFS-9276.18.patch, HDFSReadLoop.scala, debug1.PNG, debug2.PNG
[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode
[ https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393157#comment-15393157 ] John Zhuge commented on HDFS-9276: -- [~xiaochen], Thanks for the catch on the unit test! Patch 17 introduced the regression.
[jira] [Commented] (HDFS-10620) StringBuilder created and appended even if logging is disabled
[ https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393046#comment-15393046 ] John Zhuge commented on HDFS-10620: --- Hi [~ajisakaa], when you merged the fix to branch-2 and branch-2.8, you replaced "&&" with "&": {code} private void addToInvalidates(Block b) { ... if (datanodes != null & datanodes.length() != 0) { blockLog.debug("BLOCK* addToInvalidates: {} {}", b, datanodes); } } {code} This resulted in {{TestFailoverWithBlockTokensEnabled}} test failures. > StringBuilder created and appended even if logging is disabled > -- > > Key: HDFS-10620 > URL: https://issues.apache.org/jira/browse/HDFS-10620 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.4 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10620.001.patch, HDFS-10620.002.patch > > > In BlockManager.addToInvalidates the StringBuilder is appended to during the > delete even if logging isn't active. > Could avoid allocating the StringBuilder as well, but not sure if it is > really worth it to add null handling in the code.
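For readers following the HDFS-10620 comment above, here is a minimal, self-contained sketch of why the single "&" matters. It is not the BlockManager code: the class and method names (ShortCircuitDemo, shouldLogShortCircuit, shouldLogEager) are hypothetical, and the only assumption carried over from the snippet is that {{datanodes}} is a StringBuilder that may be null. In Java, "&&" skips its right-hand operand when the left-hand side is false, while "&" on booleans always evaluates both sides, so the null check no longer protects the {{length()}} call.

```java
// Illustrative sketch only; these names do not appear in Hadoop.
class ShortCircuitDemo {

    // "&&" short-circuits: length() is never called when datanodes is null.
    static boolean shouldLogShortCircuit(StringBuilder datanodes) {
        return datanodes != null && datanodes.length() != 0;
    }

    // "&" evaluates both operands, so a null argument reaches length()
    // and a NullPointerException is thrown.
    static boolean shouldLogEager(StringBuilder datanodes) {
        return datanodes != null & datanodes.length() != 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldLogShortCircuit(null));
        System.out.println(shouldLogShortCircuit(new StringBuilder("dn1 ")));
        try {
            shouldLogEager(null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from the '&' variant");
        }
    }
}
```

Both variants behave the same for a non-null argument; they differ only when the argument is null, which would be consistent with the problem surfacing only in some tests.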
[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode
[ https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392940#comment-15392940 ] John Zhuge commented on HDFS-9276: -- Test error in {{TestBalancerWithSaslDataTransfer.testBalancer0Integrity}} not related.