[jira] [Updated] (HDFS-10835) Typo in httpfs.sh hadoop_usage

2016-09-02 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10835:
--
Status: Patch Available  (was: Open)

> Typo in httpfs.sh hadoop_usage
> --
>
> Key: HDFS-10835
> URL: https://issues.apache.org/jira/browse/HDFS-10835
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: httpfs
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
> Attachments: HDFS-10835.001.patch
>
>
> Typo in method {{hadoop_usage}} of {{httpfs.sh}}. Each occurrence of {{kms}} 
> should be replaced with {{httpfs}}:
> {noformat}
> function hadoop_usage
> {
>   hadoop_add_subcommand "run" "Start kms in the current window"
>   hadoop_add_subcommand "run -security" "Start in the current window with 
> security manager"
>   hadoop_add_subcommand "start" "Start kms in a separate window"
>   hadoop_add_subcommand "start -security" "Start in a separate window with 
> security manager"
>   hadoop_add_subcommand "status" "Return the LSB compliant status"
>   hadoop_add_subcommand "stop" "Stop kms, waiting up to 5 seconds for the 
> process to end"
>   hadoop_add_subcommand "top n" "Stop kms, waiting up to n seconds for the 
> process to end"
>   hadoop_add_subcommand "stop -force" "Stop kms, wait up to 5 seconds and 
> then use kill -KILL if still running"
>   hadoop_add_subcommand "stop n -force" "Stop kms, wait up to n seconds and 
> then use kill -KILL if still running"
>   hadoop_generate_usage "${MYNAME}" false
> }
> {noformat}






[jira] [Updated] (HDFS-10835) Typo in httpfs.sh hadoop_usage

2016-09-02 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10835:
--
Attachment: HDFS-10835.001.patch

Patch 001:
* Replace occurrences of {{kms}} with {{httpfs}}
* No unit test for startup script

Manual test output:
{noformat}
$ sbin/httpfs.sh -h
Usage: httpfs.sh [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]

  OPTIONS is none or any of:

--config dir   Hadoop config directory
--debugturn on shell script debug mode
--help usage information

  SUBCOMMAND is one of:

run -security Start in the current window with security manager
run   Start httpfs in the current window
start -security   Start in a separate window with security manager
start Start httpfs in a separate window
statusReturn the LSB compliant status
stop -force   Stop httpfs, wait up to 5 seconds and then use kill -KILL if 
still running
stop n -force Stop httpfs, wait up to n seconds and then use kill -KILL if 
still running
stop  Stop httpfs, waiting up to 5 seconds for the process to end
top n Stop httpfs, waiting up to n seconds for the process to end

SUBCOMMAND may print help when invoked w/o parameters or with -h.
{noformat}

> Typo in httpfs.sh hadoop_usage
> --
>
> Key: HDFS-10835
> URL: https://issues.apache.org/jira/browse/HDFS-10835
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: httpfs
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
> Attachments: HDFS-10835.001.patch
>
>
> Typo in method {{hadoop_usage}} of {{httpfs.sh}}. Each occurrence of {{kms}} 
> should be replaced with {{httpfs}}:
> {noformat}
> function hadoop_usage
> {
>   hadoop_add_subcommand "run" "Start kms in the current window"
>   hadoop_add_subcommand "run -security" "Start in the current window with 
> security manager"
>   hadoop_add_subcommand "start" "Start kms in a separate window"
>   hadoop_add_subcommand "start -security" "Start in a separate window with 
> security manager"
>   hadoop_add_subcommand "status" "Return the LSB compliant status"
>   hadoop_add_subcommand "stop" "Stop kms, waiting up to 5 seconds for the 
> process to end"
>   hadoop_add_subcommand "top n" "Stop kms, waiting up to n seconds for the 
> process to end"
>   hadoop_add_subcommand "stop -force" "Stop kms, wait up to 5 seconds and 
> then use kill -KILL if still running"
>   hadoop_add_subcommand "stop n -force" "Stop kms, wait up to n seconds and 
> then use kill -KILL if still running"
>   hadoop_generate_usage "${MYNAME}" false
> }
> {noformat}






[jira] [Created] (HDFS-10835) Typo in httpfs.sh hadoop_usage

2016-09-02 Thread John Zhuge (JIRA)
John Zhuge created HDFS-10835:
-

 Summary: Typo in httpfs.sh hadoop_usage
 Key: HDFS-10835
 URL: https://issues.apache.org/jira/browse/HDFS-10835
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: httpfs
Affects Versions: 2.6.0
Reporter: John Zhuge
Assignee: John Zhuge
Priority: Trivial


Typo in method {{hadoop_usage}} of {{httpfs.sh}}. Each occurrence of {{kms}} 
should be replaced with {{httpfs}}:
{noformat}
function hadoop_usage
{
  hadoop_add_subcommand "run" "Start kms in the current window"
  hadoop_add_subcommand "run -security" "Start in the current window with 
security manager"
  hadoop_add_subcommand "start" "Start kms in a separate window"
  hadoop_add_subcommand "start -security" "Start in a separate window with 
security manager"
  hadoop_add_subcommand "status" "Return the LSB compliant status"
  hadoop_add_subcommand "stop" "Stop kms, waiting up to 5 seconds for the 
process to end"
  hadoop_add_subcommand "top n" "Stop kms, waiting up to n seconds for the 
process to end"
  hadoop_add_subcommand "stop -force" "Stop kms, wait up to 5 seconds and then 
use kill -KILL if still running"
  hadoop_add_subcommand "stop n -force" "Stop kms, wait up to n seconds and 
then use kill -KILL if still running"
  hadoop_generate_usage "${MYNAME}" false
}
{noformat}






[jira] [Commented] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent

2016-09-02 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15459574#comment-15459574
 ] 

John Zhuge commented on HDFS-10684:
---

Thanks [~andrew.wang] for the comment. I am looking into unit testing to ensure 
compatibility.

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>Priority: Blocker
>  Labels: compatibility, webhdfs
> Attachments: HDFS-10684.001-branch-2.patch
>
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.






[jira] [Commented] (HDFS-10822) Log DataNodes in the write pipeline

2016-09-02 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15459222#comment-15459222
 ] 

John Zhuge commented on HDFS-10822:
---

Thanks [~eddyxu] for the review and commit.

> Log DataNodes in the write pipeline
> ---
>
> Key: HDFS-10822
> URL: https://issues.apache.org/jira/browse/HDFS-10822
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: supportability
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-10822.001.patch
>
>
> While trying to diagnose a slow HDFS flush (taking longer than 10 seconds), I 
> did not know which DNs were involved in the write pipeline. Of course, I could 
> search the NN log for the list of DNs, but that is not always possible or 
> convenient.
> Propose to add a DEBUG trace to DataStreamer#setPipeline to print the list of 
> DNs in the pipeline.
> BTW, we do print the list of DNs during pipeline recovery.






[jira] [Comment Edited] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent

2016-09-01 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446784#comment-15446784
 ] 

John Zhuge edited comment on HDFS-10684 at 9/1/16 9:52 PM:
---

[~loungerdork], I was able to reproduce the problem with NN running 2.7.1 and 
DN running 2.8.

HDFS-8435 (in 2.8 but not in 2.7) introduced the new parameter {{createparent}} 
for op {{CREATE}}.


was (Author: jzhuge):
[~loungerdork], I was able to reproduce the problem with NN running 2.7.1 and 
DN running 2.8.

HDFS-8435 (in 2.8 but not in 2.7) introduced new parameter {{createparent}}.

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>Priority: Blocker
>  Labels: compatibility, webhdfs
> Attachments: HDFS-10684.001-branch-2.patch
>
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.






[jira] [Comment Edited] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent

2016-09-01 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15456003#comment-15456003
 ] 

John Zhuge edited comment on HDFS-10684 at 9/1/16 7:22 PM:
---

Patch 001-branch-2:
* Start with a branch-2 patch because mixed version testing is only possible 
between 2.7 and branch-2.
* There is no unit testing due to the difficulty of running mixed versions of 
NNs and DNs.
* Passed the JIRA test case manually between a 2.7 NN and a branch-2 DN.


was (Author: jzhuge):
Patch 001-branch-2:
* Start with branch-2 patch because mixed version testing is only probably 
between 2.7 and branch-2. Should merge the change to 2.8 and later.
* There is no unit testing due to the difficulty of mixed versions of NNs and 
DNs
* Pass the JIRA test case manually between an 2.7 NN and a branch-2 DN.

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>  Labels: compatibility
> Attachments: HDFS-10684.001-branch-2.patch
>
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.






[jira] [Updated] (HDFS-10822) Log DataNodes in the write pipeline

2016-09-01 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10822:
--
Status: Patch Available  (was: Open)

> Log DataNodes in the write pipeline
> ---
>
> Key: HDFS-10822
> URL: https://issues.apache.org/jira/browse/HDFS-10822
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: supportability
> Attachments: HDFS-10822.001.patch
>
>
> While trying to diagnose a slow HDFS flush (taking longer than 10 seconds), I 
> did not know which DNs were involved in the write pipeline. Of course, I could 
> search the NN log for the list of DNs, but that is not always possible or 
> convenient.
> Propose to add a DEBUG trace to DataStreamer#setPipeline to print the list of 
> DNs in the pipeline.
> BTW, we do print the list of DNs during pipeline recovery.






[jira] [Updated] (HDFS-10822) Log DataNodes in the write pipeline

2016-09-01 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10822:
--
Attachment: HDFS-10822.001.patch

Patch 001:
* {{initDataStreaming}} is a better place to add the DEBUG msg because the log 
msg will then include {{file}}, {{block}}, and {{pipeline}} together.
* No unit test because the patch only adds a DEBUG msg.

{{TestDataStream}} log that shows the new DEBUG msg:
{noformat}
2016-09-01 11:54:18,470 [DataStreamer for file /file1 block 
BP-1484500550-172.16.1.255-1472756056162:blk_1073741825_1001] DEBUG 
hdfs.DataStreamer (DataStreamer.java:initDataStreaming(512)) - nodes 
[DatanodeInfoWithStorage[127.0.0.1:61765,DS-124a1018-6033-4f81-bc43-cba740bd9538,DISK]]
 storageTypes [DISK] storageIDs [DS-124a1018-6033-4f81-bc43-cba740bd9538]
{noformat}
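
For illustration only, here is a minimal self-contained sketch of the kind of 
DEBUG trace described above. It is not the committed patch: the SLF4J-style 
logging and the {{nodes}}/{{storageTypes}}/{{storageIDs}} parameter names are 
assumptions inferred from the log line.
{code}
import java.util.Arrays;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch of a write-pipeline DEBUG trace (not the actual patch).
class PipelineLogSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(PipelineLogSketch.class);

  // Logs the DNs, storage types, and storage IDs of the write pipeline,
  // mirroring the "nodes [...] storageTypes [...] storageIDs [...]" line above.
  static void logPipeline(String[] nodes, String[] storageTypes,
                          String[] storageIDs) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("nodes {} storageTypes {} storageIDs {}",
          Arrays.toString(nodes), Arrays.toString(storageTypes),
          Arrays.toString(storageIDs));
    }
  }
}
{code}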

> Log DataNodes in the write pipeline
> ---
>
> Key: HDFS-10822
> URL: https://issues.apache.org/jira/browse/HDFS-10822
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: supportability
> Attachments: HDFS-10822.001.patch
>
>
> While trying to diagnose a slow HDFS flush (taking longer than 10 seconds), I 
> did not know which DNs were involved in the write pipeline. Of course, I could 
> search the NN log for the list of DNs, but that is not always possible or 
> convenient.
> Propose to add a DEBUG trace to DataStreamer#setPipeline to print the list of 
> DNs in the pipeline.
> BTW, we do print the list of DNs during pipeline recovery.






[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent

2016-09-01 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10684:
--
Status: Patch Available  (was: Open)

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>  Labels: compatibility
> Attachments: HDFS-10684.001-branch-2.patch
>
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.






[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent

2016-09-01 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10684:
--
Attachment: HDFS-10684.001-branch-2.patch

Patch 001-branch-2:
* Start with a branch-2 patch because mixed version testing is only possible 
between 2.7 and branch-2. Should merge the change to 2.8 and later.
* There is no unit testing due to the difficulty of running mixed versions of 
NNs and DNs.
* Passed the JIRA test case manually between a 2.7 NN and a branch-2 DN.

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>  Labels: compatibility
> Attachments: HDFS-10684.001-branch-2.patch
>
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.






[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent

2016-09-01 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10684:
--
Labels: compatibility  (was: )

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>  Labels: compatibility
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.






[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent

2016-09-01 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10684:
--
Target Version/s: 2.8.0

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>  Labels: compatibility
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.






[jira] [Commented] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent

2016-09-01 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15455957#comment-15455957
 ] 

John Zhuge commented on HDFS-10684:
---

To support mixed versions of NNs and DNs in webhdfs, we have to make sure a 
{{null str}} is handled in the {{XParam(final String str)}} constructors, 
either by passing {{DEFAULT}} to the superclass or by having the superclass 
handle {{null str}} already:
{code}
public XParam(final String str) {
  super(DOMAIN, DOMAIN.parse(str == null ? DEFAULT : str));
}
{code}
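
As a self-contained illustration of that fallback, here is a hedged sketch; the 
class name, the {{DEFAULT}} value, and the lenient boolean parsing are 
stand-ins for the real webhdfs param machinery, not the actual classes.
{code}
// Hypothetical sketch: a boolean query parameter that treats an absent
// (null) value as its server-side default instead of failing to parse.
class CreateParentParamSketch {
  static final boolean DEFAULT = true; // assumed default, for illustration

  final boolean value;

  CreateParentParamSketch(final String str) {
    // A query parameter missing from the URL arrives as null; fall back to
    // DEFAULT rather than failing like the IllegalArgumentException
    // ("Failed to parse \"null\" to Boolean") in the issue description.
    this.value = (str == null) ? DEFAULT : Boolean.parseBoolean(str);
  }

  public static void main(String[] args) {
    System.out.println(new CreateParentParamSketch(null).value);    // true
    System.out.println(new CreateParentParamSketch("false").value); // false
  }
}
{code}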

A quick survey between 2.7 and branch-2.8 yields the following new params:
* CreateFlagParam
* NoRedirectParam
* StartAfterParam
* CreateParentParam, not new, but its usage was expanded to {{CREATE}}

No new params were added between branch-2.8 and trunk.

I don't know whether there is any other case like {{CreateParentParam}} where 
an existing param's usage was expanded. If anybody comes across one, please 
let me know or file a JIRA.



> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.






[jira] [Commented] (HDFS-10822) Log DataNodes in the write pipeline

2016-08-31 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453441#comment-15453441
 ] 

John Zhuge commented on HDFS-10822:
---

Thanks [~kihwal] for bringing up HDFS-8791.

> Log DataNodes in the write pipeline
> ---
>
> Key: HDFS-10822
> URL: https://issues.apache.org/jira/browse/HDFS-10822
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: supportability
>
> While trying to diagnose a slow HDFS flush (taking longer than 10 seconds), I 
> did not know which DNs were involved in the write pipeline. Of course, I could 
> search the NN log for the list of DNs, but that is not always possible or 
> convenient.
> Propose to add a DEBUG trace to DataStreamer#setPipeline to print the list of 
> DNs in the pipeline.
> BTW, we do print the list of DNs during pipeline recovery.






[jira] [Created] (HDFS-10822) Log DataNodes in the write pipeline

2016-08-31 Thread John Zhuge (JIRA)
John Zhuge created HDFS-10822:
-

 Summary: Log DataNodes in the write pipeline
 Key: HDFS-10822
 URL: https://issues.apache.org/jira/browse/HDFS-10822
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 2.6.0
Reporter: John Zhuge
Assignee: John Zhuge
Priority: Trivial


While trying to diagnose a slow HDFS flush (taking longer than 10 seconds), I 
did not know which DNs were involved in the write pipeline. Of course, I could 
search the NN log for the list of DNs, but that is not always possible or 
convenient.

Propose to add a DEBUG trace to DataStreamer#setPipeline to print the list of 
DNs in the pipeline.

BTW, we do print the list of DNs during pipeline recovery.






[jira] [Updated] (HDFS-4210) Helpful exception when DNS entry for JournalNode cannot be resolved

2016-08-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-4210:
-
Attachment: HDFS-4210.004.patch

Patch 004:
* Addressed Xiao's review comments
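
For reference, a minimal sketch of the kind of check this issue asks for; it is 
not the attached patch, and the class and method names are hypothetical. The 
idea is to resolve each JournalNode address up front and throw an 
{{UnknownHostException}} that names the host, instead of the 
{{NullPointerException}} in the stack trace below.
{code}
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical sketch: fail with a helpful UnknownHostException when a
// JournalNode DNS entry cannot be resolved.
class JournalNodeDnsCheckSketch {
  static InetSocketAddress resolve(String host, int port)
      throws UnknownHostException {
    // InetSocketAddress attempts DNS resolution eagerly in this constructor.
    InetSocketAddress addr = new InetSocketAddress(host, port);
    if (addr.isUnresolved()) {
      throw new UnknownHostException(
          "Cannot resolve DNS entry for JournalNode " + host + ":" + port);
    }
    return addr;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(resolve("localhost", 8485));            // resolves
    System.out.println(resolve("cdh4worker03.invalid", 8485)); // throws
  }
}
{code}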

> Helpful exception when DNS entry for JournalNode cannot be resolved
> ---
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, 
> HDFS-4210.003.patch, HDFS-4210.004.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setting, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if the quorum is up, format should not fail.






[jira] [Updated] (HDFS-4210) Helpful exception when DNS entry for JournalNode cannot be resolved

2016-08-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-4210:
-
Summary: Helpful exception when DNS entry for JournalNode cannot be 
resolved  (was: NameNode should throw UnknownHostException when a DNS entry for 
a quorum node cannot be resolved)

> Helpful exception when DNS entry for JournalNode cannot be resolved
> ---
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, 
> HDFS-4210.003.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setting, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if the quorum is up, format should not fail.






[jira] [Updated] (HDFS-4210) NameNode should throw UnknownHostException when a DNS entry for a quorum node cannot be resolved

2016-08-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-4210:
-
Summary: NameNode should throw UnknownHostException when a DNS entry for a 
quorum node cannot be resolved  (was: NameNode should fail when a DNS entry for 
a quorum node cannot be resolved)

> NameNode should throw UnknownHostException when a DNS entry for a quorum node 
> cannot be resolved
> 
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, 
> HDFS-4210.003.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setting, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if the quorum is up, format should not fail.






[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail when parameter createparent not provided

2016-08-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10684:
--
Summary: WebHDFS DataNode calls fail when parameter createparent not 
provided  (was: WebHDFS DataNode calls fail when boolean parameters not 
provided)

> WebHDFS DataNode calls fail when parameter createparent not provided
> 
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.






[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail without parameter createparent

2016-08-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10684:
--
Summary: WebHDFS DataNode calls fail without parameter createparent  (was: 
WebHDFS DataNode calls fail when parameter createparent not provided)

> WebHDFS DataNode calls fail without parameter createparent
> --
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.






[jira] [Commented] (HDFS-4210) NameNode should fail when a DNS entry for a quorum node cannot be resolved

2016-08-29 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446829#comment-15446829
 ] 

John Zhuge commented on HDFS-4210:
--

Changed the title to reflect the intended fix and to be consistent with HDFS-4957.

> NameNode should fail when a DNS entry for a quorum node cannot be resolved
> --
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, 
> HDFS-4210.003.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setting, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if the quorum is up, format should not fail.






[jira] [Updated] (HDFS-4210) NameNode should fail when a DNS entry for a quorum node cannot be resolved

2016-08-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-4210:
-
Summary: NameNode should fail when a DNS entry for a quorum node cannot be 
resolved  (was: NameNode should fail on JournalNode DNS resolution failure)

> NameNode should fail when a DNS entry for a quorum node cannot be resolved
> --
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, 
> HDFS-4210.003.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setting, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if the quorum is up, format should not fail.






[jira] [Assigned] (HDFS-4957) NameNode failover should not fail because a DNS entry for a quorum node cannot be resolved

2016-08-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge reassigned HDFS-4957:


Assignee: John Zhuge

> NameNode failover should not fail because a DNS entry for a quorum node 
> cannot be resolved
> --
>
> Key: HDFS-4957
> URL: https://issues.apache.org/jira/browse/HDFS-4957
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm
>Affects Versions: 2.3.0, 2.6.0
>Reporter: Colin P. McCabe
>Assignee: John Zhuge
>
> When a StandbyNameNode is becoming active, we should not bail out because a 
> DNS entry for a quorum node cannot be resolved.  Currently it does fail in 
> this scenario, with a message like this:
> {code}
> 2013-07-03 21:28:40,576 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2013-07-03 21:28:40,579 FATAL 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring 
> NN shutdown. Shutting down immediately.
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://hadoop-mm:8485;hadoop-nn-0:8485;hadoop-nn-1:8485/hadoop
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1254)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:722)
> 
> {code}
> reported by Matt Bookman



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-4210) NameNode should fail on JournalNode DNS resolution failure

2016-08-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-4210:
-
Summary: NameNode should fail on JournalNode DNS resolution failure  (was: 
NameNode Format should not fail for DNS resolution on minority of JournalNode)

> NameNode should fail on JournalNode DNS resolution failure
> --
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, 
> HDFS-4210.003.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setup, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if a quorum is up, format should not fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10684) WebHDFS DataNode calls fail when boolean parameters not provided

2016-08-29 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446784#comment-15446784
 ] 

John Zhuge commented on HDFS-10684:
---

[~loungerdork], I was able to reproduce the problem with NN running 2.7.1 and 
DN running 2.8.

HDFS-8435 (in 2.8 but not in 2.7) introduced the new parameter {{createparent}}.
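
To illustrate the failure mode, here is a hypothetical sketch of a DN-side 
boolean parameter parse; the class and method names are illustrative, not the 
actual WebHDFS parameter classes. A strict parse of an absent value produces 
the "Failed to parse \"null\" to Boolean" error above, while falling back to 
the parameter's default avoids it:
{code:java}
// Hypothetical illustration -- not the actual WebHDFS parameter classes.
public final class BooleanParamDemo {

  // Strict parse: rejects anything but "true"/"false", which yields the
  // "Failed to parse \"null\" to Boolean." failure seen on the DN.
  static boolean parseStrict(String value) {
    if ("true".equalsIgnoreCase(value) || "false".equalsIgnoreCase(value)) {
      return Boolean.parseBoolean(value);
    }
    throw new IllegalArgumentException(
        "Failed to parse \"" + value + "\" to Boolean.");
  }

  // Tolerant parse: an absent parameter falls back to its documented default.
  static boolean parseWithDefault(String value, boolean defaultValue) {
    return value == null ? defaultValue : parseStrict(value);
  }

  public static void main(String[] args) {
    System.out.println(parseWithDefault(null, true)); // true, no exception
    parseStrict(null);                                // throws IllegalArgumentException
  }
}
{code}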

> WebHDFS DataNode calls fail when boolean parameters not provided
> 
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode

2016-08-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-4210:
-
Attachment: HDFS-4210.003.patch

Patch 003:
* Fix checkstyle

The {{TestRBWBlockInvalidation}} failure reported against patch 002 passes locally.

> NameNode Format should not fail for DNS resolution on minority of JournalNode
> -
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch, 
> HDFS-4210.003.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setup, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if a quorum is up, format should not fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode

2016-08-28 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-4210:
-
Attachment: HDFS-4210.002.patch

Patch 002:
* Throw {{UnknownHostException}} earlier in {{getLoggerAddresses}} when a JN 
hostname cannot be resolved
* Use the JUnit rule ExpectedException (see the sketch below)
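
A minimal sketch of that rule-based pattern; the resolver below is a 
placeholder standing in for the QJM call under test, not the patch code:
{code:java}
import java.net.UnknownHostException;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.ExpectedException;

public class TestUnresolvableJournalNode {
  @Rule
  public ExpectedException exception = ExpectedException.none();

  @Test
  public void testUnresolvableHostNameFailsGracefully() throws Exception {
    // declare the expectation before invoking the code that should throw
    exception.expect(UnknownHostException.class);
    exception.expectMessage("bogus.invalid");
    resolveLoggerAddresses("bogus.invalid:8485");
  }

  // placeholder for the QuorumJournalManager resolution path under test
  private void resolveLoggerAddresses(String hostPort)
      throws UnknownHostException {
    throw new UnknownHostException("Cannot resolve JournalNode: " + hostPort);
  }
}
{code}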

> NameNode Format should not fail for DNS resolution on minority of JournalNode
> -
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch, HDFS-4210.002.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setup, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if a quorum is up, format should not fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode

2016-08-28 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15444786#comment-15444786
 ] 

John Zhuge edited comment on HDFS-4210 at 8/29/16 4:30 AM:
---

Discovered that {{UnknownHostException}} is already thrown and caught in the 
following call path, earlier than the NPE call path listed in the JIRA 
description:
{noformat}
  at 
org.apache.hadoop.net.NetUtils.createSocketAddrForHost(NetUtils.java:245)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:217)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.getLoggerAddresses(QuorumJournalManager.java:390)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:364)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:116)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:105)
  at 
org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults.testUnresolvableHostNameFailsGracefully(TestQJMWithFaults.java:201)
{noformat}

{code:title='createSocketAddrForHost'}
} catch (UnknownHostException e) {
  addr = InetSocketAddress.createUnresolved(host, port);
}
{code}
{{createSocketAddrForHost}} swallows the UHE and creates an unresolved 
{{InetSocketAddress}}. Callers are expected to check with {{isUnresolved}}.

{{getLoggerAddresses}} is the earliest opportunity to throw UHE.
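
A minimal sketch of that fix, assuming {{NetUtils}} from hadoop-common 
(illustrative, not the exact patch code):
{code:java}
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.net.NetUtils;

// Fail fast with UnknownHostException instead of letting an unresolved
// address reach IPCLoggerChannelMetrics, where it triggers the NPE above.
static List<InetSocketAddress> getLoggerAddresses(List<String> hostPorts)
    throws UnknownHostException {
  List<InetSocketAddress> addrs = new ArrayList<>();
  for (String hostPort : hostPorts) {
    // createSocketAddr swallows the resolution failure and hands back an
    // unresolved address, so the caller has to check isUnresolved().
    InetSocketAddress addr = NetUtils.createSocketAddr(hostPort);
    if (addr.isUnresolved()) {
      throw new UnknownHostException("Cannot resolve JournalNode: " + hostPort);
    }
    addrs.add(addr);
  }
  return addrs;
}
{code}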


was (Author: jzhuge):
Discovered that {{UnknownHostException}} is already thrown and caught in a 
call path earlier than the NPE call path listed in the JIRA description:
{noformat}
  at 
org.apache.hadoop.net.NetUtils.createSocketAddrForHost(NetUtils.java:245)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:217)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.getLoggerAddresses(QuorumJournalManager.java:390)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:364)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:116)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:105)
  at 
org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults.testUnresolvableHostNameFailsGracefully(TestQJMWithFaults.java:201)
{noformat}

{code:title='createSocketAddrForHost'}
} catch (UnknownHostException e) {
  addr = InetSocketAddress.createUnresolved(host, port);
}
{code}
{{createSocketAddrForHost}} swallows the UHE and creates an unresolved 
{{InetSocketAddress}}. Callers are expected to check with {{isUnresolved}}.

{{getLoggerAddresses}} is the earliest opportunity to throw UHE.

> NameNode Format should not fail for DNS resolution on minority of JournalNode
> -
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setup, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)

[jira] [Commented] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode

2016-08-28 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15444786#comment-15444786
 ] 

John Zhuge commented on HDFS-4210:
--

Discovered that {{UnknownHostException}} is already thrown and caught in a 
call path earlier than the NPE call path listed in the JIRA description:
{noformat}
  at 
org.apache.hadoop.net.NetUtils.createSocketAddrForHost(NetUtils.java:245)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:217)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.getLoggerAddresses(QuorumJournalManager.java:390)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:364)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:116)
  at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:105)
  at 
org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults.testUnresolvableHostNameFailsGracefully(TestQJMWithFaults.java:201)
{noformat}

{code:title='createSocketAddrForHost'}
} catch (UnknownHostException e) {
  addr = InetSocketAddress.createUnresolved(host, port);
}
{code}
{{createSocketAddrForHost}} swallows the UHE and creates an unresolved 
{{InetSocketAddress}}. Callers are expected to check with {{isUnresolved}}.

{{getLoggerAddresses}} is the earliest opportunity to throw UHE.

> NameNode Format should not fail for DNS resolution on minority of JournalNode
> -
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setup, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if a quorum is up, format should not fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode

2016-08-24 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436050#comment-15436050
 ] 

John Zhuge commented on HDFS-4210:
--

[~clamb]/[~cwl], I am taking over the jira in order to push it over the finish 
line. Hope that is ok with you.

> NameNode Format should not fail for DNS resolution on minority of JournalNode
> -
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setup, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if a quorum is up, format should not fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode

2016-08-24 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge reassigned HDFS-4210:


Assignee: John Zhuge  (was: Charles Lamb)

> NameNode Format should not fail for DNS resolution on minority of JournalNode
> -
>
> Key: HDFS-4210
> URL: https://issues.apache.org/jira/browse/HDFS-4210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 2.6.0
>Reporter: Damien Hardy
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: BB2015-05-TBR
> Attachments: HDFS-4210.001.patch
>
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes up and running,
>   cdh4worker03 not yet provisioned (no DNS entry)
> With this setup, `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
>   ... 5 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.(IPCLoggerChannel.java:161)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:104)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.(QuorumJournalManager.java:93)
>   ... 10 more
> I suggest that if a quorum is up, format should not fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10650) DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory permission

2016-08-23 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433238#comment-15433238
 ] 

John Zhuge commented on HDFS-10650:
---

Done, [~rchiang] and [~xiaochen].

> DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory 
> permission
> -
>
> Key: HDFS-10650
> URL: https://issues.apache.org/jira/browse/HDFS-10650
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10650.001.patch, HDFS-10650.002.patch
>
>
> These 2 DFSClient methods should use default directory permission to create a 
> directory.
> {code:java}
>   public boolean mkdirs(String src, FsPermission permission,
>   boolean createParent) throws IOException {
> if (permission == null) {
>   permission = FsPermission.getDefault();
> }
> {code}
> {code:java}
>   public boolean primitiveMkdir(String src, FsPermission absPermission, 
> boolean createParent)
> throws IOException {
> checkOpen();
> if (absPermission == null) {
>   absPermission = 
> FsPermission.getDefault().applyUMask(dfsClientConf.uMask);
> } 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10650) DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory permission

2016-08-23 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10650:
--
Release Note: If the caller does not supply a permission, DFSClient#mkdirs 
and DFSClient#primitiveMkdir will create a new directory with the default 
directory permission 00777 instead of 00666.
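
For reference, a minimal sketch of the two defaults involved, assuming the 
standard {{FsPermission}} helpers from hadoop-common:
{code:java}
import org.apache.hadoop.fs.permission.FsPermission;

public class DefaultPermissionDemo {
  public static void main(String[] args) {
    FsPermission dirDefault = FsPermission.getDefault();       // 00777
    FsPermission fileDefault = FsPermission.getFileDefault();  // 00666
    // with a umask of 022, the directory default yields 0755
    FsPermission umask = new FsPermission((short) 022);
    System.out.println(dirDefault.applyUMask(umask));   // rwxr-xr-x
    System.out.println(fileDefault.applyUMask(umask));  // rw-r--r--
  }
}
{code}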

> DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory 
> permission
> -
>
> Key: HDFS-10650
> URL: https://issues.apache.org/jira/browse/HDFS-10650
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10650.001.patch, HDFS-10650.002.patch
>
>
> These 2 DFSClient methods should use default directory permission to create a 
> directory.
> {code:java}
>   public boolean mkdirs(String src, FsPermission permission,
>   boolean createParent) throws IOException {
> if (permission == null) {
>   permission = FsPermission.getDefault();
> }
> {code}
> {code:java}
>   public boolean primitiveMkdir(String src, FsPermission absPermission, 
> boolean createParent)
> throws IOException {
> checkOpen();
> if (absPermission == null) {
>   absPermission = 
> FsPermission.getDefault().applyUMask(dfsClientConf.uMask);
> } 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10650) DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory permission

2016-08-23 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432296#comment-15432296
 ] 

John Zhuge commented on HDFS-10650:
---

[~rchiang], thanks for commenting on this jira. Which document to update?

> DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory 
> permission
> -
>
> Key: HDFS-10650
> URL: https://issues.apache.org/jira/browse/HDFS-10650
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10650.001.patch, HDFS-10650.002.patch
>
>
> These 2 DFSClient methods should use default directory permission to create a 
> directory.
> {code:java}
>   public boolean mkdirs(String src, FsPermission permission,
>   boolean createParent) throws IOException {
> if (permission == null) {
>   permission = FsPermission.getDefault();
> }
> {code}
> {code:java}
>   public boolean primitiveMkdir(String src, FsPermission absPermission, 
> boolean createParent)
> throws IOException {
> checkOpen();
> if (absPermission == null) {
>   absPermission = 
> FsPermission.getDefault().applyUMask(dfsClientConf.uMask);
> } 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-10 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416590#comment-15416590
 ] 

John Zhuge commented on HDFS-8897:
--

Thanks [~jojochuang] for the review and commit!

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-8897-branch-2.006.patch, HDFS-8897.001.patch, 
> HDFS-8897.002.patch, HDFS-8897.003.patch, HDFS-8897.004.patch, 
> HDFS-8897.005.patch, HDFS-8897.006.patch
>
>
> When the balancer is launched, it should test whether a /system/balancer.id 
> file already exists in HDFS.
> When the file doesn't exist, the balancer still refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log while trying to run the balancer, the balancer 
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id 
> doesn't exist before entering this function; if it exists, it is deleted 
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer 
> is
> // running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch(RemoteException e) {
>   
> if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards
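
An illustrative sketch of the kind of normalization the fix implies (not the 
actual patch): treat hdfs://sandbox/ and hdfs://sandbox as the same logical 
namenode by dropping an empty root path before comparing URIs.
{code:java}
import java.net.URI;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

public class NamenodeUriDemo {
  // collapse URIs that differ only by a trailing slash on the root path
  static Set<URI> normalize(Collection<URI> namenodes) {
    Set<URI> unique = new LinkedHashSet<>();
    for (URI uri : namenodes) {
      String path = uri.getPath();
      if (path == null || path.isEmpty() || "/".equals(path)) {
        uri = URI.create(uri.getScheme() + "://" + uri.getAuthority());
      }
      unique.add(uri);
    }
    return unique;
  }

  public static void main(String[] args) {
    System.out.println(normalize(Arrays.asList(
        URI.create("hdfs://sandbox/"), URI.create("hdfs://sandbox"))));
    // prints [hdfs://sandbox] -- one namenode, not two
  }
}
{code}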


[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-10 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Attachment: HDFS-8897-branch-2.006.patch

Patch branch-2.006:
* Backport to branch-2 with minor conflicts

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897-branch-2.006.patch, HDFS-8897.001.patch, 
> HDFS-8897.002.patch, HDFS-8897.003.patch, HDFS-8897.004.patch, 
> HDFS-8897.005.patch, HDFS-8897.006.patch
>
>
> When the balancer is launched, it should test whether a /system/balancer.id 
> file already exists in HDFS.
> When the file doesn't exist, the balancer still refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log while trying to run the balancer, the balancer 
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id 
> doesn't exist before entering this function; if it exists, it is deleted 
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer 
> is
> // running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch(RemoteException e) {
>   
> if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-10 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416100#comment-15416100
 ] 

John Zhuge commented on HDFS-8897:
--

Sure [~jojochuang].

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch, HDFS-8897.004.patch, HDFS-8897.005.patch, 
> HDFS-8897.006.patch
>
>
> When the balancer is launched, it should test whether a /system/balancer.id 
> file already exists in HDFS.
> When the file doesn't exist, the balancer still refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log while trying to run the balancer, the balancer 
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id 
> doesn't exist before entering this function; if it exists, it is deleted 
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer 
> is
> // running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch(RemoteException e) {
>   
> if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode

2016-08-10 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-6962:
-
Attachment: (was: run)

> ACLs inheritance conflict with umaskmode
> 
>
> Key: HDFS-6962
> URL: https://issues.apache.org/jira/browse/HDFS-6962
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: CentOS release 6.5 (Final)
>Reporter: LINTE
>Assignee: John Zhuge
>Priority: Critical
>  Labels: hadoop, security
> Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, 
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, 
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, 
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, 
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, 
> run_compat_tests, run_unit_tests, test_plan.md
>
>
> In hdfs-site.xml 
> 
> dfs.umaskmode
> 027
> 
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir  /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and 
> user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ check ACLs /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- matches the umaskmode defined in 
> hdfs-site.xml, everything is OK!
> default:group:readwrite:rwx allows the readwrite group rwx access for 
> inheritance.
> default:user:toto:rwx allows the toto user rwx access for inheritance.
> default:mask::rwx means the inheritance mask is rwx, so effectively no mask
> 4/ Create a subdir to test inheritance of ACL
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs
> 5/ check ACLs /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has an rwx ACL but only r-x is 
> effective, because the mask is r-x (mask::r-x) even though the default mask 
> for inheritance is set to default:mask::rwx on /tmp/ACLS/
> 6/ Modify hdfs-site.xml and restart the namenode
> 
> dfs.umaskmode
> 010
> 
> 7/ Create a subdir to test inheritance of ACL with new parameter umaskmode
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs2
> 8/ Check ACL on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:rw-
> group::r-x  #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL values (user, group and other -- except the POSIX 
> owner -- ) with the group mask of the dfs.umaskmode property when creating 
> a directory with inherited ACLs.
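
A minimal sketch of the POSIX-style rule the reporter expects (illustrative, 
not the Hadoop implementation): when the parent carries default ACL entries, 
the group-class filtering should come from the inherited default mask rather 
than from the client umask.
{code:java}
public class AclCreateModeDemo {
  // Illustrative only: choose the effective create mode for a new directory.
  // Under POSIX semantics, the umask is ignored when the parent has a
  // default ACL; the inherited default mask entry does the filtering.
  static short effectiveMode(short requested, short umask,
      boolean parentHasDefaultAcl) {
    if (parentHasDefaultAcl) {
      return requested;                  // e.g. 0777 stays 0777; mask::rwx applies
    }
    return (short) (requested & ~umask); // e.g. 0777 & ~027 = 0750
  }

  public static void main(String[] args) {
    System.out.println(Integer.toOctalString(
        effectiveMode((short) 0777, (short) 027, false))); // 750
    System.out.println(Integer.toOctalString(
        effectiveMode((short) 0777, (short) 027, true)));  // 777
  }
}
{code}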



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode

2016-08-10 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-6962:
-
Attachment: (was: unit_tests)

> ACLs inheritance conflict with umaskmode
> 
>
> Key: HDFS-6962
> URL: https://issues.apache.org/jira/browse/HDFS-6962
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: CentOS release 6.5 (Final)
>Reporter: LINTE
>Assignee: John Zhuge
>Priority: Critical
>  Labels: hadoop, security
> Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, 
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, 
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, 
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, 
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, 
> run_compat_tests, run_unit_tests, test_plan.md
>
>
> In hdfs-site.xml 
> 
> dfs.umaskmode
> 027
> 
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir  /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and 
> user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ check ACLs /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- matches the umaskmode defined in 
> hdfs-site.xml, everything is OK!
> default:group:readwrite:rwx allows the readwrite group rwx access for 
> inheritance.
> default:user:toto:rwx allows the toto user rwx access for inheritance.
> default:mask::rwx means the inheritance mask is rwx, so effectively no mask
> 4/ Create a subdir to test inheritance of ACL
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs
> 5/ check ACLs /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has an rwx ACL but only r-x is 
> effective, because the mask is r-x (mask::r-x) even though the default mask 
> for inheritance is set to default:mask::rwx on /tmp/ACLS/
> 6/ Modify hdfs-site.xml and restart the namenode
> 
> dfs.umaskmode
> 010
> 
> 7/ Create a subdir to test inheritance of ACL with new parameter umaskmode
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs2
> 8/ Check ACL on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:rw-
> group::r-x  #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL entries (user, group, and other -- except the POSIX
> owner -- ) with the group part of the dfs.umaskmode property when creating
> a directory with inherited ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode

2016-08-10 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-6962:
-
Attachment: run_unit_tests
run_compat_tests
test_plan.md

Uploaded the updated test plan and scripts.

> ACLs inheritance conflict with umaskmode
> 
>
> Key: HDFS-6962
> URL: https://issues.apache.org/jira/browse/HDFS-6962
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: CentOS release 6.5 (Final)
>Reporter: LINTE
>Assignee: John Zhuge
>Priority: Critical
>  Labels: hadoop, security
> Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, 
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, 
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, 
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, 
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, 
> run_compat_tests, run_unit_tests, test_plan.md
>
>
> In hdfs-site.xml:
> <property>
>   <name>dfs.umaskmode</name>
>   <value>027</value>
> </property>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir  /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and
> user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ check ACLs /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- match the umaskmode defined in
> hdfs-site.xml, everything is OK!
> default:group:readwrite:rwx allows the readwrite group rwx access through
> inheritance.
> default:user:toto:rwx allows the toto user rwx access through inheritance.
> default:mask::rwx means the inheritance mask is rwx, so effectively no masking.
> 4/ Create a subdir to test inheritance of ACL
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs
> 5/ check ACLs /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has an rwx ACL but only r-x is
> effective because the mask is r-x (mask::r-x), even though the default mask
> for inheritance is set to default:mask::rwx on /tmp/ACLS/.
> 6/ Modify hdfs-site.xml and restart the namenode
> <property>
>   <name>dfs.umaskmode</name>
>   <value>010</value>
> </property>
> 7/ Create a subdir to test inheritance of ACL with new parameter umaskmode
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs2
> 8/ Check ACL on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:rw-
> group::r-x  #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL entries (user, group, and other -- except the POSIX
> owner -- ) with the group part of the dfs.umaskmode property when creating
> a directory with inherited ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode

2016-08-10 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-6962:
-
Attachment: (was: test_plan.md)

> ACLs inheritance conflict with umaskmode
> 
>
> Key: HDFS-6962
> URL: https://issues.apache.org/jira/browse/HDFS-6962
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: CentOS release 6.5 (Final)
>Reporter: LINTE
>Assignee: John Zhuge
>Priority: Critical
>  Labels: hadoop, security
> Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, 
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, 
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, 
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, 
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, 
> run_compat_tests, run_unit_tests, test_plan.md
>
>
> In hdfs-site.xml:
> <property>
>   <name>dfs.umaskmode</name>
>   <value>027</value>
> </property>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir  /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and
> user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ check ACLs /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- match the umaskmode defined in
> hdfs-site.xml, everything is OK!
> default:group:readwrite:rwx allows the readwrite group rwx access through
> inheritance.
> default:user:toto:rwx allows the toto user rwx access through inheritance.
> default:mask::rwx means the inheritance mask is rwx, so effectively no masking.
> 4/ Create a subdir to test inheritance of ACL
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs
> 5/ check ACLs /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has an rwx ACL but only r-x is
> effective because the mask is r-x (mask::r-x), even though the default mask
> for inheritance is set to default:mask::rwx on /tmp/ACLS/.
> 6/ Modify hdfs-site.xml and restart the namenode
> <property>
>   <name>dfs.umaskmode</name>
>   <value>010</value>
> </property>
> 7/ Create a subdir to test inheritance of ACL with new parameter umaskmode
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs2
> 8/ Check ACL on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:rw-
> group::r-x  #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL entries (user, group, and other -- except the POSIX
> owner -- ) with the group part of the dfs.umaskmode property when creating
> a directory with inherited ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-09 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Attachment: HDFS-8897.006.patch

Patch 006:
* Fix checkstyle

Unit test failures in 005 were unrelated.

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch, HDFS-8897.004.patch, HDFS-8897.005.patch, 
> HDFS-8897.006.patch
>
>
> When the balancer is launched, it should test whether there is already a
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it exists, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another
> // balancer is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName()
>   .equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards
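
The quoted checkAndMarkRunning implements a create-if-absent lock file. Below
is a standalone sketch of the same idea using java.nio on a local filesystem
(an illustration only; local create semantics are not identical to HDFS):

{code:java}
import java.io.IOException;
import java.net.InetAddress;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class BalancerLockSketch {
  // Try to take the lock by creating the id file exclusively; CREATE_NEW
  // fails atomically if the file already exists, mirroring
  // fs.create(idPath, false) in the quoted code.
  static boolean tryLock(Path idPath) throws IOException {
    try {
      Files.write(idPath, InetAddress.getLocalHost().getHostName().getBytes(),
          StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
      return true;
    } catch (FileAlreadyExistsException e) {
      return false;  // another balancer instance holds the lock
    }
  }

  public static void main(String[] args) throws IOException {
    Path id = Paths.get("balancer.id");
    id.toFile().deleteOnExit();  // analogous to fs.deleteOnExit(idPath)
    System.out.println(tryLock(id) ? "lock acquired" : "already running");
  }
}
{code}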



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-

[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-09 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Attachment: HDFS-8897.005.patch

Patch 005:
* Fix checkstyle

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch, HDFS-8897.004.patch, HDFS-8897.005.patch
>
>
> When the balancer is launched, it should test whether there is already a
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it exists, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another
> // balancer is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName()
>   .equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-

[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-09 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413245#comment-15413245
 ] 

John Zhuge commented on HDFS-8897:
--

Test failures are unrelated.

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch, HDFS-8897.004.patch
>
>
> When the balancer is launched, it should test whether there is already a
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it exists, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another
> // balancer is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName()
>   .equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e

[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-09 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Attachment: HDFS-8897.004.patch

Patch 004:
* Fix checkstyle
* Extract reusable code into {{trimUri}}
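
As a rough illustration of the idea (a sketch only; the actual {{trimUri}} in
the patch may differ), trimming the trailing slash makes hdfs://sandbox/ and
hdfs://sandbox compare equal so the same namenode is not counted twice:

{code:java}
import java.net.URI;

public class UriTrimSketch {
  // Drop a trailing slash so "hdfs://sandbox/" and "hdfs://sandbox"
  // resolve to the same namenode URI.
  static URI trimUri(URI uri) {
    String s = uri.toString();
    return s.endsWith("/") ? URI.create(s.substring(0, s.length() - 1)) : uri;
  }

  public static void main(String[] args) {
    System.out.println(trimUri(URI.create("hdfs://sandbox/")));  // hdfs://sandbox
  }
}
{code}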

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch, HDFS-8897.004.patch
>
>
> When the balancer is launched, it should test whether there is already a
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it exists, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another
> // balancer is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName()
>   .equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-09 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Status: Patch Available  (was: In Progress)

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch, HDFS-8897.004.patch
>
>
> When the balancer is launched, it should test whether there is already a
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it exists, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another
> // balancer is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName()
>   .equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.a

[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories

2016-08-09 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413080#comment-15413080
 ] 

John Zhuge commented on HDFS-10721:
---

Awesome, the community is great!

> HDFS NFS Gateway - Exporting multiple Directories 
> --
>
> Key: HDFS-10721
> URL: https://issues.apache.org/jira/browse/HDFS-10721
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Senthilkumar
>Assignee: Senthilkumar
>Priority: Minor
>
> The current HDFS NFS gateway supports exporting only one directory.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user</value>
> </property>
> This property lets us export a particular directory.
> Code block:
> public RpcProgramMountd(NfsConfiguration config,
>     DatagramSocket registrationSocket, boolean allowInsecurePorts)
>     throws IOException {
>   // Note that RPC cache is not enabled
>   super("mountd", "localhost", config.getInt(
>       NfsConfigKeys.DFS_NFS_MOUNTD_PORT_KEY,
>       NfsConfigKeys.DFS_NFS_MOUNTD_PORT_DEFAULT), PROGRAM, VERSION_1,
>       VERSION_3, registrationSocket, allowInsecurePorts);
>   exports = new ArrayList<String>();
>   exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
>       NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
>   this.hostsMatcher = NfsExports.getInstance(config);
>   this.mounts = Collections.synchronizedList(new ArrayList<MountEntry>());
>   UserGroupInformation.setConfiguration(config);
>   SecurityUtil.login(config, NfsConfigKeys.DFS_NFS_KEYTAB_FILE_KEY,
>       NfsConfigKeys.DFS_NFS_KERBEROS_PRINCIPAL_KEY);
>   this.dfsClient = new DFSClient(NameNode.getAddress(config), config);
> }
> Export List:
> exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> The current code supports exposing only one directory; based on our
> example, only /user can be exported.
> Most production environments expect more directories to be exported, so
> that they can be mounted by different clients.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user,/data/web_crawler,/app-logs</value>
> </property>
> Here I have three directories to be exposed:
> 1) /user
> 2) /data/web_crawler
> 3) /app-logs
> This would help us mount directories for a particular client (say client A
> wants to write data in /app-logs -- the Hadoop admin can mount it and hand
> it over to the client).
> Please advise here. Sorry if this feature is already implemented.
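
A sketch of how the constructor above could accept a comma-separated
nfs.export.point instead of a single path (illustrative only, not a committed
change; {{parseExports}} is a hypothetical helper):

{code:java}
import java.util.ArrayList;
import java.util.List;

public class ExportPointParser {
  // Split a comma-separated nfs.export.point value into individual export
  // directories, trimming whitespace and skipping empty items.
  static List<String> parseExports(String value) {
    List<String> exports = new ArrayList<String>();
    for (String export : value.split(",")) {
      String trimmed = export.trim();
      if (!trimmed.isEmpty()) {
        exports.add(trimmed);
      }
    }
    return exports;
  }

  public static void main(String[] args) {
    // Prints: [/user, /data/web_crawler, /app-logs]
    System.out.println(parseExports("/user,/data/web_crawler,/app-logs"));
  }
}
{code}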



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-09 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413066#comment-15413066
 ] 

John Zhuge edited comment on HDFS-8897 at 8/9/16 6:59 AM:
--

Unit test errors were unrelated. They were likely caused by the test
environment. The failed tests pass locally.


was (Author: jzhuge):
Unit test errors were unrelated. They were likely caused by the test environment.

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch
>
>
> When the balancer is launched, it should test whether there is already a
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it exists, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another
> // balancer is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName()
>   .equals(e.getClassName())) {
> return null;
>   } else {

[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-08 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413066#comment-15413066
 ] 

John Zhuge commented on HDFS-8897:
--

Unit test errors were unrelated. They were likely caused by the test environment.

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch
>
>
> When the balancer is launched, it should test whether there is already a
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it exists, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another
> // balancer is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName()
>   .equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---

[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode

2016-08-08 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-6962:
-
Attachment: (was: test_plan.md)

> ACLs inheritance conflict with umaskmode
> 
>
> Key: HDFS-6962
> URL: https://issues.apache.org/jira/browse/HDFS-6962
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: CentOS release 6.5 (Final)
>Reporter: LINTE
>Assignee: John Zhuge
>Priority: Critical
>  Labels: hadoop, security
> Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, 
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, 
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, 
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, 
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, run, 
> test_plan.md, unit_tests
>
>
> In hdfs-site.xml:
> <property>
>   <name>dfs.umaskmode</name>
>   <value>027</value>
> </property>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir  /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and
> user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ check ACLs /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- match the umaskmode defined in
> hdfs-site.xml, everything is OK!
> default:group:readwrite:rwx allows the readwrite group rwx access through
> inheritance.
> default:user:toto:rwx allows the toto user rwx access through inheritance.
> default:mask::rwx means the inheritance mask is rwx, so effectively no masking.
> 4/ Create a subdir to test inheritance of ACL
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs
> 5/ check ACLs /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has an rwx ACL but only r-x is
> effective because the mask is r-x (mask::r-x), even though the default mask
> for inheritance is set to default:mask::rwx on /tmp/ACLS/.
> 6/ Modify hdfs-site.xml and restart the namenode
> <property>
>   <name>dfs.umaskmode</name>
>   <value>010</value>
> </property>
> 7/ Create a subdir to test inheritance of ACL with new parameter umaskmode
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs2
> 8/ Check ACL on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:rw-
> group::r-x  #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL entries (user, group, and other -- except the POSIX
> owner -- ) with the group part of the dfs.umaskmode property when creating
> a directory with inherited ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode

2016-08-08 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-6962:
-
Attachment: test_plan.md

> ACLs inheritance conflict with umaskmode
> 
>
> Key: HDFS-6962
> URL: https://issues.apache.org/jira/browse/HDFS-6962
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: CentOS release 6.5 (Final)
>Reporter: LINTE
>Assignee: John Zhuge
>Priority: Critical
>  Labels: hadoop, security
> Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, 
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, 
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, 
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, 
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, run, 
> test_plan.md, unit_tests
>
>
> In hdfs-site.xml:
> <property>
>   <name>dfs.umaskmode</name>
>   <value>027</value>
> </property>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir  /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and
> user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ check ACLs /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- match the umaskmode defined in
> hdfs-site.xml, everything is OK!
> default:group:readwrite:rwx allows the readwrite group rwx access through
> inheritance.
> default:user:toto:rwx allows the toto user rwx access through inheritance.
> default:mask::rwx means the inheritance mask is rwx, so effectively no masking.
> 4/ Create a subdir to test inheritance of ACL
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs
> 5/ check ACLs /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has an rwx ACL but only r-x is
> effective because the mask is r-x (mask::r-x), even though the default mask
> for inheritance is set to default:mask::rwx on /tmp/ACLS/.
> 6/ Modify hdfs-site.xml and restart the namenode
> <property>
>   <name>dfs.umaskmode</name>
>   <value>010</value>
> </property>
> 7/ Create a subdir to test inheritance of ACL with new parameter umaskmode
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs2
> 8/ Check ACL on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:rw-
> group::r-x  #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL entries (user, group, and other -- except the POSIX
> owner -- ) with the group part of the dfs.umaskmode property when creating
> a directory with inherited ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode

2016-08-08 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-6962:
-
Attachment: test_plan.md

Uploaded {{test_plan.md}}.

> ACLs inheritance conflict with umaskmode
> 
>
> Key: HDFS-6962
> URL: https://issues.apache.org/jira/browse/HDFS-6962
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: CentOS release 6.5 (Final)
>Reporter: LINTE
>Assignee: John Zhuge
>Priority: Critical
>  Labels: hadoop, security
> Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, 
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, 
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, 
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, 
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, run, 
> test_plan.md, unit_tests
>
>
> In hdfs-site.xml:
> <property>
>   <name>dfs.umaskmode</name>
>   <value>027</value>
> </property>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir  /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and
> user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ check ACLs /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- match the umaskmode defined in
> hdfs-site.xml, everything is OK!
> default:group:readwrite:rwx allows the readwrite group rwx access through
> inheritance.
> default:user:toto:rwx allows the toto user rwx access through inheritance.
> default:mask::rwx means the inheritance mask is rwx, so effectively no masking.
> 4/ Create a subdir to test inheritance of ACL
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs
> 5/ check ACLs /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has an rwx ACL but only r-x is
> effective because the mask is r-x (mask::r-x), even though the default mask
> for inheritance is set to default:mask::rwx on /tmp/ACLS/.
> 6/ Modify hdfs-site.xml and restart the namenode
> <property>
>   <name>dfs.umaskmode</name>
>   <value>010</value>
> </property>
> 7/ Create a subdir to test inheritance of ACL with new parameter umaskmode
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs2
> 8/ Check ACL on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:rw-
> group::r-x  #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL entries (user, group, and other -- except the POSIX
> owner -- ) with the group part of the dfs.umaskmode property when creating
> a directory with inherited ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-6962) ACLs inheritance conflict with umaskmode

2016-08-08 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-6962:
-
Attachment: unit_tests

Uploaded script {{unit_tests}} that runs regression unit tests.

> ACLs inheritance conflict with umaskmode
> 
>
> Key: HDFS-6962
> URL: https://issues.apache.org/jira/browse/HDFS-6962
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: CentOS release 6.5 (Final)
>Reporter: LINTE
>Assignee: John Zhuge
>Priority: Critical
>  Labels: hadoop, security
> Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, 
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, 
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, 
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, 
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, run, 
> unit_tests
>
>
> In hdfs-site.xml:
> <property>
>   <name>dfs.umaskmode</name>
>   <value>027</value>
> </property>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir  /tmp/ACLS
> 2/ Set default ACLs on this directory: rwx access for group readwrite and
> user toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ check ACLs /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- match the umaskmode defined in
> hdfs-site.xml, everything is OK!
> default:group:readwrite:rwx allows the readwrite group rwx access through
> inheritance.
> default:user:toto:rwx allows the toto user rwx access through inheritance.
> default:mask::rwx means the inheritance mask is rwx, so effectively no masking.
> 4/ Create a subdir to test inheritance of ACL
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs
> 5/ check ACLs /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has rwx ACL bu only r-x is effective 
> because the mask is r-x (mask::r-x) in spite of default mask for inheritance 
> is set to default:mask::rwx on /tmp/ACLS/
> 6/ Modify hdfs-site.xml and restart the namenode
> <property>
>   <name>dfs.umaskmode</name>
>   <value>010</value>
> </property>
> 7/ Create a subdir to test inheritance of ACL with new parameter umaskmode
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs2
> 8/ Check ACL on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:rw-
> group::r-x  #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL values (user, group and other -- except the POSIX 
> owner) with the group bits of the dfs.umaskmode property when creating a 
> directory with inherited ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories

2016-08-08 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411428#comment-15411428
 ] 

John Zhuge commented on HDFS-10721:
---

You are probably not a contributor yet. Please send a request to 
hdfs-...@hadoop.apache.org.

All Hadoop mailing lists: https://hadoop.apache.org/mailing_lists.html.

Please read: https://wiki.apache.org/hadoop/HowToContribute.

> HDFS NFS Gateway - Exporting multiple Directories 
> --
>
> Key: HDFS-10721
> URL: https://issues.apache.org/jira/browse/HDFS-10721
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Senthilkumar
>Priority: Minor
>
> The current HDFS NFS gateway supports exporting only one directory. 
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user</value>
> </property>
> This property lets us export a particular directory.
> Code block:
> public RpcProgramMountd(NfsConfiguration config,
>   DatagramSocket registrationSocket, boolean allowInsecurePorts)
>   throws IOException {
> // Note that RPC cache is not enabled
> super("mountd", "localhost", config.getInt(
> NfsConfigKeys.DFS_NFS_MOUNTD_PORT_KEY,
> NfsConfigKeys.DFS_NFS_MOUNTD_PORT_DEFAULT), PROGRAM, VERSION_1,
> VERSION_3, registrationSocket, allowInsecurePorts);
> exports = new ArrayList();
>  exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> this.hostsMatcher = NfsExports.getInstance(config);
> this.mounts = Collections.synchronizedList(new ArrayList());
> UserGroupInformation.setConfiguration(config);
> SecurityUtil.login(config, NfsConfigKeys.DFS_NFS_KEYTAB_FILE_KEY,
> NfsConfigKeys.DFS_NFS_KERBEROS_PRINCIPAL_KEY);
> this.dfsClient = new DFSClient(NameNode.getAddress(config), config);
>   }
> Export list:
> exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> The current code supports exposing only one directory; based on our example, 
> only /user can be exported.
> Most production environments expect multiple directories to be exported, so 
> that different clients can mount different directories.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user,/data/web_crawler,/app-logs</value>
> </property>
> Here I have three directories to be exposed:
> 1) /user
> 2) /data/web_crawler
> 3) /app-logs
> This would help us mount directories for particular clients (say client A 
> wants to write data in /app-logs; the Hadoop admin can mount it and hand it 
> over to the client).
> Please advise here. Sorry if this feature is already implemented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories

2016-08-07 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411378#comment-15411378
 ] 

John Zhuge commented on HDFS-10721:
---

Can you click {{Assign to me}}?

> HDFS NFS Gateway - Exporting multiple Directories 
> --
>
> Key: HDFS-10721
> URL: https://issues.apache.org/jira/browse/HDFS-10721
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Senthilkumar
>Priority: Minor
>
> The current HDFS NFS gateway supports exporting only one directory. 
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user</value>
> </property>
> This property lets us export a particular directory.
> Code block:
> public RpcProgramMountd(NfsConfiguration config,
>   DatagramSocket registrationSocket, boolean allowInsecurePorts)
>   throws IOException {
> // Note that RPC cache is not enabled
> super("mountd", "localhost", config.getInt(
> NfsConfigKeys.DFS_NFS_MOUNTD_PORT_KEY,
> NfsConfigKeys.DFS_NFS_MOUNTD_PORT_DEFAULT), PROGRAM, VERSION_1,
> VERSION_3, registrationSocket, allowInsecurePorts);
> exports = new ArrayList();
>  exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> this.hostsMatcher = NfsExports.getInstance(config);
> this.mounts = Collections.synchronizedList(new ArrayList());
> UserGroupInformation.setConfiguration(config);
> SecurityUtil.login(config, NfsConfigKeys.DFS_NFS_KEYTAB_FILE_KEY,
> NfsConfigKeys.DFS_NFS_KERBEROS_PRINCIPAL_KEY);
> this.dfsClient = new DFSClient(NameNode.getAddress(config), config);
>   }
> Export list:
> exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> The current code supports exposing only one directory; based on our example, 
> only /user can be exported.
> Most production environments expect multiple directories to be exported, so 
> that different clients can mount different directories.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user,/data/web_crawler,/app-logs</value>
> </property>
> Here I have three directories to be exposed:
> 1) /user
> 2) /data/web_crawler
> 3) /app-logs
> This would help us mount directories for particular clients (say client A 
> wants to write data in /app-logs; the Hadoop admin can mount it and hand it 
> over to the client).
> Please advise here. Sorry if this feature is already implemented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-08-07 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Status: In Progress  (was: Patch Available)

Looking into the checkstyle and unit test errors.

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer doesn't want to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer 
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id 
> doesn't exist before entering this function; if it exists, then it is deleted 
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer 
> is
> // running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch(RemoteException e) {
>   
> if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories

2016-08-05 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409617#comment-15409617
 ] 

John Zhuge commented on HDFS-10721:
---

Sorry for the confusion: {{c_user}} is a new HDFS user with read-only access to 
{{/data}}, created specifically to provide a workaround. I should have named 
the user {{readonly_b_webapp}} :)

I do agree with you that an export table like that of a Unix NFSv3 or NFSv4 
server gives the admin more control. The export table should probably support 
an allowed-client list and export options per export point.
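
As a starting point, here is a minimal sketch (hypothetical, not committed code) of how {{RpcProgramMountd}} could accept a comma-separated {{nfs.export.point}}:
{code:java}
// Hypothetical sketch only: split nfs.export.point on commas so that
// several directories can be exported; the current code adds one entry.
String exportPoints = config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
    NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT);
exports = new ArrayList<String>();
for (String point : exportPoints.split(",")) {
  exports.add(point.trim()); // e.g. /user, /data/web_crawler, /app-logs
}
{code}
Mount requests would then be validated against every entry in {{exports}} rather than only the first one.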

> HDFS NFS Gateway - Exporting multiple Directories 
> --
>
> Key: HDFS-10721
> URL: https://issues.apache.org/jira/browse/HDFS-10721
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Senthilkumar
>Priority: Minor
>
> The current HDFS NFS gateway supports exporting only one directory. 
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user</value>
> </property>
> This property lets us export a particular directory.
> Code block:
> public RpcProgramMountd(NfsConfiguration config,
>   DatagramSocket registrationSocket, boolean allowInsecurePorts)
>   throws IOException {
> // Note that RPC cache is not enabled
> super("mountd", "localhost", config.getInt(
> NfsConfigKeys.DFS_NFS_MOUNTD_PORT_KEY,
> NfsConfigKeys.DFS_NFS_MOUNTD_PORT_DEFAULT), PROGRAM, VERSION_1,
> VERSION_3, registrationSocket, allowInsecurePorts);
> exports = new ArrayList();
>  exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> this.hostsMatcher = NfsExports.getInstance(config);
> this.mounts = Collections.synchronizedList(new ArrayList());
> UserGroupInformation.setConfiguration(config);
> SecurityUtil.login(config, NfsConfigKeys.DFS_NFS_KEYTAB_FILE_KEY,
> NfsConfigKeys.DFS_NFS_KERBEROS_PRINCIPAL_KEY);
> this.dfsClient = new DFSClient(NameNode.getAddress(config), config);
>   }
> Export list:
> exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> The current code supports exposing only one directory; based on our example, 
> only /user can be exported.
> Most production environments expect multiple directories to be exported, so 
> that different clients can mount different directories.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user,/data/web_crawler,/app-logs</value>
> </property>
> Here I have three directories to be exposed:
> 1) /user
> 2) /data/web_crawler
> 3) /app-logs
> This would help us mount directories for particular clients (say client A 
> wants to write data in /app-logs; the Hadoop admin can mount it and hand it 
> over to the client).
> Please advise here. Sorry if this feature is already implemented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories

2016-08-05 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409023#comment-15409023
 ] 

John Zhuge commented on HDFS-10721:
---

Yeah, the case seems pretty rare. How about adding a static ID mapping 
{{b_webapp => c_user}} to {{/etc/nfs.map}} on the NFS gateway host? Assume 
{{c_user}} only has read-only access to {{/data}} in HDFS.
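
For reference, the static mapping file maps numeric IDs; a minimal example (the UIDs below are made up for illustration) could look like:
{noformat}
# /etc/nfs.map -- illustrative only, the numeric IDs are made up
# Map NFS client user b_webapp (UID 1501) to HDFS user c_user (UID 2001)
uid 1501 2001
{noformat}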

> HDFS NFS Gateway - Exporting multiple Directories 
> --
>
> Key: HDFS-10721
> URL: https://issues.apache.org/jira/browse/HDFS-10721
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Senthilkumar
>Priority: Minor
>
> The current HDFS NFS gateway supports exporting only one directory. 
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user</value>
> </property>
> This property lets us export a particular directory.
> Code block:
> public RpcProgramMountd(NfsConfiguration config,
>   DatagramSocket registrationSocket, boolean allowInsecurePorts)
>   throws IOException {
> // Note that RPC cache is not enabled
> super("mountd", "localhost", config.getInt(
> NfsConfigKeys.DFS_NFS_MOUNTD_PORT_KEY,
> NfsConfigKeys.DFS_NFS_MOUNTD_PORT_DEFAULT), PROGRAM, VERSION_1,
> VERSION_3, registrationSocket, allowInsecurePorts);
> exports = new ArrayList();
>  exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> this.hostsMatcher = NfsExports.getInstance(config);
> this.mounts = Collections.synchronizedList(new ArrayList());
> UserGroupInformation.setConfiguration(config);
> SecurityUtil.login(config, NfsConfigKeys.DFS_NFS_KEYTAB_FILE_KEY,
> NfsConfigKeys.DFS_NFS_KERBEROS_PRINCIPAL_KEY);
> this.dfsClient = new DFSClient(NameNode.getAddress(config), config);
>   }
> Export list:
> exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> The current code supports exposing only one directory; based on our example, 
> only /user can be exported.
> Most production environments expect multiple directories to be exported, so 
> that different clients can mount different directories.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user,/data/web_crawler,/app-logs</value>
> </property>
> Here I have three directories to be exposed:
> 1) /user
> 2) /data/web_crawler
> 3) /app-logs
> This would help us mount directories for particular clients (say client A 
> wants to write data in /app-logs; the Hadoop admin can mount it and hand it 
> over to the client).
> Please advise here. Sorry if this feature is already implemented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10721) HDFS NFS Gateway - Exporting multiple Directories

2016-08-04 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408922#comment-15408922
 ] 

John Zhuge commented on HDFS-10721:
---

Hi [~senthilec566], thanks for reporting the use case and suggesting a solution.

Could you try this workaround?
* Use the default {{nfs.export.point}} which is {{/}}
* Set up HDFS user mappings. See 
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html#User_authentication_and_mapping.
* Set proper permissions on the HDFS directories and files to restrict the 
client's access (a minimal example follows below). See 
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html.
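
A minimal example of the last step (the paths and owners are made up for illustration):
{noformat}
# Illustrative only: restrict per-directory access for mapped NFS users
hdfs dfs -chown -R crawler:crawler /data/web_crawler
hdfs dfs -chmod -R 750 /data/web_crawler
hdfs dfs -chown -R applogs:applogs /app-logs
hdfs dfs -chmod -R 770 /app-logs
{noformat}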

> HDFS NFS Gateway - Exporting multiple Directories 
> --
>
> Key: HDFS-10721
> URL: https://issues.apache.org/jira/browse/HDFS-10721
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Senthilkumar
>Priority: Minor
>
> The current HDFS NFS gateway supports exporting only one directory. 
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user</value>
> </property>
> This property lets us export a particular directory.
> Code block:
> public RpcProgramMountd(NfsConfiguration config,
>   DatagramSocket registrationSocket, boolean allowInsecurePorts)
>   throws IOException {
> // Note that RPC cache is not enabled
> super("mountd", "localhost", config.getInt(
> NfsConfigKeys.DFS_NFS_MOUNTD_PORT_KEY,
> NfsConfigKeys.DFS_NFS_MOUNTD_PORT_DEFAULT), PROGRAM, VERSION_1,
> VERSION_3, registrationSocket, allowInsecurePorts);
> exports = new ArrayList();
>  exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> this.hostsMatcher = NfsExports.getInstance(config);
> this.mounts = Collections.synchronizedList(new ArrayList());
> UserGroupInformation.setConfiguration(config);
> SecurityUtil.login(config, NfsConfigKeys.DFS_NFS_KEYTAB_FILE_KEY,
> NfsConfigKeys.DFS_NFS_KERBEROS_PRINCIPAL_KEY);
> this.dfsClient = new DFSClient(NameNode.getAddress(config), config);
>   }
> Export list:
> exports.add(config.get(NfsConfigKeys.DFS_NFS_EXPORT_POINT_KEY,
> NfsConfigKeys.DFS_NFS_EXPORT_POINT_DEFAULT));
> The current code supports exposing only one directory; based on our example, 
> only /user can be exported.
> Most production environments expect multiple directories to be exported, so 
> that different clients can mount different directories.
> Example:
> <property>
>   <name>nfs.export.point</name>
>   <value>/user,/data/web_crawler,/app-logs</value>
> </property>
> Here I have three directories to be exposed:
> 1) /user
> 2) /data/web_crawler
> 3) /app-logs
> This would help us mount directories for particular clients (say client A 
> wants to write data in /app-logs; the Hadoop admin can mount it and hand it 
> over to the client).
> Please advise here. Sorry if this feature is already implemented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10683) Make class Token$PrivateToken private

2016-08-02 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404316#comment-15404316
 ] 

John Zhuge commented on HDFS-10683:
---

{{TestFileCorruption}} passes locally. The link in row {{unit}} is not valid 
(error 404).

> Make class Token$PrivateToken private
> -
>
> Key: HDFS-10683
> URL: https://issues.apache.org/jira/browse/HDFS-10683
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: fs, ha, security, security_token
> Attachments: HDFS-10683.001.patch
>
>
> Avoid {{instanceof}} checks or typecasting of {{Token.PrivateToken}} by 
> introducing an interface method in {{Token}}. Make class 
> {{Token.PrivateToken}} private. Use a factory method instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10683) Make class Token$PrivateToken private

2016-08-01 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10683:
--
   Labels: fs ha security security_token  (was: )
Affects Version/s: (was: 3.0.0-alpha2)
   2.9.0
 Target Version/s: 2.9.0, 3.0.0-alpha2
   Status: Patch Available  (was: Open)

> Make class Token$PrivateToken private
> -
>
> Key: HDFS-10683
> URL: https://issues.apache.org/jira/browse/HDFS-10683
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: fs, ha, security, security_token
> Attachments: HDFS-10683.001.patch
>
>
> Avoid {{instanceof}} checks or typecasting of {{Token.PrivateToken}} by 
> introducing an interface method in {{Token}}. Make class 
> {{Token.PrivateToken}} private. Use a factory method instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10683) Make class Token$PrivateToken private

2016-08-01 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10683:
--
Attachment: HDFS-10683.001.patch

Patch 001:
* Make class {{Token$PrivateToken}} private
* Add method {{Token#privateClone}} to create a private clone of a public token
* Replace all {{instanceof Token.PrivateToken}} checks and typecasts with 
polymorphic methods (see the sketch below)
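
A minimal, self-contained sketch of the approach (names follow the patch summary; the exact signatures in the patch may differ):
{code:java}
// Sketch only; the real class is org.apache.hadoop.security.token.Token.
class Token {
  private final String service;
  Token(String service) { this.service = service; }

  // Polymorphic check replaces "instanceof Token.PrivateToken" at call sites.
  public boolean isPrivate() { return false; }

  // Factory method replaces direct construction of the now-private subclass.
  public Token privateClone(String newService) {
    return new PrivateToken(newService);
  }

  // Private: callers can no longer reference or typecast to this class.
  private static final class PrivateToken extends Token {
    PrivateToken(String service) { super(service); }
    @Override public boolean isPrivate() { return true; }
  }
}
{code}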

> Make class Token$PrivateToken private
> -
>
> Key: HDFS-10683
> URL: https://issues.apache.org/jira/browse/HDFS-10683
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HDFS-10683.001.patch
>
>
> Avoid {{instanceof}} checks or typecasting of {{Token.PrivateToken}} by 
> introducing an interface method in {{Token}}. Make class 
> {{Token.PrivateToken}} private. Use a factory method instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10683) Make class Token$PrivateToken private

2016-08-01 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10683:
--
Summary: Make class Token$PrivateToken private  (was: Refactor class 
Token$PrivateToken)

> Make class Token$PrivateToken private
> -
>
> Key: HDFS-10683
> URL: https://issues.apache.org/jira/browse/HDFS-10683
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> Avoid {{instanceof}} checks or typecasting of {{Token.PrivateToken}} by 
> introducing an interface method in {{Token}}. Make class 
> {{Token.PrivateToken}} private. Use a factory method instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10683) Refactor class Token$PrivateToken

2016-07-31 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10683:
--
Summary: Refactor class Token$PrivateToken  (was: Refactor 
Token.PrivateToken)

> Refactor class Token$PrivateToken
> -
>
> Key: HDFS-10683
> URL: https://issues.apache.org/jira/browse/HDFS-10683
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> Avoid {{instanceof}} checks or typecasting of {{Token.PrivateToken}} by 
> introducing an interface method in {{Token}}. Make class 
> {{Token.PrivateToken}} private. Use a factory method instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-07-30 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Status: Patch Available  (was: In Progress)

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer doesn't want to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer 
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id 
> doesn't exist before entering this function; if it exists, then it is deleted 
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer 
> is
> // running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch(RemoteException e) {
>   
> if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10703) HA NameNode Web UI should show last checkpoint time

2016-07-30 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400795#comment-15400795
 ] 

John Zhuge commented on HDFS-10703:
---

Thanks [~yzhangal] for review and commit.

> HA NameNode Web UI should show last checkpoint time
> ---
>
> Key: HDFS-10703
> URL: https://issues.apache.org/jira/browse/HDFS-10703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0, 3.0.0-alpha1, 3.0
>
> Attachments: HDFS-10703.001.patch, NN Web UI with Patch 001.png
>
>
> After enabling HA, NameNode HA should show last checkpoint time in the Web UI 
> as the Secondary NameNode Web UI does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10703) HA NameNode Web UI should show last checkpoint time

2016-07-29 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400414#comment-15400414
 ] 

John Zhuge commented on HDFS-10703:
---

Tested the new field on a non-HA pseudo cluster and an HA cluster.

> HA NameNode Web UI should show last checkpoint time
> ---
>
> Key: HDFS-10703
> URL: https://issues.apache.org/jira/browse/HDFS-10703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-10703.001.patch, NN Web UI with Patch 001.png
>
>
> After enabling HA, NameNode HA should show last checkpoint time in the Web UI 
> as the Secondary NameNode Web UI does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10703) HA NameNode Web UI should show last checkpoint time

2016-07-29 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399776#comment-15399776
 ] 

John Zhuge commented on HDFS-10703:
---

Please note the new field on the NN Web UI is similar to the corresponding 
field on the SNN Web UI when HA is not enabled.

> HA NameNode Web UI should show last checkpoint time
> ---
>
> Key: HDFS-10703
> URL: https://issues.apache.org/jira/browse/HDFS-10703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HDFS-10703.001.patch, NN Web UI with Patch 001.png
>
>
> After enabling HA, NameNode HA should show last checkpoint time in the Web UI 
> as the Secondary NameNode Web UI does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10703) HA NameNode Web UI should show last checkpoint time

2016-07-29 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399731#comment-15399731
 ] 

John Zhuge commented on HDFS-10703:
---

Since it is a UI change, I attached a screenshot taken after the fix.

> HA NameNode Web UI should show last checkpoint time
> ---
>
> Key: HDFS-10703
> URL: https://issues.apache.org/jira/browse/HDFS-10703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HDFS-10703.001.patch, NN Web UI with Patch 001.png
>
>
> After enabling HA, NameNode HA should show last checkpoint time in the Web UI 
> as the Secondary NameNode Web UI does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10703) HA NameNode Web UI should show last checkpoint time

2016-07-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10703:
--
Attachment: NN Web UI with Patch 001.png

> HA NameNode Web UI should show last checkpoint time
> ---
>
> Key: HDFS-10703
> URL: https://issues.apache.org/jira/browse/HDFS-10703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HDFS-10703.001.patch, NN Web UI with Patch 001.png
>
>
> After enabling HA, NameNode HA should show last checkpoint time in the Web UI 
> as the Secondary NameNode Web UI does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10703) HA NameNode Web UI should show last checkpoint time

2016-07-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10703:
--
Status: Patch Available  (was: In Progress)

> HA NameNode Web UI should show last checkpoint time
> ---
>
> Key: HDFS-10703
> URL: https://issues.apache.org/jira/browse/HDFS-10703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HDFS-10703.001.patch
>
>
> After enabling HA, NameNode HA should show last checkpoint time in the Web UI 
> as the Secondary NameNode Web UI does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10703) HA NameNode Web UI should show last checkpoint time

2016-07-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10703:
--
Attachment: HDFS-10703.001.patch

Patch 001:
* Add a new field {{Last Checkpoint Time}} to NN Web UI
* Test it in a pseudo cluster

> HA NameNode Web UI should show last checkpoint time
> ---
>
> Key: HDFS-10703
> URL: https://issues.apache.org/jira/browse/HDFS-10703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HDFS-10703.001.patch
>
>
> After enabling HA, NameNode HA should show last checkpoint time in the Web UI 
> as the Secondary NameNode Web UI does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-10703) HA NameNode Web UI should show last checkpoint time

2016-07-29 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-10703 started by John Zhuge.
-
> HA NameNode Web UI should show last checkpoint time
> ---
>
> Key: HDFS-10703
> URL: https://issues.apache.org/jira/browse/HDFS-10703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> After enabling HA, NameNode HA should show last checkpoint time in the Web UI 
> as the Secondary NameNode Web UI does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10703) HA NameNode Web UI should show last checkpoint time

2016-07-29 Thread John Zhuge (JIRA)
John Zhuge created HDFS-10703:
-

 Summary: HA NameNode Web UI should show last checkpoint time
 Key: HDFS-10703
 URL: https://issues.apache.org/jira/browse/HDFS-10703
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ui
Affects Versions: 2.6.0
Reporter: John Zhuge
Assignee: John Zhuge
Priority: Minor


After enabling HA, NameNode HA should show last checkpoint time in the Web UI 
as the Secondary NameNode Web UI does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2016-07-28 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398487#comment-15398487
 ] 

John Zhuge commented on HDFS-9276:
--

Thanks [~xiaochen]!

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, 
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch, 
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch, 
> HDFS-9276.18.patch, HDFS-9276.19.patch, HDFS-9276.20.patch, 
> HDFSReadLoop.scala, debug1.PNG, debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications. 
> The HDFS client will generate private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens will not be updated, which 
> will cause the tokens to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration to Name Node:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationH

[jira] [Updated] (HDFS-10684) WebHDFS DataNode calls fail when boolean parameters not provided

2016-07-28 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10684:
--
Summary: WebHDFS DataNode calls fail when boolean parameters not provided  
(was: WebHDFS calls fail when boolean parameters not provided)

> WebHDFS DataNode calls fail when boolean parameters not provided
> 
>
> Key: HDFS-10684
> URL: https://issues.apache.org/jira/browse/HDFS-10684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Samuel Low
>Assignee: John Zhuge
>
> Optional boolean parameters that are not provided in the URL cause the 
> WebHDFS create file command to fail.
> curl -i -X PUT 
> "http://hadoop-primarynamenode:50070/webhdfs/v1/tmp/test1234?op=CREATE&overwrite=false";
> Response:
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Expires: Fri, 15 Jul 2016 04:10:13 GMT
> Date: Fri, 15 Jul 2016 04:10:13 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Location: 
> http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false
> Content-Length: 0
> Server: Jetty(6.1.26)
> Following the redirect:
> curl -i -X PUT -T MYFILE 
> "http://hadoop-datanode1:50075/webhdfs/v1/tmp/test1234?op=CREATE&namenoderpcaddress=hadoop-primarynamenode:8020&overwrite=false";
> Response:
> HTTP/1.1 100 Continue
> HTTP/1.1 400 Bad Request
> Content-Type: application/json; charset=utf-8
> Content-Length: 162
> Connection: close
> 
> {"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Failed
>  to parse \"null\" to Boolean."}}
> The problem can be circumvented by providing both "createparent" and 
> "overwrite" parameters.
> However, this is not possible when I have no control over the WebHDFS calls, 
> e.g. Ambari and Hue have errors due to this.
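
A minimal sketch (hypothetical helper, not the actual WebHDFS parameter code) of the defensive parsing the DataNode side needs:
{code:java}
// Hypothetical helper: fall back to the documented default rather than
// failing when a boolean query parameter is omitted from the URL.
static boolean parseBooleanParam(String value, boolean defaultValue) {
  if (value == null || value.isEmpty()) {
    return defaultValue; // parameter not provided
  }
  if (!"true".equalsIgnoreCase(value) && !"false".equalsIgnoreCase(value)) {
    throw new IllegalArgumentException(
        "Failed to parse \"" + value + "\" to Boolean.");
  }
  return Boolean.parseBoolean(value);
}
{code}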



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10650) DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory permission

2016-07-28 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398140#comment-15398140
 ] 

John Zhuge commented on HDFS-10650:
---

Thanks [~xiaochen] for review and commit. Thanks [~cnauroth] for review.

> DFSClient#mkdirs and DFSClient#primitiveMkdir should use default directory 
> permission
> -
>
> Key: HDFS-10650
> URL: https://issues.apache.org/jira/browse/HDFS-10650
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10650.001.patch, HDFS-10650.002.patch
>
>
> These 2 DFSClient methods should use default directory permission to create a 
> directory.
> {code:java}
>   public boolean mkdirs(String src, FsPermission permission,
>   boolean createParent) throws IOException {
> if (permission == null) {
>   permission = FsPermission.getDefault();
> }
> {code}
> {code:java}
>   public boolean primitiveMkdir(String src, FsPermission absPermission, 
> boolean createParent)
> throws IOException {
> checkOpen();
> if (absPermission == null) {
>   absPermission = 
> FsPermission.getDefault().applyUMask(dfsClientConf.uMask);
> } 
> {code}
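
Per the issue title, a sketch of the intended change, assuming {{FsPermission.getDirDefault()}} (which exists for exactly this purpose):
{code:java}
// Sketch of the intended fix: fall back to the directory default
// permission instead of the generic default.
if (permission == null) {
  permission = FsPermission.getDirDefault(); // 0777, before umask
}
{code}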



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-07-28 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Attachment: HDFS-8897.003.patch

Patch 003:
* Fix checkstyle
* Only sanitize defaultUri for the hdfs scheme (see the sketch below)
* Move new test code into a separate method because the original method is too 
long
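
A minimal sketch of the sanitization described above (hypothetical helper; the actual patch may differ):
{code:java}
import java.net.URI;

// Hypothetical helper: strip a trailing slash from an HA fs.defaultFS so
// hdfs://sandbox/ and hdfs://sandbox resolve to the same namenode.
class DefaultFsSanitizer {
  static URI sanitize(URI defaultUri) {
    String s = defaultUri.toString();
    if ("hdfs".equals(defaultUri.getScheme()) && s.endsWith("/")) {
      return URI.create(s.substring(0, s.length() - 1));
    }
    return defaultUri;
  }
}
{code}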

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, 
> HDFS-8897.003.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer doesn't want to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer 
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id 
> doesn't exist before entering this function; if it exists, then it is deleted 
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer 
> is
> // running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch(RemoteException e) {
>   
> if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-07-28 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15397791#comment-15397791
 ] 

John Zhuge edited comment on HDFS-8897 at 7/28/16 4:43 PM:
---

Fixing checkstyle and unit test failures.


was (Author: jzhuge):
Fixing checkstyle and TestBalancerWithMultipleNameNodes failure.

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> When the file doesn't exist, the balancer doesn't want to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer 
> creates /system/balancer.id and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id 
> doesn't exist before entering this function; if it exists, then it is deleted 
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer
> // is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-07-28 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Status: In Progress  (was: Patch Available)

Fixing the checkstyle issues and the TestBalancerWithMultipleNameNodes failure.

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it does exist, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer
> // is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-07-28 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Attachment: HDFS-8897.002.patch

Patch 002:
* Fix {{DFSUtil#getNameServiceUris}} to sanitize the {{fs.defaultFS}} property (see the sketch below)
* Update a unit test in {{TestDFSUtil}}
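
For illustration, a minimal sketch of the kind of URI normalization involved,
assuming a hypothetical helper (the actual patch changes
{{DFSUtil#getNameServiceUris}} itself):

{code}
import java.net.URI;
import java.net.URISyntaxException;

public final class DefaultFsSanitizer {
  private DefaultFsSanitizer() {}

  // Hypothetical helper: strip trailing slashes from the path component so
  // that "hdfs://sandbox/" compares equal to "hdfs://sandbox".
  static URI trimTrailingSlashes(URI uri) throws URISyntaxException {
    String path = uri.getPath();
    while (path != null && path.endsWith("/")) {
      path = path.substring(0, path.length() - 1);
    }
    return new URI(uri.getScheme(), uri.getAuthority(), path,
        uri.getQuery(), uri.getFragment());
  }
}
{code}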


> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it does exist, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer
> // is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-07-28 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15397282#comment-15397282
 ] 

John Zhuge commented on HDFS-8897:
--

Another criterion for hitting this issue is that the property
{{dfs.namenode.rpc-address}} must not be set; the sketch below shows why.
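
A simplified stand-in for the URI-collection logic (a sketch of the behavior,
not the actual {{DFSUtil}} code): when the RPC address is present it wins, and
the {{fs.defaultFS}} value with the trailing slash is never consulted.

{code}
import java.net.URI;
import java.util.Collection;
import java.util.HashSet;
import org.apache.hadoop.conf.Configuration;

public final class NamenodeUriSketch {
  // Sketch only: prefer dfs.namenode.rpc-address when set, otherwise fall
  // back to fs.defaultFS -- which is where the trailing slash sneaks in.
  static Collection<URI> collectNamenodeUris(Configuration conf) {
    Collection<URI> uris = new HashSet<>();
    String rpcAddr = conf.getTrimmed("dfs.namenode.rpc-address");
    if (rpcAddr != null && !rpcAddr.isEmpty()) {
      uris.add(URI.create("hdfs://" + rpcAddr));
    } else {
      uris.add(URI.create(conf.getTrimmed("fs.defaultFS", "file:///")));
    }
    return uris;
  }
}
{code}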

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it does exist, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer
> // is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA

2016-07-28 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Summary: Balancer should handle fs.defaultFS trailing slash in HA  (was: 
Balancer should handle fs.defaultFS with trailing slashes in HA)

> Balancer should handle fs.defaultFS trailing slash in HA
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it does exist, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer
> // is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-6962) ACLs inheritance conflict with umaskmode

2016-07-27 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15396977#comment-15396977
 ] 

John Zhuge commented on HDFS-6962:
--

[~cnauroth] and [~eddyxu], have you had a chance to look at patch 009? I believe
all major review issues are resolved.

I plan to run all Hadoop unit tests twice: once with the flag off and once with
the flag on.

> ACLs inheritance conflict with umaskmode
> 
>
> Key: HDFS-6962
> URL: https://issues.apache.org/jira/browse/HDFS-6962
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: CentOS release 6.5 (Final)
>Reporter: LINTE
>Assignee: John Zhuge
>Priority: Critical
>  Labels: hadoop, security
> Attachments: HDFS-6962.001.patch, HDFS-6962.002.patch, 
> HDFS-6962.003.patch, HDFS-6962.004.patch, HDFS-6962.005.patch, 
> HDFS-6962.006.patch, HDFS-6962.007.patch, HDFS-6962.008.patch, 
> HDFS-6962.009.patch, HDFS-6962.1.patch, disabled_new_client.log, 
> disabled_old_client.log, enabled_new_client.log, enabled_old_client.log, run
>
>
> In hdfs-site.xml:
> <property>
>   <name>dfs.umaskmode</name>
>   <value>027</value>
> </property>
> 1/ Create a directory as superuser
> bash# hdfs dfs -mkdir  /tmp/ACLS
> 2/ set default ACLs on this directory rwx access for group readwrite and user 
> toto
> bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
> bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
> 3/ check ACLs /tmp/ACLS/
> bash# hdfs dfs -getfacl /tmp/ACLS/
> # file: /tmp/ACLS
> # owner: hdfs
> # group: hadoop
> user::rwx
> group::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> user::rwx | group::r-x | other::--- matches the umaskmode defined in
> hdfs-site.xml, everything is OK!
> default:user:toto:rwx allows the toto user rwx access through inheritance.
> default:group:readwrite:rwx allows the readwrite group rwx access through
> inheritance.
> default:mask::rwx means the inheritance mask is rwx, so no masking
> 4/ Create a subdir to test inheritance of ACL
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs
> 5/ check ACLs /tmp/ACLS/hdfs
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
> # file: /tmp/ACLS/hdfs
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:r-x
> group::r-x
> group:readwrite:rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> Here we can see that the readwrite group has the rwx ACL but only r-x is
> effective because the mask is r-x (mask::r-x), even though the default mask
> for inheritance is set to default:mask::rwx on /tmp/ACLS
> 6/ Modify hdfs-site.xml and restart the namenode:
> <property>
>   <name>dfs.umaskmode</name>
>   <value>010</value>
> </property>
> 7/ Create a subdir to test inheritance of ACL with new parameter umaskmode
> bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs2
> 8/ Check ACL on /tmp/ACLS/hdfs2
> bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
> # file: /tmp/ACLS/hdfs2
> # owner: hdfs
> # group: hadoop
> user::rwx
> user:toto:rwx   #effective:rw-
> group::r-x  #effective:r--
> group:readwrite:rwx #effective:rw-
> mask::rw-
> other::---
> default:user::rwx
> default:user:toto:rwx
> default:group::r-x
> default:group:readwrite:rwx
> default:mask::rwx
> default:other::---
> So HDFS masks the ACL values (user, group and other, except the POSIX
> owner) with the group part of the dfs.umaskmode property when creating a
> directory with inherited ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2016-07-27 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395750#comment-15395750
 ] 

John Zhuge commented on HDFS-9276:
--

Partially related. The patch fixes an HDFS delegation token renewal bug in HA mode.

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, 
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch, 
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch, 
> HDFS-9276.18.patch, HDFS-9276.19.patch, HDFS-9276.20.patch, 
> HDFSReadLoop.scala, debug1.PNG, debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications.
> The HDFS client will generate private tokens for each NameNode. When we update
> the HDFS Delegation Token, these private tokens will not be updated, which
> will cause the tokens to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration on the NameNode:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.

[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS with trailing slashes in HA

2016-07-26 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Summary: Balancer should handle fs.defaultFS with trailing slashes in HA  
(was: Balancer should handle fs.defaultFS with trailing slashes)

> Balancer should handle fs.defaultFS with trailing slashes in HA
> ---
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it does exist, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer
> // is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS with trailing slashes

2016-07-26 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394912#comment-15394912
 ] 

John Zhuge commented on HDFS-8897:
--

[~jojochuang] Thanks for the review.

bq. I played with the test case a bit, but it did not reproduce the exact 
symptom. (i.e. showing two namenodes in the log like

The added section in the test case only makes sure the {{namenodes}} collection
passed to {{Balancer.run}} has the same number of entries as in the normal test.
If we add code to start the balancer with that {{namenodes}} collection, you
will see {{Balancer: namenodes = [hdfs://sandbox/, hdfs://sandbox]}} without the
fix. If you think that is necessary, we can add it.

bq. if fs.defaultFS is hdfs://sandbox/abc

This case is tricky: {{fs.defaultFS}} is valid despite the trailing slash, and
everything except the Balancer works fine, otherwise users would have detected
the problem much earlier in other ways. A hypothetical test fragment for the
simple trailing-slash case is sketched below.
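
Illustrative only, not the patch's test; it assumes the fix makes the two
spellings collapse to one entry, and the HA wiring is elided:

{code}
import java.net.URI;
import java.util.Collection;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DFSUtil;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class TestDefaultFsTrailingSlash {
  @Test
  public void trailingSlashCollapsesToOneUri() {
    Configuration conf = new HdfsConfiguration();
    // HA wiring elided: a real test would also configure
    // dfs.ha.namenodes.sandbox and the per-namenode RPC addresses.
    conf.set("dfs.nameservices", "sandbox");
    conf.set(FileSystem.FS_DEFAULT_NAME_KEY, "hdfs://sandbox/");
    // Unfixed, hdfs://sandbox (from the nameservice) and hdfs://sandbox/
    // (from fs.defaultFS) are distinct; with the fix they should collapse.
    Collection<URI> namenodes = DFSUtil.getNameServiceUris(conf);
    assertEquals(1, namenodes.size());
  }
}
{code}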

> Balancer should handle fs.defaultFS with trailing slashes
> -
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it does exist, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer
> // is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   fina

[jira] [Updated] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2016-07-26 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-9276:
-
Attachment: HDFS-9276.20.patch

Patch 20:
* Small change in {{Credentials.addToken}} (a usage sketch follows below)

Thanks [~xiaochen].
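
For context, the refresh step from the reproduction program, rendered as a
minimal Java sketch; {{Credentials.addToken}} is what ultimately stores the
merged tokens (this sketches the usage, not the patch's change):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;

public final class TokenRefreshSketch {
  // Fetch fresh HDFS delegation tokens and merge them into the current
  // user's credentials, as the reproduction program does once a minute.
  static void refresh(Configuration conf) throws Exception {
    Credentials fresh = new Credentials();
    FileSystem fs = FileSystem.get(conf);
    fs.addDelegationTokens("test", fresh);
    UserGroupInformation.getCurrentUser().addCredentials(fresh);
  }
}
{code}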

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, 
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch, 
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch, 
> HDFS-9276.18.patch, HDFS-9276.19.patch, HDFS-9276.20.patch, 
> HDFSReadLoop.scala, debug1.PNG, debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications.
> The HDFS client will generate private tokens for each NameNode. When we update
> the HDFS Delegation Token, these private tokens will not be updated, which
> will cause the tokens to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration on the NameNode:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke

[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS with trailing slashes

2016-07-26 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Attachment: HDFS-8897.001.patch

Patch 001:
* Remove the trailing slashes from defaultUri (see the sketch below)
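
A minimal sketch of the idea, assuming the normalization happens on the raw
{{fs.defaultFS}} string before it becomes a URI (illustrative only; a real
implementation must not trim past the authority, e.g. a bare {{hdfs://}}
would need special handling):

{code}
public final class TrailingSlash {
  // Sketch only: strip trailing slashes so "hdfs://sandbox/" and
  // "hdfs://sandbox" resolve to the same default URI.
  static String trim(String defaultFs) {
    int end = defaultFs.length();
    while (end > 0 && defaultFs.charAt(end - 1) == '/') {
      end--;
    }
    return defaultFs.substring(0, end);
  }
}
{code}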

> Balancer should handle fs.defaultFS with trailing slashes
> -
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it does exist, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer
> // is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS with trailing slashes

2016-07-26 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Target Version/s: 3.0.0-alpha2
  Status: Patch Available  (was: In Progress)

> Balancer should handle fs.defaultFS with trailing slashes
> -
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
> Attachments: HDFS-8897.001.patch
>
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it does exist, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer
> // is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch (RemoteException e) {
>   if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS with trailing slashes

2016-07-26 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Summary: Balancer should handle fs.defaultFS with trailing slashes  (was: 
Balancer should remove fs.defaultFS trailing slashes)

> Balancer should handle fs.defaultFS with trailing slashes
> -
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java 
> The function checkAndMarkRunning returns null even if /system/balancer.id
> doesn't exist before entering this function; if it does exist, it is deleted
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
> try {
>   if (fs.exists(idPath)) {
> // try appending to it so that it will fail fast if another balancer
> // is running.
> IOUtils.closeStream(fs.append(idPath));
> fs.delete(idPath, true);
>   }
>   final FSDataOutputStream fsout = fs.create(idPath, false);
>   // mark balancer idPath to be deleted during filesystem closure
>   fs.deleteOnExit(idPath);
>   if (write2IdFile) {
> fsout.writeBytes(InetAddress.getLocalHost().getHostName());
> fsout.hflush();
>   }
>   return fsout;
> } catch(RemoteException e) {
>   
> if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){
> return null;
>   } else {
> throw e;
>   }
> }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-8897) Balancer should remove fs.defaultFS trailing slashes

2016-07-26 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Summary: Balancer should remove fs.defaultFS trailing slashes  (was: 
Balancer should remove trailing slashes in fs.defaultFS)

> Balancer should remove fs.defaultFS trailing slashes
> 
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
>
> When balancer is launched, it should test if there is already a 
> /system/balancer.id file in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log while trying to run the balancer, the balancer 
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java. 
> The function checkAndMarkRunning returns null even if /system/balancer.id 
> did not exist before entering the function; if it does exist, it is deleted 
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
>     try {
>       if (fs.exists(idPath)) {
>         // try appending to it so that it will fail fast if another
>         // balancer is running.
>         IOUtils.closeStream(fs.append(idPath));
>         fs.delete(idPath, true);
>       }
>       final FSDataOutputStream fsout = fs.create(idPath, false);
>       // mark balancer idPath to be deleted during filesystem closure
>       fs.deleteOnExit(idPath);
>       if (write2IdFile) {
>         fsout.writeBytes(InetAddress.getLocalHost().getHostName());
>         fsout.hflush();
>       }
>       return fsout;
>     } catch (RemoteException e) {
>       if (AlreadyBeingCreatedException.class.getName()
>           .equals(e.getClassName())) {
>         return null;
>       } else {
>         throw e;
>       }
>     }
>   }
> 
> Regards
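The namenodes line above shows the same logical namenode twice, differing only 
by a trailing slash ({{[hdfs://sandbox/, hdfs://sandbox]}}), so the second 
NameNodeConnector fails to create the {{/system/balancer.id}} file that the 
first connector just created. A minimal sketch of the normalization the new 
summary suggests, using a hypothetical helper rather than the actual patch:
{code}
import java.net.URI;

// Sketch only, not the actual HDFS-8897 patch: strip trailing slashes from
// fs.defaultFS so "hdfs://sandbox/" and "hdfs://sandbox" resolve to the same
// namenode URI instead of being treated as two distinct namenodes.
public final class DefaultFsNormalizer {
  static URI normalize(String defaultFs) {
    String trimmed = defaultFs;
    // Keep the "//" of the scheme separator intact.
    while (trimmed.endsWith("/") && !trimmed.endsWith("://")) {
      trimmed = trimmed.substring(0, trimmed.length() - 1);
    }
    return URI.create(trimmed);
  }

  public static void main(String[] args) {
    // Prints "true": both spellings normalize to hdfs://sandbox.
    System.out.println(normalize("hdfs://sandbox/")
        .equals(normalize("hdfs://sandbox")));
  }
}
{code}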



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-8897) Balancer should remove trailing slashes in fs.defaultFS

2016-07-26 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-8897:
-
Summary: Balancer should remove trailing slashes in fs.defaultFS  (was: 
Loadbalancer always exits with : java.io.IOException: Another Balancer is 
running..  Exiting ...)

> Balancer should remove trailing slashes in fs.defaultFS
> ---
>
> Key: HDFS-8897
> URL: https://issues.apache.org/jira/browse/HDFS-8897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.7.1
> Environment: Centos 6.6
>Reporter: LINTE
>Assignee: John Zhuge
>
> When the balancer is launched, it should test whether a /system/balancer.id 
> file already exists in HDFS.
> Even when the file doesn't exist, the balancer refuses to run:
> 15/08/14 16:35:12 INFO balancer.Balancer: namenodes  = [hdfs://sandbox/, 
> hdfs://sandbox]
> 15/08/14 16:35:12 INFO balancer.Balancer: parameters = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration 
> = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys
> 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> java.io.IOException: Another Balancer is running..  Exiting ...
> Aug 14, 2015 4:35:14 PM  Balancing took 2.408 seconds
> Looking at the audit log while trying to run the balancer, the balancer 
> creates /system/balancer.id and then deletes it on exiting:
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=create  
> src=/system/balancer.id dst=nullperm=hdfs:hadoop:rw-r-  
> proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=getfileinfo 
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true   
> ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x   cmd=delete  
> src=/system/balancer.id dst=nullperm=null   proto=rpc
> The error seems to be located in 
> org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java. 
> The function checkAndMarkRunning returns null even if /system/balancer.id 
> did not exist before entering the function; if it does exist, it is deleted 
> and the balancer exits with the same error.
> 
>   private OutputStream checkAndMarkRunning() throws IOException {
>     try {
>       if (fs.exists(idPath)) {
>         // try appending to it so that it will fail fast if another
>         // balancer is running.
>         IOUtils.closeStream(fs.append(idPath));
>         fs.delete(idPath, true);
>       }
>       final FSDataOutputStream fsout = fs.create(idPath, false);
>       // mark balancer idPath to be deleted during filesystem closure
>       fs.deleteOnExit(idPath);
>       if (write2IdFile) {
>         fsout.writeBytes(InetAddress.getLocalHost().getHostName());
>         fsout.hflush();
>       }
>       return fsout;
>     } catch (RemoteException e) {
>       if (AlreadyBeingCreatedException.class.getName()
>           .equals(e.getClassName())) {
>         return null;
>       } else {
>         throw e;
>       }
>     }
>   }
> 
> Regards



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2016-07-26 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394159#comment-15394159
 ] 

John Zhuge commented on HDFS-9276:
--

The TestBalancer and TestHttpServerLifecycle failures are unrelated; they pass 
on my local host.

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, 
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch, 
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch, 
> HDFS-9276.18.patch, HDFS-9276.19.patch, HDFSReadLoop.scala, debug1.PNG, 
> debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications. 
> The HDFS client will generate private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens will not be updated, which 
> will cause the tokens to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration on the NameNode:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.Retry

[jira] [Updated] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2016-07-25 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-9276:
-
Attachment: HDFS-9276.19.patch

Patch 19:
* Fix the unit test bug
* Bring back the {{publicService}}-related code

Passed unit tests.
Passed the {{HDFSReadLoop}} Spark test.
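For context, a minimal sketch of the refresh idea, not the actual patch: when 
the token for the logical HA service is renewed, the per-NameNode private 
copies must be re-derived from it. The helper below is hypothetical and only 
assumes the standard {{Credentials}}/{{Token}} copy and service-rewriting APIs.
{code}
import java.net.InetSocketAddress;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.security.token.Token;

// Sketch only: after the public token for the logical HA service is updated,
// clone it once per physical NameNode address so the private per-NN tokens
// are refreshed as well.
public final class HaTokenRefresher {
  @SuppressWarnings({"unchecked", "rawtypes"})
  static void refreshPrivateTokens(Credentials creds, Text logicalService,
      List<InetSocketAddress> nnAddrs) {
    Token publicToken = creds.getToken(logicalService);
    if (publicToken == null) {
      return;  // no public token to propagate
    }
    for (InetSocketAddress nn : nnAddrs) {
      Token privateToken = new Token(publicToken);     // copy constructor
      SecurityUtil.setTokenService(privateToken, nn);  // point at physical NN
      creds.addToken(privateToken.getService(), privateToken);
    }
  }
}
{code}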

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, 
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch, 
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch, 
> HDFS-9276.18.patch, HDFS-9276.19.patch, HDFSReadLoop.scala, debug1.PNG, 
> debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications. 
> The HDFS client will generate private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens will not be updated, which 
> will cause the tokens to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration on the NameNode:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.

[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2016-07-25 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393181#comment-15393181
 ] 

John Zhuge commented on HDFS-9276:
--

Because of the broken unit test, the changes in patches 17 and 18 were not 
properly validated. It looks like {{publicService}} is needed for 
{{PrivateToken}} because it really does need two services: one for lookup and 
one for refreshing private tokens.

Learning from the mistake, I will test the next patch against my Spark program 
{{HDFSReadLoop.scala}} in addition to the unit test.
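Conceptually, a private token then carries two service names, sketched here 
with hypothetical field names:
{code}
import org.apache.hadoop.io.Text;

// Sketch only, hypothetical names: the physical service is the lookup key
// used by the RPC layer, while the public (logical) service lets a renewed
// public token find and refresh this private copy.
class PrivateTokenInfo {
  final Text service;        // e.g. "x.x.x.x:8020"    (lookup)
  final Text publicService;  // e.g. "ha-hdfs:sandbox" (refresh)

  PrivateTokenInfo(Text service, Text publicService) {
    this.service = service;
    this.publicService = publicService;
  }
}
{code}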

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, 
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch, 
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch, 
> HDFS-9276.18.patch, HDFSReadLoop.scala, debug1.PNG, debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications. 
> The HDFS client will generate private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens will not be updated, which 
> will cause the tokens to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration on the NameNode:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorIm

[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2016-07-25 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393157#comment-15393157
 ] 

John Zhuge commented on HDFS-9276:
--

[~xiaochen], Thanks for the catch on the unit test! Patch 17 introduced the 
regression.

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, 
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch, 
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch, 
> HDFS-9276.18.patch, HDFSReadLoop.scala, debug1.PNG, debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications. 
> The HDFS client will generate private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens will not be updated, which 
> will cause the tokens to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration on the NameNode:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod

[jira] [Commented] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-25 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393046#comment-15393046
 ] 

John Zhuge commented on HDFS-10620:
---

Hi [~ajisakaa], when you merged the fix to branch-2 and branch-2.8, you 
replaced "&&" with "&":
{code}
  private void addToInvalidates(Block b) {
...
if (datanodes != null & datanodes.length() != 0) {
  blockLog.debug("BLOCK* addToInvalidates: {} {}", b, datanodes);
}
  }
{code}
This resulted in {{TestFailoverWithBlockTokensEnabled}} test failures.
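With the non-short-circuit {{&}}, {{datanodes.length()}} is evaluated even when 
{{datanodes}} is null, which throws a NullPointerException. The intended 
short-circuit check is presumably:
{code}
if (datanodes != null && datanodes.length() != 0) {
  blockLog.debug("BLOCK* addToInvalidates: {} {}", b, datanodes);
}
{code}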

> StringBuilder created and appended even if logging is disabled
> --
>
> Key: HDFS-10620
> URL: https://issues.apache.org/jira/browse/HDFS-10620
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.4
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10620.001.patch, HDFS-10620.002.patch
>
>
> In BlockManager.addToInvalidates the StringBuilder is appended to during the 
> delete even if logging isn't active.
> Could avoid allocating the StringBuilder as well, but not sure if it is 
> really worth it to add null handling in the code.
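One common way to avoid the allocation entirely is to guard it behind the log 
level; a minimal sketch, assuming an SLF4J-style logger like the {{blockLog}} 
shown above:
{code}
// Sketch only: guard the StringBuilder behind the log level so nothing is
// allocated or appended when debug logging is disabled.
if (blockLog.isDebugEnabled()) {
  StringBuilder datanodes = new StringBuilder();
  // ... append each datanode holding a replica of block b (elided) ...
  if (datanodes.length() != 0) {
    blockLog.debug("BLOCK* addToInvalidates: {} {}", b, datanodes);
  }
}
{code}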



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2016-07-25 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392940#comment-15392940
 ] 

John Zhuge commented on HDFS-9276:
--

The test error in {{TestBalancerWithSaslDataTransfer.testBalancer0Integrity}} is 
not related.

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, 
> HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch, 
> HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch, 
> HDFS-9276.18.patch, HDFSReadLoop.scala, debug1.PNG, debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications. 
> The HDFS client will generate private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens will not be updated, which 
> will cause the tokens to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration on the NameNode:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(
