[jira] [Commented] (HDFS-15923) RBF: Authentication failed when rename accross sub clusters

2021-05-05 Thread zhuobin zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340011#comment-17340011
 ] 

zhuobin zheng commented on HDFS-15923:
--

Hi, [~elgoiri] deleted thread.sleep(100) which is not needed at all.  (submit 
v03)
Please review again, thanks!

> RBF:  Authentication failed when rename accross sub clusters
> 
>
> Key: HDFS-15923
> URL: https://issues.apache.org/jira/browse/HDFS-15923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
>  Labels: RBF, pull-request-available, rename
> Attachments: HDFS-15923.001.patch, HDFS-15923.002.patch, 
> HDFS-15923.003.patch, HDFS-15923.stack-trace, 
> hdfs-15923-fix-security-issue.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Rename accross subcluster with RBF and Kerberos environment. Will encounter 
> the following two errors:
>  # Save Object to journal.
>  # Precheck try to get src file status
> So, we need use Router Login UGI doAs create DistcpProcedure and 
> TrashProcedure and submit Job.
>  
> Beside, we should check user permission for src and dst path in router side 
> before do rename internal. (HDFS-15973)
> First: Save Object to journal.
> {code:java}
> // code placeholder
> 2021-03-23 14:01:16,233 WARN org.apache.hadoop.ipc.Client: Exception 
> encountered while connecting to the server 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:408)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:622)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:413)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:822)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:818)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
> at org.apache.hadoop.ipc.Client.call(Client.java:1452)
> at org.apache.hadoop.ipc.Client.call(Client.java:1405)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
> at com.sun.proxy.$Proxy11.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:376)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy12.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:277)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1240)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1219)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1201)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1139)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:533)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:530)
> at 
> 

[jira] [Commented] (HDFS-15923) RBF: Authentication failed when rename accross sub clusters

2021-05-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339998#comment-17339998
 ] 

Hadoop QA commented on HDFS-15923:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 33m 
48s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 2 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
15s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 12s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 18m 
53s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  1m 
13s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
33s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green}{color} | {color:green} The patch has no ill-formed 
XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  6s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green}{color} | {color:green} the patch passed with JDK 

[jira] [Updated] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router

2021-05-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-16008:
---
Status: Patch Available  (was: Open)

> RBF: Tool to initialize ViewFS Mapping to Router
> 
>
> Key: HDFS-16008
> URL: https://issues.apache.org/jira/browse/HDFS-16008
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>
> This is a tool for initializing ViewFS Mapping to Router.
> Some companies are currently migrating from viewfs to router, I think they 
> need this tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router

2021-05-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-16008:
---
Description: 
This is a tool for initializing ViewFS Mapping to Router.
Some companies are currently migrating from viewfs to router, I think they need 
this tool.

  was:This is a tool for initializing ViewFS Mapping to Router.Some companies 
are currently migrating from viewfs to router, I think they need this tool.


> RBF: Tool to initialize ViewFS Mapping to Router
> 
>
> Key: HDFS-16008
> URL: https://issues.apache.org/jira/browse/HDFS-16008
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>
> This is a tool for initializing ViewFS Mapping to Router.
> Some companies are currently migrating from viewfs to router, I think they 
> need this tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router

2021-05-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-16008:
---
Summary: RBF: Tool to initialize ViewFS Mapping to Router  (was: RBF: This 
is a tool for initializing ViewFS Mapping to Router)

> RBF: Tool to initialize ViewFS Mapping to Router
> 
>
> Key: HDFS-16008
> URL: https://issues.apache.org/jira/browse/HDFS-16008
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>
> This is a tool for initializing ViewFS Mapping to Router.Some companies are 
> currently migrating from viewfs to router, I think they need this tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16008) RBF: This is a tool for initializing ViewFS Mapping to Router

2021-05-05 Thread zhu (Jira)
zhu created HDFS-16008:
--

 Summary: RBF: This is a tool for initializing ViewFS Mapping to 
Router
 Key: HDFS-16008
 URL: https://issues.apache.org/jira/browse/HDFS-16008
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Affects Versions: 3.3.1
Reporter: zhu
Assignee: zhu


This is a tool for initializing ViewFS Mapping to Router.Some companies are 
currently migrating from viewfs to router, I think they need this tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15923) RBF: Authentication failed when rename accross sub clusters

2021-05-05 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HDFS-15923:
-
Attachment: HDFS-15923.003.patch
Status: Patch Available  (was: Open)

> RBF:  Authentication failed when rename accross sub clusters
> 
>
> Key: HDFS-15923
> URL: https://issues.apache.org/jira/browse/HDFS-15923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
>  Labels: RBF, pull-request-available, rename
> Attachments: HDFS-15923.001.patch, HDFS-15923.002.patch, 
> HDFS-15923.003.patch, HDFS-15923.stack-trace, 
> hdfs-15923-fix-security-issue.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Rename accross subcluster with RBF and Kerberos environment. Will encounter 
> the following two errors:
>  # Save Object to journal.
>  # Precheck try to get src file status
> So, we need use Router Login UGI doAs create DistcpProcedure and 
> TrashProcedure and submit Job.
>  
> Beside, we should check user permission for src and dst path in router side 
> before do rename internal. (HDFS-15973)
> First: Save Object to journal.
> {code:java}
> // code placeholder
> 2021-03-23 14:01:16,233 WARN org.apache.hadoop.ipc.Client: Exception 
> encountered while connecting to the server 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:408)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:622)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:413)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:822)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:818)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
> at org.apache.hadoop.ipc.Client.call(Client.java:1452)
> at org.apache.hadoop.ipc.Client.call(Client.java:1405)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
> at com.sun.proxy.$Proxy11.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:376)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy12.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:277)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1240)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1219)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1201)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1139)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:533)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:530)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> 

[jira] [Updated] (HDFS-15923) RBF: Authentication failed when rename accross sub clusters

2021-05-05 Thread zhuobin zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuobin zheng updated HDFS-15923:
-
Status: Open  (was: Patch Available)

> RBF:  Authentication failed when rename accross sub clusters
> 
>
> Key: HDFS-15923
> URL: https://issues.apache.org/jira/browse/HDFS-15923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
>  Labels: RBF, pull-request-available, rename
> Attachments: HDFS-15923.001.patch, HDFS-15923.002.patch, 
> HDFS-15923.stack-trace, hdfs-15923-fix-security-issue.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Rename accross subcluster with RBF and Kerberos environment. Will encounter 
> the following two errors:
>  # Save Object to journal.
>  # Precheck try to get src file status
> So, we need use Router Login UGI doAs create DistcpProcedure and 
> TrashProcedure and submit Job.
>  
> Beside, we should check user permission for src and dst path in router side 
> before do rename internal. (HDFS-15973)
> First: Save Object to journal.
> {code:java}
> // code placeholder
> 2021-03-23 14:01:16,233 WARN org.apache.hadoop.ipc.Client: Exception 
> encountered while connecting to the server 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:408)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:622)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:413)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:822)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:818)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
> at org.apache.hadoop.ipc.Client.call(Client.java:1452)
> at org.apache.hadoop.ipc.Client.call(Client.java:1405)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
> at com.sun.proxy.$Proxy11.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:376)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy12.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:277)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1240)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1219)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1201)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1139)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:533)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:530)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:544)
> at 
> 

[jira] [Commented] (HDFS-15934) Make DirectoryScanner reconcile blocks batch size and interval between batch configurable.

2021-05-05 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339963#comment-17339963
 ] 

Qi Zhu commented on HDFS-15934:
---

Thanks [~ayushtkn] for commit.

> Make DirectoryScanner reconcile blocks batch size and interval between batch 
> configurable.
> --
>
> Key: HDFS-15934
> URL: https://issues.apache.org/jira/browse/HDFS-15934
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> HDFS-14476 Make this batch to avoid lock too much time, but different cluster 
> has different demand, we should make batch size and batch interval 
> configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16007) Vulnerabilities found when serializing enum value

2021-05-05 Thread junwen yang (Jira)
junwen yang created HDFS-16007:
--

 Summary: Vulnerabilities found when serializing enum value
 Key: HDFS-16007
 URL: https://issues.apache.org/jira/browse/HDFS-16007
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: junwen yang


ReplicaState enum is using ordinal to conduct serialization and 
deserialization, which is vulnerable to the order, to cause issues similar to 
HDFS-15624.

To avoid it, either adding comments to let later developer not to change this 
enum, or add index checking in the read and getState function to avoid index 
out of bound error. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-05 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16001:
-
Status: Patch Available  (was: Open)

Created a PR: https://github.com/apache/hadoop/pull/2980

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-05 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HDFS-16001:


Assignee: Akira Ajisaka

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339535#comment-17339535
 ] 

Viraj Jasani edited comment on HDFS-15982 at 5/5/21, 6:55 PM:
--

{quote}Regarding the UI. If the trash interval isn't set and If I select 
NO(move to trash), It still deletes with success? Check if the behaviour is 
like that, The client may be in wrong impression that things moved to trash, 
but it actually didn't. We should have bugged him back, Trash isn't enabled.
{quote}
If trash interval isn't set and if we select NO, it does delete with success 
(as per logic 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L1560]:
 file moves to trash only if skiptrash is false and trashInterval > 0)
{code:java}
case DELETE: {
  Configuration conf =
  (Configuration) context.getAttribute(JspHelper.CURRENT_CONF);
  long trashInterval =
  conf.getLong(FS_TRASH_INTERVAL_KEY, FS_TRASH_INTERVAL_DEFAULT);
  if (trashInterval > 0 && !skipTrash.getValue()) {
LOG.info("{} is {} , trying to archive {} instead of removing",
FS_TRASH_INTERVAL_KEY, trashInterval, fullpath);
org.apache.hadoop.fs.Path path =
new org.apache.hadoop.fs.Path(fullpath);
Configuration clonedConf = new Configuration(conf);
// To avoid caching FS objects and prevent OOM issues
clonedConf.set("fs.hdfs.impl.disable.cache", "true");
FileSystem fs = FileSystem.get(clonedConf);
boolean movedToTrash = Trash.moveToAppropriateTrash(fs, path,
clonedConf);
if (movedToTrash) {
  final String js = JsonUtil.toJsonString("boolean", true);
  return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
}
// Same is the behavior with Delete shell command.
// If moveToAppropriateTrash() returns false, file deletion
// is attempted rather than throwing Error.
LOG.debug("Could not move {} to Trash, attempting removal", fullpath);
  }
  final boolean b = cp.delete(fullpath, recursive.getValue());
  final String js = JsonUtil.toJsonString("boolean", b);
  return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
}

{code}
I think in UI, we can provide additional info in same model: "These buttons are 
useful only if fs.trash.interval has been configured. Without setting interval, 
files will be hard deleted anyways."

I just saw HDFS-14117 on using trash on RBF: "delete the files or dirs of one 
subcluster in a cluster with multiple subclusters".

 
{quote}Router would also have issues, if the trash path resolves to different 
NS, or to some path which isn't in the Mount Table, when Default Namespace 
isn't configured.
{quote}
I tried to explore if we can replace everything that 
Trash.moveToAppropriateTrash() offers with what ClientProtocol provides (that 
way router and mount table resolution would be taken care of) but it seems 
almost impossible to replace all FileSystem utilities used by Trash.


was (Author: vjasani):
{quote}Regarding the UI. If the trash interval isn't set and If I select 
NO(move to trash), It still deletes with success? Check if the behaviour is 
like that, The client may be in wrong impression that things moved to trash, 
but it actually didn't. We should have bugged him back, Trash isn't enabled.
{quote}
If trash interval isn't set and if we select NO, it does delete with success 
(as per logic 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L1560]:
 file moves to trash only if skiptrash is false and trashInterval > 0)
{code:java}
case DELETE: {
  Configuration conf =
  (Configuration) context.getAttribute(JspHelper.CURRENT_CONF);
  long trashInterval =
  conf.getLong(FS_TRASH_INTERVAL_KEY, FS_TRASH_INTERVAL_DEFAULT);
  if (trashInterval > 0 && !skipTrash.getValue()) {
LOG.info("{} is {} , trying to archive {} instead of removing",
FS_TRASH_INTERVAL_KEY, trashInterval, fullpath);
org.apache.hadoop.fs.Path path =
new org.apache.hadoop.fs.Path(fullpath);
Configuration clonedConf = new Configuration(conf);
// To avoid caching FS objects and prevent OOM issues
clonedConf.set("fs.hdfs.impl.disable.cache", "true");
FileSystem fs = FileSystem.get(clonedConf);
boolean movedToTrash = Trash.moveToAppropriateTrash(fs, path,
clonedConf);
if (movedToTrash) {
  final String js = JsonUtil.toJsonString("boolean", true);
  return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
}
// Same is the behavior with Delete shell command.
// If moveToAppropriateTrash() returns false, file deletion
// is attempted rather than throwing Error.
LOG.debug("Could not move {} to Trash, attempting removal", 

[jira] [Commented] (HDFS-15934) Make DirectoryScanner reconcile blocks batch size and interval between batch configurable.

2021-05-05 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339715#comment-17339715
 ] 

Ayush Saxena commented on HDFS-15934:
-

Committed to trunk. 
Thanx [~zhuqi] for the contribution!!!

> Make DirectoryScanner reconcile blocks batch size and interval between batch 
> configurable.
> --
>
> Key: HDFS-15934
> URL: https://issues.apache.org/jira/browse/HDFS-15934
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> HDFS-14476 Make this batch to avoid lock too much time, but different cluster 
> has different demand, we should make batch size and batch interval 
> configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15934) Make DirectoryScanner reconcile blocks batch size and interval between batch configurable.

2021-05-05 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15934:

Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Make DirectoryScanner reconcile blocks batch size and interval between batch 
> configurable.
> --
>
> Key: HDFS-15934
> URL: https://issues.apache.org/jira/browse/HDFS-15934
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> HDFS-14476 Make this batch to avoid lock too much time, but different cluster 
> has different demand, we should make batch size and batch interval 
> configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization

2021-05-05 Thread zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339562#comment-17339562
 ] 

zhu commented on HDFS-16000:


[~daryn] Thanks for your comments. Please review [rename performance 
optimization|https://github.com/apache/hadoop/pull/2964].

> HDFS : Rename performance optimization
> --
>
> Key: HDFS-16000
> URL: https://issues.apache.org/jira/browse/HDFS-16000
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Affects Versions: 3.1.4, 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: 20210428-143238.svg, 20210428-171635-lambda.svg, 
> HDFS-16000.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> It takes a long time to move a large directory with rename. For example, it 
> takes about 40 seconds to move a 1000W directory. When a large amount of data 
> is deleted to the trash, the move large directory will occur when the recycle 
> bin makes checkpoint. In addition, the user may also actively trigger the 
> move large directory operation, which will cause the NameNode to lock too 
> long and be killed by Zkfc. Through the flame graph, it is found that the 
> main time consuming is to create the EnumCounters object.
> h3. I think the following two points can optimize the efficiency of rename 
> execution
> h3. QuotaCount calculation time-consuming optimization:
>  * Create a QuotaCounts object in the calculation directory quotaCount, and 
> pass the quotaCount to the next calculation function through a parameter each 
> time, so as to avoid creating an EnumCounters object for each calculation.
>  * In addition, through the flame graph, it is found that using lambda to 
> modify QuotaCounts takes longer than the ordinary method, so the ordinary 
> method is used to modify the QuotaCounts count.
> h3. Rename logic optimization:
>  * Regardless of whether the rename operation is the source directory and the 
> target directory, the quota count must be calculated three times. The first 
> time, check whether the moved directory exceeds the target directory quota, 
> the second time, calculate the mobile directory quota to update the source 
> directory quota, and the third time, calculate the mobile directory 
> configuration update to the target directory.
>  * I think some of the above three quota quota calculations are unnecessary. 
> For example, if all parent directories of the source directory and target 
> directory are not configured with quota, there is no need to calculate 
> quotaCount. Even if both the source directory and the target directory use 
> quota, there is no need to calculate the quota three times. The calculation 
> logic for the first and third times is the same, and it only needs to be 
> calculated once.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339535#comment-17339535
 ] 

Viraj Jasani edited comment on HDFS-15982 at 5/5/21, 9:35 AM:
--

{quote}Regarding the UI. If the trash interval isn't set and If I select 
NO(move to trash), It still deletes with success? Check if the behaviour is 
like that, The client may be in wrong impression that things moved to trash, 
but it actually didn't. We should have bugged him back, Trash isn't enabled.
{quote}
If trash interval isn't set and if we select NO, it does delete with success 
(as per logic 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L1560]:
 file moves to trash only if skiptrash is false and trashInterval > 0)
{code:java}
case DELETE: {
  Configuration conf =
  (Configuration) context.getAttribute(JspHelper.CURRENT_CONF);
  long trashInterval =
  conf.getLong(FS_TRASH_INTERVAL_KEY, FS_TRASH_INTERVAL_DEFAULT);
  if (trashInterval > 0 && !skipTrash.getValue()) {
LOG.info("{} is {} , trying to archive {} instead of removing",
FS_TRASH_INTERVAL_KEY, trashInterval, fullpath);
org.apache.hadoop.fs.Path path =
new org.apache.hadoop.fs.Path(fullpath);
Configuration clonedConf = new Configuration(conf);
// To avoid caching FS objects and prevent OOM issues
clonedConf.set("fs.hdfs.impl.disable.cache", "true");
FileSystem fs = FileSystem.get(clonedConf);
boolean movedToTrash = Trash.moveToAppropriateTrash(fs, path,
clonedConf);
if (movedToTrash) {
  final String js = JsonUtil.toJsonString("boolean", true);
  return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
}
// Same is the behavior with Delete shell command.
// If moveToAppropriateTrash() returns false, file deletion
// is attempted rather than throwing Error.
LOG.debug("Could not move {} to Trash, attempting removal", fullpath);
  }
  final boolean b = cp.delete(fullpath, recursive.getValue());
  final String js = JsonUtil.toJsonString("boolean", b);
  return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
}

{code}
I think in UI, we can provide additional info in same model: "These buttons are 
useful only if fs.trash.interval has been configured. Without setting interval, 
files will be hard deleted anyways."

I just saw HDFS-14117 on using trash on RBF: "delete the files or dirs of one 
subcluster in a cluster with multiple subclusters".


was (Author: vjasani):
{quote}Regarding the UI. If the trash interval isn't set and If I select 
NO(move to trash), It still deletes with success? Check if the behaviour is 
like that, The client may be in wrong impression that things moved to trash, 
but it actually didn't. We should have bugged him back, Trash isn't enabled.
{quote}
If trash interval isn't set and if we select NO, it does delete with success 
(as per logic 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L1560]:
 file moves to trash only if skiptrash is false and trashInterval > 0)
{code:java}
case DELETE: {
  Configuration conf =
  (Configuration) context.getAttribute(JspHelper.CURRENT_CONF);
  long trashInterval =
  conf.getLong(FS_TRASH_INTERVAL_KEY, FS_TRASH_INTERVAL_DEFAULT);
  if (trashInterval > 0 && !skipTrash.getValue()) {
LOG.info("{} is {} , trying to archive {} instead of removing",
FS_TRASH_INTERVAL_KEY, trashInterval, fullpath);
org.apache.hadoop.fs.Path path =
new org.apache.hadoop.fs.Path(fullpath);
Configuration clonedConf = new Configuration(conf);
// To avoid caching FS objects and prevent OOM issues
clonedConf.set("fs.hdfs.impl.disable.cache", "true");
FileSystem fs = FileSystem.get(clonedConf);
boolean movedToTrash = Trash.moveToAppropriateTrash(fs, path,
clonedConf);
if (movedToTrash) {
  final String js = JsonUtil.toJsonString("boolean", true);
  return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
}
// Same is the behavior with Delete shell command.
// If moveToAppropriateTrash() returns false, file deletion
// is attempted rather than throwing Error.
LOG.debug("Could not move {} to Trash, attempting removal", fullpath);
  }
  final boolean b = cp.delete(fullpath, recursive.getValue());
  final String js = JsonUtil.toJsonString("boolean", b);
  return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
}

{code}
I think in UI, we can provide additional info in same model: "These buttons are 
useful only if fs.trash.interval has been configured. Without setting interval, 
files will be hard deleted anyways."

> Deleted data using HTTP API should be saved to the 

[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339535#comment-17339535
 ] 

Viraj Jasani commented on HDFS-15982:
-

{quote}Regarding the UI. If the trash interval isn't set and If I select 
NO(move to trash), It still deletes with success? Check if the behaviour is 
like that, The client may be in wrong impression that things moved to trash, 
but it actually didn't. We should have bugged him back, Trash isn't enabled.
{quote}
If trash interval isn't set and if we select NO, it does delete with success 
(as per logic 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L1560]:
 file moves to trash only if skiptrash is false and trashInterval > 0)
{code:java}
case DELETE: {
  Configuration conf =
  (Configuration) context.getAttribute(JspHelper.CURRENT_CONF);
  long trashInterval =
  conf.getLong(FS_TRASH_INTERVAL_KEY, FS_TRASH_INTERVAL_DEFAULT);
  if (trashInterval > 0 && !skipTrash.getValue()) {
LOG.info("{} is {} , trying to archive {} instead of removing",
FS_TRASH_INTERVAL_KEY, trashInterval, fullpath);
org.apache.hadoop.fs.Path path =
new org.apache.hadoop.fs.Path(fullpath);
Configuration clonedConf = new Configuration(conf);
// To avoid caching FS objects and prevent OOM issues
clonedConf.set("fs.hdfs.impl.disable.cache", "true");
FileSystem fs = FileSystem.get(clonedConf);
boolean movedToTrash = Trash.moveToAppropriateTrash(fs, path,
clonedConf);
if (movedToTrash) {
  final String js = JsonUtil.toJsonString("boolean", true);
  return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
}
// Same is the behavior with Delete shell command.
// If moveToAppropriateTrash() returns false, file deletion
// is attempted rather than throwing Error.
LOG.debug("Could not move {} to Trash, attempting removal", fullpath);
  }
  final boolean b = cp.delete(fullpath, recursive.getValue());
  final String js = JsonUtil.toJsonString("boolean", b);
  return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
}

{code}
I think in UI, we can provide additional info in same model: "These buttons are 
useful only if fs.trash.interval has been configured. Without setting interval, 
files will be hard deleted anyways."

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, hdfs-client, httpfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339518#comment-17339518
 ] 

Ayush Saxena commented on HDFS-15982:
-

DefaultFs is like if you don't specify the fs in the path it will get used, if 
the path has it, it won't get used. So, that isn't a mandatory conf itself. In 
the present case also. The WebHdfs is using the HTTP URI to reach to Namenode, 
then I think the DefaultFs points to the RPC. The URI is changing, it isn't the 
same FileSystem URI on which the call was made. (Just Tweaking a test).
{code:java}
2021-05-05 13:04:26,933 [Listener at localhost/50984] ERROR web.TestWebHDFS 
(TestWebHDFS.java:testWebHdfsNoRedirect(1613)) - URI of the FileSystem during 
call webhdfs://localhost:50970

2021-05-05 13:04:34,513 [IPC Server handler 8 on default port 50971] ERROR 
resources.NamenodeWebHdfsMethods (NamenodeWebHdfsMethods.java:delete(1574)) - 
URI of the FileSystem created for Trash: hdfs://localhost:50971
{code}
In the Namenode it would be there, but there are cases like
We need to check how things behave if the underlying DefaultFs is 
{{ViewDistributedFileSystem}} or is using {{ViewFsOverloadScheme}} I checked 
for the latter the FS for trash was initiated with {{ViewFsOverloadScheme}} in 
the delete method. So, if that resolves to some other place through its mount 
table. So, that is also something needs to be checked. Will the Namenode call 
the other NS or something like that.

The defaultFs concern was there in the previous jira itself, so that is worth a 
thought.

Router would also have issues, if the trash path resolves to different NS, or 
to some path which isn't in the Mount Table, when Default Namespace isn't 
configured. 

Regarding the UI. If the trash interval isn't set and If I select NO(move to 
trash), It still deletes with success? Check if the behaviour is like that, The 
client may be in wrong impression that things moved to trash, but it actually 
didn't. We should have bugged him back, Trash isn't enabled.

The API behaviour has changed. Exception/Return value. That needs to be sorted.

Make sure ALL FileSystems behave similar to WebHdfs, unless there is a strong 
reason. (WebHdfs!=FsShell)

And finally find out what was the danger that Daryn Talked about. I remember 
seeing a Jira, where it was quoted "Namenode should never make an RPC to 
itself", The Namenode can hang or something like that, not sure, what was the 
context then. But I couldn't find the jira, so was Daryn's concern related to 
that. Not sure. We can't risk out provided we are thinking to put this in a 
stable release.

Compatibility is to be restored.

How to proceed.
Lets wait for [~weichiu], he is chasing the 3.3.1 release AFAIK. So, he might 
have an answer how much we can wait for this and the answers to the questions 
above. If we get everything sorted, or he already have answers. Things should 
be good. 

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, hdfs-client, httpfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339444#comment-17339444
 ] 

Viraj Jasani commented on HDFS-15982:
-

FYI [~smeng] as we discussed over similar case on PR.

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, hdfs-client, httpfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339438#comment-17339438
 ] 

Viraj Jasani edited comment on HDFS-15982 at 5/5/21, 6:30 AM:
--

 
{quote}I was following the older jira.

[~daryn] had a comment:{color:#172b4d} {color}

{color:#172b4d}You cannot or should not create a default fs and parse a path in 
the NN. It's very dangerous. Give me some time (that I don't have) and I'd 
likely come up with a nasty exploit.{color}
{quote}
This seems interesting. Although yes, this is default fs, but it is 
instantiated from WebHdfs config object only (which is used by all endpoints in 
NamenodeWebHdfsMethods). Is WebHdfs server implementation used for any other 
FileSystem (from current and future viewpoints)?
{quote}Router part needs to be checked again and confirmed, Trash itself has 
issues with Router(There are Jiras). So, if Trash becomes true by default, I 
doubt delete through Router don't break or moves to some weird place.
{quote}
Oops, I was not aware of existing concerns with router path resolution within 
Trash. I think this is fair point to make this Jira a compatible change w.r.t 
DELETE REST API calls. Let me provide an addendum for trunk to bring default 
value of skiptrash as true.

branch-3.3 backport PR#2925 is pending. Shall we get it in and then I can 
provide addendum for both trunk and branch-3.3 for clean history?

Btw [~ayushtkn] you might also want to check screenshots attached on this Jira 
to take a look at how skiptrash is handled from Web UI.


was (Author: vjasani):
{quote}I was following the older jira.

[~daryn] had a comment:
{quote}You cannot or should not create a default fs and parse a path in the NN. 
It's very dangerous. Give me some time (that I don't have) and I'd likely come 
up with a nasty exploit.
{quote}{quote}
This seems interesting. Although yes, this is default fs, but it is 
instantiated from WebHdfs config object only (which is used by all endpoints in 
NamenodeWebHdfsMethods). Is WebHdfs server implementation used for any other 
FileSystem (from current and future viewpoints)?
{quote}Router part needs to be checked again and confirmed, Trash itself has 
issues with Router(There are Jiras). So, if Trash becomes true by default, I 
doubt delete through Router don't break or moves to some weird place.
{quote}
Oops, I was not aware of existing concerns with router path resolution with 
Trash. I think this is fair point to make this Jira a compatible change w.r.t 
DELETE REST API calls. Let me provide an addendum for trunk to bring default 
value of skiptrash as true. branch-3.3 backport PR#2925 is pending. Shall we 
get it in and then I can provide addendum for both trunk and branch-3.3 for 
clean history?

Btw [~ayushtkn] you might also want to check screenshots attached on this Jira 
to take a look at how skiptrash is handled from Web UI.

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, hdfs-client, httpfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339438#comment-17339438
 ] 

Viraj Jasani commented on HDFS-15982:
-

{quote}I was following the older jira.

[~daryn] had a comment:
{quote}You cannot or should not create a default fs and parse a path in the NN. 
It's very dangerous. Give me some time (that I don't have) and I'd likely come 
up with a nasty exploit.
{quote}{quote}
This seems interesting. Although yes, this is default fs, but it is 
instantiated from WebHdfs config object only (which is used by all endpoints in 
NamenodeWebHdfsMethods). Is WebHdfs server implementation used for any other 
FileSystem (from current and future viewpoints)?
{quote}Router part needs to be checked again and confirmed, Trash itself has 
issues with Router(There are Jiras). So, if Trash becomes true by default, I 
doubt delete through Router don't break or moves to some weird place.
{quote}
Oops, I was not aware of existing concerns with router path resolution with 
Trash. I think this is fair point to make this Jira a compatible change w.r.t 
DELETE REST API calls. Let me provide an addendum for trunk to bring default 
value of skiptrash as true. branch-3.3 backport PR#2925 is pending. Shall we 
get it in and then I can provide addendum for both trunk and branch-3.3 for 
clean history?

Btw [~ayushtkn] you might also want to check screenshots attached on this Jira 
to take a look at how skiptrash is handled from Web UI.

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, hdfs-client, httpfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org