[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending RPC to namenode

2021-12-16 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461177#comment-17461177
 ] 

Hadoop QA commented on HDFS-15078:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 11s{color} | {color:red}{color} | {color:red} HDFS-15078 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-15078 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12989407/HDFS-15078.002.patch |
| Console output | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/752/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |


This message was automatically generated.



> RBF: Should check connection channel before sending RPC to namenode
> ---
>
> Key: HDFS-15078
> URL: https://issues.apache.org/jira/browse/HDFS-15078
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.0
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Major
> Attachments: HDFS-15078.001.patch, HDFS-15078.002.patch
>
>
> dfsrouter logs show that
> {quote}
> 2019-12-20 04:11:26,724 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 6400 on , call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 
> 10.83.164.11:56908 Call#2 Retry#0: output error
> 2019-12-20 04:11:26,724 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 125 on  caught an exception
> java.nio.channels.ClosedChannelException
> at 
> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:461)
> at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2731)
> at org.apache.hadoop.ipc.Server.access$2100(Server.java:134)
> at 
> org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:1089)
> at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1161)
> at 
> org.apache.hadoop.ipc.Server$Connection.sendResponse(Server.java:2109)
> at 
> org.apache.hadoop.ipc.Server$Connection.access$400(Server.java:1229)
> at org.apache.hadoop.ipc.Server$Call.sendResponse(Server.java:631)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2245)
> {quote}
> Maybe it is better to check the connection between the client and the router 
> before sending the RPC to the namenode.
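
The idea in the description can be illustrated with a small Java sketch. It is not taken from the attached patches: `ChannelCheckSketch`, `NamenodeCall`, and `forwardIfClientConnected` are hypothetical names, and the exception text is an assumption. Note that `Channel.isOpen()` only reports whether the channel has been closed on this side, so the check is best-effort and the client can still go away right after it.

```java
import java.io.IOException;
import java.nio.channels.Channel;
import java.nio.channels.SocketChannel;

// Hypothetical sketch of a proactive check; not the code in the HDFS-15078 patches.
class ChannelCheckSketch {

  /** Stand-in for the downstream call to the namenode (hypothetical). */
  interface NamenodeCall<T> {
    T invoke() throws IOException;
  }

  /** Forwards the call only if the client channel still looks open. */
  static <T> T forwardIfClientConnected(Channel clientChannel, NamenodeCall<T> call)
      throws IOException {
    if (clientChannel == null || !clientChannel.isOpen()) {
      // Assumed message text; the real patch may word this differently.
      throw new IOException("Connection channel to client is closed");
    }
    return call.invoke();
  }

  public static void main(String[] args) throws IOException {
    // An unconnected channel is enough here: it is open until close() is called.
    SocketChannel client = SocketChannel.open();
    String result = forwardIfClientConnected(client, () -> "forwarded");
    System.out.println(result);

    client.close();
    try {
      forwardIfClientConnected(client, () -> "forwarded");
    } catch (IOException e) {
      System.out.println("skipped: " + e.getMessage());
    }
  }
}
```

The check costs one extra call per RPC, and it narrows the window in which a call is forwarded for a client that has already gone away without closing it.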



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending RPC to namenode

2020-05-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098454#comment-17098454
 ] 

Hadoop QA commented on HDFS-15078:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 51s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 24m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 57s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 36s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 10s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 21m 40s{color} | {color:red} root generated 2 new + 1873 unchanged - 0 fixed = 1875 total (was 1873) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 11m  7s{color} | {color:red} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 16s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m  2s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 55s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.io.compress.snappy.TestSnappyCompressorDecompressor |
|   | hadoop.io.compress.TestCompressorDecompressor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29230/artifact/out/Dockerfile |
| JIRA Issue | HDFS-15078 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12989407/HDFS-15078.002.patch |
| Optional Tests | dupname asflicense compile jav

[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2020-01-09 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012082#comment-17012082
 ] 

Íñigo Goiri commented on HDFS-15078:


Just to summarize, I agree with [~ayushtkn] that the final solution would be to 
modify the connection and carry the caller ID, etc.
For now, I suggest that in this JIRA we simply catch this exception, report it 
in a friendlier way, and clean up after it.
I'm not sure how easy it is to catch this ClosedChannelException, though.
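
A rough Java sketch of that exception-handling approach follows. It is not Hadoop's `Server` code: `FriendlyCloseSketch` and `respond` are hypothetical names, and the log line and cleanup step are placeholders for whatever the router would really do.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.SocketChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of "catch it, report it nicely, clean up".
class FriendlyCloseSketch {

  /**
   * Writes the response to the client. Returns true if it was written and
   * false if the client channel was already closed.
   */
  static boolean respond(WritableByteChannel client, ByteBuffer response,
      String callDescription) throws IOException {
    try {
      while (response.hasRemaining()) {
        client.write(response);
      }
      return true;
    } catch (ClosedChannelException e) {
      // One readable line instead of a full stack trace.
      System.out.println("Client disconnected before the response to "
          + callDescription + " could be sent; dropping it.");
      client.close();  // No effect if already closed; real cleanup would go here.
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    ByteBuffer sent = ByteBuffer.wrap("result".getBytes(StandardCharsets.UTF_8));
    WritableByteChannel open = Channels.newChannel(new ByteArrayOutputStream());
    System.out.println(respond(open, sent, "getFileInfo"));

    SocketChannel closed = SocketChannel.open();
    closed.close();
    ByteBuffer dropped = ByteBuffer.wrap("result".getBytes(StandardCharsets.UTF_8));
    System.out.println(respond(closed, dropped, "create"));
  }
}
```

Unlike a proactive check, this adds no work to calls whose client is still connected, but the namenode has already executed the call by the time the exception is seen.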




[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-25 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003264#comment-17003264
 ] 

Fei Hui commented on HDFS-15078:


[~elgoiri]
{quote}
Can we try to do it as an exception handling instead of proactively checking?
{quote}
Sorry, I didn't quite follow that. Before the check, everything looks fine. 
Could you please share some ideas?

[~ayushtkn]
{quote}
Router is supposed to just receive the call, and if it has received a valid 
call, it should in any case send to namenode. 
{quote}
If the connection between the router and the client is closed, the result 
cannot be sent to the client. So either choice, sending to the namenode or not, 
seems reasonable, because the call has already failed for the client.




[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-25 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003174#comment-17003174
 ] 

Íñigo Goiri commented on HDFS-15078:


Can we try to do it as an exception handling instead of proactively checking? 




[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-25 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003154#comment-17003154
 ] 

Ayush Saxena commented on HDFS-15078:
-

Thanks [~ferhui] for the details. 
TBH I have reservations about this fix, and I don't consider it an 
Improvement either. It gets attention as a bug only for the scenario you 
described; otherwise I don't think there should be any case where the Router 
receives a call and does not send it to the Namenode. Router is supposed to just 
receive the call, and if it has received a valid call, it should in any case 
send to namenode. From the client's point of view, it is sending the request to 
the NN itself; a call vanishing in between at the Router doesn't make sense to me.

I would rather fix this problem as a whole, which is what HDFS-15079 intends to 
do, than handle just one possibility to minimize the effect. 
Moreover, having checks at the Router for every call is added overhead for 
normal calls. We have recently faced performance issues too, with calls taking 
a non-trivial amount of time at the Router itself.

Anyway, I am not blocking this in any way. If others are OK with this, I have 
no objections. :)





[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-24 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002739#comment-17002739
 ] 

Fei Hui commented on HDFS-15078:


{quote}
The issue is the first router which sent the request that late, That client did 
failover to another router, triggered a new call and the second router 
completed the call, and the first call came after this. 
{quote}
Getting an EOFException makes the client fail over to another router. 
Later the second router completed the call, and the first router's call 
arrived after that.

{quote}
If such a case where one Router is delaying, I think without client connection 
crashing still issues like these can come up.
{quote}
Yes. This issue can only resolve the problem in some scenarios. HDFS-15079 
tracks the high-level problem.

In our scenarios, this fix works.





[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002729#comment-17002729
 ] 

Ayush Saxena commented on HDFS-15078:
-

{quote}And overwrite is true by default, this would make the file had been 
written an empty file. This is an critical problem and we had encountered it
{quote}
Your fix wouldn't solve this either: if the client crashes after the check, 
the same scenario occurs again. This doesn't seem to be a problem of the client 
crashing and the Router still sending the request to the Namenode. The 
issue is the first router which sent the request that late, That client did 
failover to another router, triggered a new call and the second router 
completed the call, and the first call came after this. 

The problem is that RBF can't ensure perfectly sequential behavior: there are 
multiple routers accepting calls, so if one router is slow and the others are 
fast, this type of problem can occur. If such a case where one Router is 
delaying, I think without client connection crashing still issues like these 
can come up.
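
The ordering problem described above can be replayed with a toy Java model. Everything in it is hypothetical: `DelayedCreateSketch` and `FakeNamespace` are illustrative names, the "namespace" is just a map, and the model assumes, as in the quoted scenario, that create overwrites an existing file.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a delayed create arriving after the retry has completed.
class DelayedCreateSketch {

  /** Minimal in-memory stand-in for the namenode namespace (hypothetical). */
  static class FakeNamespace {
    private final Map<String, String> files = new HashMap<>();

    void create(String path, boolean overwrite) {
      if (files.containsKey(path) && !overwrite) {
        throw new IllegalStateException(path + " already exists");
      }
      files.put(path, "");  // A create with overwrite truncates the file.
    }

    void append(String path, String data) {
      files.merge(path, data, String::concat);
    }

    String read(String path) {
      return files.get(path);
    }
  }

  /** Replays the order in which the calls reach the namenode. */
  static String replay() {
    FakeNamespace namenode = new FakeNamespace();
    // The client's first create goes to router 1, which stalls. The client
    // fails over and retries through router 2, which completes promptly.
    namenode.create("/data/file", true);
    namenode.append("/data/file", "payload");
    // Router 1 finally forwards the original create.
    namenode.create("/data/file", true);
    return namenode.read("/data/file");
  }

  public static void main(String[] args) {
    System.out.println("content length after replay: " + replay().length());
  }
}
```

In this model the written data is lost whether or not the first client connection was closed at any particular instant, which is why a channel check on its own only narrows the window.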




[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-23 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002714#comment-17002714
 ] 

Fei Hui commented on HDFS-15078:


This fix can resolve some scenarios. The logs are as follows:
{quote}
2019-12-24 15:46:20,717 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
53 on , call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo 
from 10.xx.xx.xx:60980 Call#18 Retry#0: java.io.IOException: Connection Channel 
to 10.xx.xx.xx of xxx (auth:SIMPLE) is closed!
2019-12-24 15:46:20,718 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
53 on , call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo 
from 10.xx.xx.xx:60980 Call#18 Retry#0: output error
2019-12-24 15:46:20,718 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
53 on  caught an exception
java.nio.channels.ClosedChannelException
at 
sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:461)
at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2738)
at org.apache.hadoop.ipc.Server.access$2100(Server.java:134)
at 
org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:1096)
at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1168)
at 
org.apache.hadoop.ipc.Server$Connection.sendResponse(Server.java:2116)
at org.apache.hadoop.ipc.Server$Connection.access$500(Server.java:1236)
at org.apache.hadoop.ipc.Server$Call.sendResponse(Server.java:638)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2252)
{quote}




[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-23 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002624#comment-17002624
 ] 

Hadoop QA commented on HDFS-15078:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m  6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m  7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 15m  7s{color} | {color:red} root generated 2 new + 1868 unchanged - 0 fixed = 1870 total (was 1868) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 52s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 34s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}136m 41s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15078 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12989407/HDFS-15078.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 27357e6ed624 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 34ff7db |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28561/artifact/out/diff-compile-javac-root.txt
 

[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-23 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002621#comment-17002621
 ] 

Xiaoqiao He commented on HDFS-15078:


{quote}I find there is a critical problem on RBF, this issue can resolve it on 
some Scenarios, but i have no idea about the overall resolution. Plan to file a 
new jira to track it.{quote}
Thanks [~ferhui], this is a pretty interesting observation. It seems to be a 
critical issue, and in my opinion we should file another JIRA to track it. Would 
you like to share some more information: is it a batch processing or a streaming 
scenario? In our case, all requests come from offline batch processing plus ad 
hoc queries (Spark/MapReduce/Presto), and we have not hit this issue. 

> RBF: Should check connection channel before sending rpc to namenode
> ---
>
> Key: HDFS-15078
> URL: https://issues.apache.org/jira/browse/HDFS-15078
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15078.001.patch, HDFS-15078.002.patch
>
>
> dfsrouter logs show that
> {quote}
> 2019-12-20 04:11:26,724 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 6400 on , call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 
> 10.83.164.11:56908 Call#2 Retry#0: output error
> 2019-12-20 04:11:26,724 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 125 on  caught an exception
> java.nio.channels.ClosedChannelException
> at 
> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:461)
> at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2731)
> at org.apache.hadoop.ipc.Server.access$2100(Server.java:134)
> at 
> org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:1089)
> at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1161)
> at 
> org.apache.hadoop.ipc.Server$Connection.sendResponse(Server.java:2109)
> at 
> org.apache.hadoop.ipc.Server$Connection.access$400(Server.java:1229)
> at org.apache.hadoop.ipc.Server$Call.sendResponse(Server.java:631)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2245)
> {quote}
> Maybe it is better to check the connection between client and router before 
> sending the rpc to the namenode



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-23 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002603#comment-17002603
 ] 

Fei Hui commented on HDFS-15078:


{quote}
Moreover the case would be a rare scenario and this check would be done on 
every call, this would add unnecessary overhead to all calls.
{quote}
In a heavily loaded cluster, I see lots of output errors caused by 
java.nio.channels.ClosedChannelException.
There is a similar check on the namenode, in Handler#run:
{quote}
connDropped = !call.isOpen();
{quote}
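
A minimal sketch of the idea behind that check, not Hadoop's actual Handler#run: a handler loop computes `connDropped` for each dequeued call and skips calls whose client connection is already closed. `SketchCall`, `HandlerSketch` and `drain` are hypothetical names introduced only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical stand-in for an RPC call; this is not Hadoop's Server.Call. */
final class SketchCall {
  final String name;
  private final boolean open;

  SketchCall(String name, boolean open) {
    this.name = name;
    this.open = open;
  }

  /** Reports whether the client connection behind this call is still open. */
  boolean isOpen() {
    return open;
  }
}

final class HandlerSketch {
  /** Drains the queue and returns the names of the calls that were processed. */
  static List<String> drain(Queue<SketchCall> queue) {
    List<String> processed = new ArrayList<>();
    SketchCall call;
    while ((call = queue.poll()) != null) {
      boolean connDropped = !call.isOpen();
      if (connDropped) {
        // The client is gone, so nobody can receive the response: skip the work.
        continue;
      }
      processed.add(call.name);
    }
    return processed;
  }

  public static void main(String[] args) {
    Queue<SketchCall> queue = new ArrayDeque<>();
    queue.add(new SketchCall("create", false));      // client already disconnected
    queue.add(new SketchCall("getFileInfo", true));  // client still connected
    System.out.println(drain(queue));
  }
}
```

In this sketch the cost per call is a single boolean check; whether the real check is equally cheap depends on how the channel state is obtained, which the thread does not specify.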




[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-23 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002586#comment-17002586
 ] 

Fei Hui commented on HDFS-15078:


The v002 patch changes
{code}
if (curCall == null || !curCall.isOpen()) {
{code}

to
{code}
if (curCall != null && !curCall.isOpen()) {
{code}
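
A small sketch of how the two conditions differ. It assumes, as a guess not stated in the thread, that `curCall` can be null when the code runs outside an RPC handler thread. `CallContext` and `ChannelGuardSketch` are hypothetical names, not Hadoop classes.

```java
/** Hypothetical stand-in for the per-thread RPC call context. */
interface CallContext {
  boolean isOpen();
}

final class ChannelGuardSketch {
  /** v001-style condition: also rejects when there is no call context at all. */
  static boolean rejectV001(CallContext curCall) {
    return curCall == null || !curCall.isOpen();
  }

  /** v002-style condition: rejects only when a call exists and its channel is closed. */
  static boolean rejectV002(CallContext curCall) {
    return curCall != null && !curCall.isOpen();
  }

  public static void main(String[] args) {
    CallContext open = () -> true;
    CallContext closed = () -> false;
    System.out.println("no context: v001=" + rejectV001(null) + " v002=" + rejectV002(null));
    System.out.println("open:       v001=" + rejectV001(open) + " v002=" + rejectV002(open));
    System.out.println("closed:     v001=" + rejectV001(closed) + " v002=" + rejectV002(closed));
  }
}
```

The two conditions agree whenever a call context exists and differ only for a null context, which the v001 form rejects and the v002 form lets through.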




[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-23 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002571#comment-17002571
 ] 

Fei Hui commented on HDFS-15078:


[~ayushtkn] [~elgoiri] I found a critical problem in RBF. This issue can 
resolve it in some scenarios, but I have no idea about the overall resolution. 
I plan to file a new Jira to track it.
The problem is as follows:
# A client using RBF (r0, r1) creates an HDFS file via r0, gets an exception and 
fails over to r1
# r0 has already sent the create rpc to the namenode (1st create)
# The client creates the HDFS file via r1 (2nd create)
# The client writes the HDFS file and finally closes it (3rd close)

The namenode may receive the rpcs in the following order:
# 2nd create
# 3rd close
# 1st create

Since overwrite is true by default, the late 1st create replaces the file that 
had already been written with an empty file. This is a critical problem, and we 
have encountered it.
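
The reordering above can be replayed against a toy in-memory namespace. This is only an illustration of the ordering problem: `ToyNamespace` and `ReorderSketch` are hypothetical and model nothing of the real NameNode beyond "create with overwrite replaces the file with an empty one".

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory namespace; hypothetical, not the NameNode implementation. */
final class ToyNamespace {
  private final Map<String, StringBuilder> files = new HashMap<>();

  /** With overwrite=true an existing file is replaced by an empty one. */
  void create(String path, boolean overwrite) {
    if (files.containsKey(path) && !overwrite) {
      throw new IllegalStateException("file exists: " + path);
    }
    files.put(path, new StringBuilder());
  }

  void write(String path, String data) {
    files.get(path).append(data);
  }

  int length(String path) {
    return files.get(path).length();
  }
}

final class ReorderSketch {
  /** Replays the order "2nd create, write and close, 1st create"; returns the final length. */
  static int replay(boolean overwrite) {
    ToyNamespace nn = new ToyNamespace();
    nn.create("/f", overwrite);   // 2nd create, sent via r1, arrives first
    nn.write("/f", "data");       // client writes and closes the file
    try {
      nn.create("/f", overwrite); // delayed 1st create, sent via r0, arrives last
    } catch (IllegalStateException e) {
      // Without overwrite the late create is rejected and the data is kept.
    }
    return nn.length("/f");
  }

  public static void main(String[] args) {
    System.out.println("overwrite=true:  final length " + replay(true));
    System.out.println("overwrite=false: final length " + replay(false));
  }
}
```

With overwrite enabled the replay ends with an empty file, and with overwrite disabled the late create is rejected and the written data survives.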

> RBF: Should check connection channel before sending rpc to namenode
> ---
>
> Key: HDFS-15078
> URL: https://issues.apache.org/jira/browse/HDFS-15078
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-15078.001.patch
>
>
> dfsrouter logs show that
> {quote}
> 2019-12-20 04:11:26,724 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 6400 on , call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 
> 10.83.164.11:56908 Call#2 Retry#0: output error
> 2019-12-20 04:11:26,724 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 125 on  caught an exception
> java.nio.channels.ClosedChannelException
> at 
> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:461)
> at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2731)
> at org.apache.hadoop.ipc.Server.access$2100(Server.java:134)
> at 
> org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:1089)
> at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1161)
> at 
> org.apache.hadoop.ipc.Server$Connection.sendResponse(Server.java:2109)
> at 
> org.apache.hadoop.ipc.Server$Connection.access$400(Server.java:1229)
> at org.apache.hadoop.ipc.Server$Call.sendResponse(Server.java:631)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2245)
> {quote}
> Maybe checking connection between client and router is better before 
> sendingrpc to namenode






[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-23 Thread Íñigo Goiri (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002486#comment-17002486
 ] 

Íñigo Goiri commented on HDFS-15078:


I think that the failed unit tests backed what [~ayushtkn] was saying. 
I have to check but I think we get this in our deployment. 
It would be nice to handle it properly when it happens. 




[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-23 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002392#comment-17002392
 ] 

Hadoop QA commented on HDFS-15078:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
15s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 14m 56s{color} 
| {color:red} root generated 2 new + 1868 unchanged - 0 fixed = 1870 total (was 
1868) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 34s{color} | {color:orange} root: The patch generated 1 new + 203 unchanged 
- 0 fixed = 204 total (was 203) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 56s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 27s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  7m 22s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}117m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.federation.router.TestRouterMissingFolderMulti |
|   | hadoop.hdfs.server.federation.router.TestRouterQuota |
|   | hadoop.hdfs.server.federation.router.TestRouterMountTable |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory |
|   | hadoop.hdfs.server.federation.router.TestRouterRpc |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractRename |
|   | hadoop.hdfs.server.federation.router.TestRouterFaultTolerant |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractOpen |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractMkdir |
|   | hadoop.hdfs.server.federation.router.TestSafeMode |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractAppen

[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-23 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002383#comment-17002383
 ] 

Ayush Saxena commented on HDFS-15078:
-

Thanks [~ferhui] for the patch.
I am not sure about this. If the client has triggered the request, I think it 
should go to the namenode even if the client crashed after sending it, since 
the client has successfully sent the request to the server.
Please check whether the behavior is the same as when the client sends a 
request directly to the namenode and crashes before getting the response. 
Moreover, this would be a rare scenario, while the check would run on every 
call, which would add unnecessary overhead to all calls. 




[jira] [Commented] (HDFS-15078) RBF: Should check connection channel before sending rpc to namenode

2019-12-23 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002374#comment-17002374
 ] 

Fei Hui commented on HDFS-15078:


[~ayushtkn] [~elgoiri] Could you please take a look? Thanks
