Re: [PR] [SPARK-48218][SHUFFLE] TransportClientFactory.createClient may NPE cause FetchFailedException [spark]
mridulm commented on code in PR #46506:
URL: https://github.com/apache/spark/pull/46506#discussion_r1595887602

## common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java: ##
@@ -169,8 +169,10 @@ public TransportClient createClient(String remoteHost, int remotePort, boolean f
       // this code was able to update things.
       TransportChannelHandler handler = cachedClient.getChannel().pipeline()
         .get(TransportChannelHandler.class);
-      synchronized (handler) {
-        handler.getResponseHandler().updateTimeOfLastRequest();
+      if (handler != null) {

Review Comment:
`synchronized` will throw an NPE if called on `null`, and `handler` is a local variable at this point, so the `null` check should be fine.
This should be a fairly rare occurrence. If I am not wrong, it is due to a socket close between the `isActive` call and the subsequent `pipeline().get(TransportChannelHandler.class)` call. But I agree with @dongjoon-hyun that we have to make sure the exception is at the `synchronized` statement itself: which version was this observed against?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
mridulm commented on code in PR #46506:
URL: https://github.com/apache/spark/pull/46506#discussion_r1595887602

## common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java: ##
@@ -169,8 +169,10 @@ public TransportClient createClient(String remoteHost, int remotePort, boolean f
       // this code was able to update things.
       TransportChannelHandler handler = cachedClient.getChannel().pipeline()
         .get(TransportChannelHandler.class);
-      synchronized (handler) {
-        handler.getResponseHandler().updateTimeOfLastRequest();
+      if (handler != null) {

Review Comment:
`synchronized` will throw an NPE if called on `null`, and `handler` is a local variable at this point, so the `null` check should be fine.
This should be a fairly rare occurrence. If I am not wrong, it is due to a socket close between the `isActive` call and the subsequent `pipeline().get(TransportChannelHandler.class)` call. But I agree with @dongjoon-hyun that we have to make sure the exception is at the `synchronized` statement itself: which version was this observed against?

I am fine with the change, assuming @dongjoon-hyun does not have concerns.
mridulm commented on code in PR #46506:
URL: https://github.com/apache/spark/pull/46506#discussion_r1595887602

## common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java: ##
@@ -169,8 +169,10 @@ public TransportClient createClient(String remoteHost, int remotePort, boolean f
       // this code was able to update things.
       TransportChannelHandler handler = cachedClient.getChannel().pipeline()
         .get(TransportChannelHandler.class);
-      synchronized (handler) {
-        handler.getResponseHandler().updateTimeOfLastRequest();
+      if (handler != null) {

Review Comment:
`synchronized` will throw an NPE if called on `null`, and `handler` is a local variable at this point, so the `null` check should be fine.
This should be a fairly rare occurrence. If I am not wrong, it is due to a socket close between the `isActive` call and the subsequent `pipeline().get(TransportChannelHandler.class)` call. But I agree with @dongjoon-hyun that we have to make sure the exception is at the `synchronized` call itself.

I am fine with the change, assuming @dongjoon-hyun does not have concerns.
mridulm commented on code in PR #46506:
URL: https://github.com/apache/spark/pull/46506#discussion_r1595887602

## common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java: ##
@@ -169,8 +169,10 @@ public TransportClient createClient(String remoteHost, int remotePort, boolean f
       // this code was able to update things.
       TransportChannelHandler handler = cachedClient.getChannel().pipeline()
         .get(TransportChannelHandler.class);
-      synchronized (handler) {
-        handler.getResponseHandler().updateTimeOfLastRequest();
+      if (handler != null) {

Review Comment:
`synchronized` will throw an NPE if called on `null`, and `handler` is a local variable at this point, so the `null` check should be fine.
This should be a fairly rare occurrence. If I am not wrong, it is due to a socket close between the `isActive` call and the subsequent `pipeline().get(TransportChannelHandler.class)` call.

I am fine with the change, assuming @dongjoon-hyun does not have concerns.
mridulm commented on code in PR #46506:
URL: https://github.com/apache/spark/pull/46506#discussion_r1595887602

## common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java: ##
@@ -169,8 +169,10 @@ public TransportClient createClient(String remoteHost, int remotePort, boolean f
       // this code was able to update things.
       TransportChannelHandler handler = cachedClient.getChannel().pipeline()
         .get(TransportChannelHandler.class);
-      synchronized (handler) {
-        handler.getResponseHandler().updateTimeOfLastRequest();
+      if (handler != null) {

Review Comment:
`synchronized` will throw an NPE if called on `null`, and `handler` is a local variable at this point, so the `null` check should be fine.
This should be a fairly rare occurrence. If I am not wrong, is this due to a socket close between the `isActive` call and the subsequent `pipeline().get(TransportChannelHandler.class)` call?

I am fine with the change, assuming @dongjoon-hyun does not have concerns.
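To make the point about `synchronized` and `null` concrete, here is a minimal, self-contained sketch. It is not Spark code: `NullGuardDemo` and its nested `Handler` are hypothetical stand-ins for `TransportChannelHandler`, and the guarded variant assumes the patch keeps the `synchronized` block inside the new `if (handler != null)` check, which the truncated diff does not show in full.

```java
import java.util.concurrent.atomic.AtomicLong;

final class NullGuardDemo {
    /** Hypothetical stand-in for TransportChannelHandler; not Spark's class. */
    static final class Handler {
        private final AtomicLong lastRequest = new AtomicLong();

        void updateTimeOfLastRequest() {
            lastRequest.set(System.nanoTime());
        }
    }

    /** Original shape: a null handler fails when the monitor is acquired. */
    static boolean unguarded(Handler handler) {
        try {
            synchronized (handler) {
                handler.updateTimeOfLastRequest();
            }
            return true;
        } catch (NullPointerException e) {
            // Thrown by the synchronized statement itself; the body never runs.
            return false;
        }
    }

    /**
     * Assumed patched shape: the check happens before locking. Because
     * 'handler' is a local reference, it cannot become null between the
     * check and the synchronized statement.
     */
    static boolean guarded(Handler handler) {
        if (handler != null) {
            synchronized (handler) {
                handler.updateTimeOfLastRequest();
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("unguarded(null) updated: " + unguarded(null));
        System.out.println("guarded(null) updated:   " + guarded(null));
        System.out.println("guarded(handler) updated: " + guarded(new Handler()));
    }
}
```

A null check placed inside the `synchronized` block would never be reached for a null reference, since acquiring the monitor already throws.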
dongjoon-hyun commented on code in PR #46506:
URL: https://github.com/apache/spark/pull/46506#discussion_r1595629230

## common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java: ##
@@ -169,8 +169,10 @@ public TransportClient createClient(String remoteHost, int remotePort, boolean f
       // this code was able to update things.
       TransportChannelHandler handler = cachedClient.getChannel().pipeline()
         .get(TransportChannelHandler.class);
-      synchronized (handler) {
-        handler.getResponseHandler().updateTimeOfLastRequest();
+      if (handler != null) {

Review Comment:
Please let me know if you are observing that the `synchronized(handler)` line is the root cause of the NPE. I assumed that the NPE happens at `handler.getRes...`.
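One way to tell the two candidate lines apart is the line number in the top stack frame of the NPE. The sketch below is only an illustration under stated assumptions: `NpeLineDemo`, `Handler`, and `Inner` are hypothetical stand-ins rather than Spark classes, and the class must be compiled with line-number debug information, which `javac` emits by default.

```java
final class NpeLineDemo {
    /** Hypothetical stand-in for the object returned by getResponseHandler(). */
    static final class Inner {
        void updateTimeOfLastRequest() { }
    }

    /** Hypothetical stand-in for TransportChannelHandler; not Spark's class. */
    static final class Handler {
        private final Inner inner;

        Handler(Inner inner) {
            this.inner = inner;
        }

        Inner getResponseHandler() {
            return inner;
        }
    }

    static void update(Handler handler) {
        // A null handler throws on the next line, at monitor acquisition.
        synchronized (handler) {
            // A non-null handler with a null response handler throws here.
            handler.getResponseHandler().updateTimeOfLastRequest();
        }
    }

    /** Returns the line of the top stack frame of the NPE, or -1 if none was thrown. */
    static int npeLine(Handler handler) {
        try {
            update(handler);
            return -1;
        } catch (NullPointerException e) {
            return e.getStackTrace()[0].getLineNumber();
        }
    }

    public static void main(String[] args) {
        System.out.println("null handler          -> line " + npeLine(null));
        System.out.println("null response handler -> line " + npeLine(new Handler(null)));
    }
}
```

The two cases report different lines, so a trace whose line number matches the `synchronized` statement in the exact source revision that was running points at a null `handler` rather than a null response handler.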
dongjoon-hyun commented on code in PR #46506:
URL: https://github.com/apache/spark/pull/46506#discussion_r1595628352

## common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java: ##
@@ -169,8 +169,10 @@ public TransportClient createClient(String remoteHost, int remotePort, boolean f
       // this code was able to update things.
       TransportChannelHandler handler = cachedClient.getChannel().pipeline()
         .get(TransportChannelHandler.class);
-      synchronized (handler) {
-        handler.getResponseHandler().updateTimeOfLastRequest();
+      if (handler != null) {

Review Comment:
I'm not sure if this is a safe replacement. Although I don't know your code, if we need to check nullability, shall we do it inside the `synchronized` block?
dongjoon-hyun commented on code in PR #46506:
URL: https://github.com/apache/spark/pull/46506#discussion_r1595626768

## common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java: ##
@@ -169,8 +169,10 @@ public TransportClient createClient(String remoteHost, int remotePort, boolean f
       // this code was able to update things.
       TransportChannelHandler handler = cachedClient.getChannel().pipeline()
         .get(TransportChannelHandler.class);
-      synchronized (handler) {

Review Comment:
Your error message doesn't match the code. Maybe it came from your own fork?
```
Caused by: java.lang.NullPointerException
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:178)
```