[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2023-11-07 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783867#comment-17783867
 ] 

Duo Zhang commented on HADOOP-15327:


So do we have any plan on backporting this to branch-3.3? Netty 3 has a bunch 
of CVEs...

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636482#comment-17636482
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

aajisaka closed pull request #3259: HADOOP-15327. Upgrade MR ShuffleHandler to 
use Netty4
URL: https://github.com/apache/hadoop/pull/3259




> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636481#comment-17636481
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

aajisaka commented on PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1321546109

   Closing this PR as it's already merged into trunk. Thank you.




> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636076#comment-17636076
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

jojochuang commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1026831834


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -18,19 +18,20 @@
 
 package org.apache.hadoop.mapred;
 
+import static io.netty.buffer.Unpooled.wrappedBuffer;
+import static io.netty.handler.codec.http.HttpHeaderNames.CONTENT_TYPE;
+import static io.netty.handler.codec.http.HttpMethod.GET;
+import static io.netty.handler.codec.http.HttpResponseStatus.BAD_REQUEST;
+import static io.netty.handler.codec.http.HttpResponseStatus.FORBIDDEN;
+import static 
io.netty.handler.codec.http.HttpResponseStatus.INTERNAL_SERVER_ERROR;
+import static 
io.netty.handler.codec.http.HttpResponseStatus.METHOD_NOT_ALLOWED;
+import static io.netty.handler.codec.http.HttpResponseStatus.NOT_FOUND;
+import static io.netty.handler.codec.http.HttpResponseStatus.OK;
+import static io.netty.handler.codec.http.HttpResponseStatus.UNAUTHORIZED;
+import static io.netty.handler.codec.http.HttpVersion.HTTP_1_1;
+import static org.apache.hadoop.mapred.ShuffleHandler.NettyChannelHelper.*;

Review Comment:
   let's not use wildcard import



##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java:
##
@@ -315,9 +318,8 @@ protected void copyFromHost(MapHost host) throws 
IOException {
   return;
 }
 
-if(LOG.isDebugEnabled()) {
-  LOG.debug("Fetcher " + id + " going to fetch from " + host + " for: "
-+ maps);
+if (LOG.isDebugEnabled()) {
+  LOG.debug("Fetcher " + id + " going to fetch from " + host + " for: " + 
maps);

Review Comment:
   slf4j logger messages can be rewritten using parameterized logging format. 
But let's not worry about that now. This PR is already too big.



##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -1322,48 +1438,45 @@ protected void sendError(ChannelHandlerContext ctx, 
String msg,
   for (Map.Entry header : headers.entrySet()) {
 response.headers().set(header.getKey(), header.getValue());
   }
-  response.setContent(
-  ChannelBuffers.copiedBuffer(msg, CharsetUtil.UTF_8));
 
   // Close the connection as soon as the error message is sent.
-  
ctx.getChannel().write(response).addListener(ChannelFutureListener.CLOSE);
+  writeToChannelAndClose(ctx.channel(), response);
 }
 
 @Override
-public void exceptionCaught(ChannelHandlerContext ctx, ExceptionEvent e)
+public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause)
 throws Exception {
-  Channel ch = e.getChannel();
-  Throwable cause = e.getCause();
+  Channel ch = ctx.channel();
   if (cause instanceof TooLongFrameException) {
+LOG.trace("TooLongFrameException, channel id: {}", ch.id());
 sendError(ctx, BAD_REQUEST);
 return;
   } else if (cause instanceof IOException) {
 if (cause instanceof ClosedChannelException) {
-  LOG.debug("Ignoring closed channel error", cause);
+  LOG.debug("Ignoring closed channel error, channel id: " + ch.id(), 
cause);

Review Comment:
   Use parameterized logging, otherwise wrap it in a LOG.isDebugEnabled() check.



##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -904,65 +990,84 @@ private List splitMaps(List mapq) {
 }
 
 @Override
-public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) 
+public void channelActive(ChannelHandlerContext ctx)
 throws Exception {
-  super.channelOpen(ctx, evt);
-
-  if ((maxShuffleConnections > 0) && (accepted.size() >= 
maxShuffleConnections)) {
+  NettyChannelHelper.channelActive(ctx.channel());
+  int numConnections = activeConnections.incrementAndGet();
+  if ((maxShuffleConnections > 0) && (numConnections > 
maxShuffleConnections)) {

Review Comment:
   Perhaps these channel bookeeping is no longer needed given that channel size 
is limited.



##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -182,19 +184,29 @@ public class ShuffleHandler extends AuxiliaryService {
 
   public static final HttpResponseStatus TOO_MANY_REQ_STATUS =
   new HttpResponseStatus(429, "TOO MANY REQUESTS");
-  // This should kept in sync with 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17632150#comment-17632150
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

brumi1024 commented on PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1311363399

   Merged to trunk. Thanks @szilard-nemeth for the patch and @shuzirra @9uapaw 
@K0K0V0K for the review!




> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631957#comment-17631957
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

hadoop-yetus commented on PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1311003083

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  4s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 4 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 38s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m  9s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m  7s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  21m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 18s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   6m 52s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   5m 12s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   4m 39s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +0 :ok: |  spotbugs  |   0m 55s |  |  
branch/hadoop-client-modules/hadoop-client-runtime no spotbugs output file 
(spotbugsXml.xml)  |
   | +1 :green_heart: |  shadedclient  |  21m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   6m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 26s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  24m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m  4s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  22m  4s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/2/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   4m  1s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/2/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 21 new + 82 unchanged - 62 fixed = 103 total (was 
144)  |
   | +1 :green_heart: |  mvnsite  |   7m  8s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   5m 38s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   5m 58s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  spotbugs  |   4m 45s | 
[/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/2/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client.html)
 |  hadoop-mapreduce-project/hadoop-mapreduce-client generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | -1 :x: |  spotbugs  |   1m 44s | 
[/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/2/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html)
 |  
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   | +0 :ok: |  spotbugs  |   0m 56s |  |  
hadoop-client-modules/hadoop-client-runtime has no data from spotbugs  |
   | +1 :green_heart: |  shadedclient  |  21m 11s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 166m 16s |  |  hadoop-mapreduce-client in the 
patch passed.  |
   | +1 :green_heart: |  unit  |   8m 10s |  |  hadoop-mapreduce-client-core in 
the patch 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631652#comment-17631652
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

szilard-nemeth commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019105924


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/resources/log4j.properties:
##
@@ -17,3 +17,5 @@ log4j.threshold=ALL
 log4j.appender.stdout=org.apache.log4j.ConsoleAppender
 log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
 log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2} 
(%F:%M(%L)) - %m%n
+log4j.logger.io.netty=DEBUG
+log4j.logger.org.apache.hadoop.mapred=DEBUG

Review Comment:
   Thanks, changed the level to INFO.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631651#comment-17631651
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

szilard-nemeth commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019104663


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java:
##
@@ -668,34 +1357,61 @@ protected ChannelFuture 
sendMapOutput(ChannelHandlerContext ctx,
   conns[i].connect();
 }
 
-//Ensure first connections are okay
-conns[0].getInputStream();
-int rc = conns[0].getResponseCode();
-Assert.assertEquals(HttpURLConnection.HTTP_OK, rc);
-
-conns[1].getInputStream();
-rc = conns[1].getResponseCode();
-Assert.assertEquals(HttpURLConnection.HTTP_OK, rc);
-
-// This connection should be closed because it to above the limit
-try {
-  rc = conns[2].getResponseCode();
-  Assert.assertEquals("Expected a too-many-requests response code",
-  ShuffleHandler.TOO_MANY_REQ_STATUS.getCode(), rc);
-  long backoff = Long.valueOf(
-  conns[2].getHeaderField(ShuffleHandler.RETRY_AFTER_HEADER));
-  Assert.assertTrue("The backoff value cannot be negative.", backoff > 0);
-  conns[2].getInputStream();
-  Assert.fail("Expected an IOException");
-} catch (IOException ioe) {
-  LOG.info("Expected - connection should not be open");
-} catch (NumberFormatException ne) {
-  Assert.fail("Expected a numerical value for RETRY_AFTER header field");
-} catch (Exception e) {
-  Assert.fail("Expected a IOException");
+Map> mapOfConnections = Maps.newHashMap();
+for (HttpURLConnection conn : conns) {
+  try {
+conn.getInputStream();
+  } catch (IOException ioe) {
+LOG.info("Expected - connection should not be open");
+  } catch (NumberFormatException ne) {
+fail("Expected a numerical value for RETRY_AFTER header field");
+  } catch (Exception e) {
+fail("Expected a IOException");
+  }
+  int statusCode = conn.getResponseCode();
+  LOG.debug("Connection status code: {}", statusCode);
+  mapOfConnections.putIfAbsent(statusCode, new ArrayList<>());
+  List connectionList = 
mapOfConnections.get(statusCode);
+  connectionList.add(conn);
 }
+
+assertEquals(String.format("Expected only %s and %s response",
+OK_STATUS, ShuffleHandler.TOO_MANY_REQ_STATUS),
+Sets.newHashSet(
+HttpURLConnection.HTTP_OK,
+ShuffleHandler.TOO_MANY_REQ_STATUS.code()),
+mapOfConnections.keySet());
 
-shuffleHandler.stop(); 
+List successfulConnections =
+mapOfConnections.get(HttpURLConnection.HTTP_OK);
+assertEquals(String.format("Expected exactly %d requests " +
+"with %s response", maxAllowedConnections, OK_STATUS),
+maxAllowedConnections, successfulConnections.size());
+
+//Ensure exactly one connection is HTTP 429 (TOO MANY REQUESTS)
+List closedConnections =
+mapOfConnections.get(ShuffleHandler.TOO_MANY_REQ_STATUS.code());
+assertEquals(String.format("Expected exactly %d %s response",
+notAcceptedConnections, ShuffleHandler.TOO_MANY_REQ_STATUS),
+notAcceptedConnections, closedConnections.size());
+
+// This connection should be closed because it is above the maximum limit
+HttpURLConnection conn = closedConnections.get(0);
+assertEquals(String.format("Expected a %s response",
+ShuffleHandler.TOO_MANY_REQ_STATUS),
+ShuffleHandler.TOO_MANY_REQ_STATUS.code(), conn.getResponseCode());
+long backoff = Long.parseLong(
+conn.getHeaderField(ShuffleHandler.RETRY_AFTER_HEADER));
+assertTrue("The backoff value cannot be negative.", backoff > 0);
+
+shuffleHandler.stop();
+
+//It's okay to get a ClosedChannelException.
+//All other kinds of exceptions means something went wrong
+assertEquals("Should have no caught exceptions",
+Collections.emptyList(), failures.stream()
+.filter(f -> !(f instanceof ClosedChannelException))
+.collect(toList()));

Review Comment:
   Thanks for this comment. It's indeed a wrong thing that the InputStream is 
not closed.
   I'd suggest to create a separate jira to refactor this class, as it's 
currently in a horrible shape.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631643#comment-17631643
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

szilard-nemeth commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019075009


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -904,65 +990,84 @@ private List splitMaps(List mapq) {
 }
 
 @Override
-public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) 
+public void channelActive(ChannelHandlerContext ctx)
 throws Exception {
-  super.channelOpen(ctx, evt);
-
-  if ((maxShuffleConnections > 0) && (accepted.size() >= 
maxShuffleConnections)) {
+  NettyChannelHelper.channelActive(ctx.channel());
+  int numConnections = activeConnections.incrementAndGet();
+  if ((maxShuffleConnections > 0) && (numConnections > 
maxShuffleConnections)) {
 LOG.info(String.format("Current number of shuffle connections (%d) is 
" + 
-"greater than or equal to the max allowed shuffle connections 
(%d)", 
+"greater than the max allowed shuffle connections (%d)",
 accepted.size(), maxShuffleConnections));
 
-Map headers = new HashMap(1);
+Map headers = new HashMap<>(1);
 // notify fetchers to backoff for a while before closing the connection
 // if the shuffle connection limit is hit. Fetchers are expected to
 // handle this notification gracefully, that is, not treating this as a
 // fetch failure.
 headers.put(RETRY_AFTER_HEADER, String.valueOf(FETCH_RETRY_DELAY));
 sendError(ctx, "", TOO_MANY_REQ_STATUS, headers);
-return;
+  } else {
+super.channelActive(ctx);
+accepted.add(ctx.channel());
+LOG.debug("Added channel: {}, channel id: {}. Accepted number of 
connections={}",
+ctx.channel(), ctx.channel().id(), activeConnections.get());
   }
-  accepted.add(evt.getChannel());
 }
 
 @Override
-public void messageReceived(ChannelHandlerContext ctx, MessageEvent evt)
+public void channelInactive(ChannelHandlerContext ctx) throws Exception {
+  NettyChannelHelper.channelInactive(ctx.channel());
+  super.channelInactive(ctx);
+  int noOfConnections = activeConnections.decrementAndGet();
+  LOG.debug("New value of Accepted number of connections={}", 
noOfConnections);
+}
+
+@Override
+public void channelRead(ChannelHandlerContext ctx, Object msg)
 throws Exception {
-  HttpRequest request = (HttpRequest) evt.getMessage();
-  if (request.getMethod() != GET) {
-  sendError(ctx, METHOD_NOT_ALLOWED);
-  return;
+  Channel channel = ctx.channel();
+  LOG.trace("Executing channelRead, channel id: {}", channel.id());
+  HttpRequest request = (HttpRequest) msg;
+  LOG.debug("Received HTTP request: {}, channel id: {}", request, 
channel.id());
+  if (request.method() != GET) {
+sendError(ctx, METHOD_NOT_ALLOWED);
+return;
   }
   // Check whether the shuffle version is compatible
-  if (!ShuffleHeader.DEFAULT_HTTP_HEADER_NAME.equals(
-  request.headers() != null ?
-  request.headers().get(ShuffleHeader.HTTP_HEADER_NAME) : null)
-  || !ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION.equals(
-  request.headers() != null ?
-  request.headers()
-  .get(ShuffleHeader.HTTP_HEADER_VERSION) : null)) {
+  String shuffleVersion = ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION;
+  String httpHeaderName = ShuffleHeader.HTTP_HEADER_NAME;

Review Comment:
   Good catch, fixed. please check the new code.
   I agree with @brumi1024 , the request.headers null check is needed.
   
   Double-checked the original code: 
   It sent error with "Incompatible shuffle request version" message either if: 
   1. no request.headers() was present (equals to null)
   2. shuffleversion is not equal to ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION
   3. header name is not equal to ShuffleHeader.HTTP_HEADER_NAME





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631630#comment-17631630
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

szilard-nemeth commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019051278


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -291,36 +302,86 @@ public void operationComplete(ChannelFuture future) 
throws Exception {
 }
   }
 
+  static class NettyChannelHelper {
+static ChannelFuture writeToChannel(Channel ch, Object obj) {
+  LOG.debug("Writing {} to channel: {}", obj.getClass().getSimpleName(), 
ch.id());
+  return ch.writeAndFlush(obj);
+}
+
+static ChannelFuture writeToChannelAndClose(Channel ch, Object obj) {
+  return writeToChannel(ch, obj).addListener(ChannelFutureListener.CLOSE);
+}
+
+static ChannelFuture writeToChannelAndAddLastHttpContent(Channel ch, 
HttpResponse obj) {
+  writeToChannel(ch, obj);
+  return writeLastHttpContentToChannel(ch);
+}
+
+static ChannelFuture writeLastHttpContentToChannel(Channel ch) {
+  LOG.debug("Writing LastHttpContent, channel id: {}", ch.id());
+  return ch.writeAndFlush(LastHttpContent.EMPTY_LAST_CONTENT);
+}
+
+static ChannelFuture closeChannel(Channel ch) {
+  LOG.debug("Closing channel, channel id: {}", ch.id());
+  return ch.close();
+}
+
+static void closeChannels(ChannelGroup channelGroup) {
+  channelGroup.close().awaitUninterruptibly(10, TimeUnit.SECONDS);
+}
+
+public static ChannelFuture closeAsIdle(Channel channel, int timeout) {
+  LOG.debug("Closing channel as writer was idle for {} seconds", timeout);
+  return closeChannel(channel);
+}
+
+public static void channelActive(Channel ch) {
+  LOG.debug("Executing channelActive, channel id: {}", ch.id());
+}
+
+public static void channelInactive(Channel channel) {
+  LOG.debug("Executing channelInactive, channel id: {}", channel.id());
+}

Review Comment:
   This is just using the same name as framework method called 
'org.apache.hadoop.mapred.ShuffleHandler.Shuffle#channelActive / 
io.netty.channel.ChannelInboundHandlerAdapter#channelActive'.
   So I would keep the name as it is.
   channel renamed to ch as you suggested.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631628#comment-17631628
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

szilard-nemeth commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019049338


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -182,19 +184,29 @@ public class ShuffleHandler extends AuxiliaryService {
 
   public static final HttpResponseStatus TOO_MANY_REQ_STATUS =
   new HttpResponseStatus(429, "TOO MANY REQUESTS");
-  // This should kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT
+  // This should be kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT
   public static final long FETCH_RETRY_DELAY = 1000L;
   public static final String RETRY_AFTER_HEADER = "Retry-After";
+  static final String ENCODER_HANDLER_NAME = "encoder";
 
   private int port;
-  private ChannelFactory selector;
-  private final ChannelGroup accepted = new DefaultChannelGroup();
+  private EventLoopGroup bossGroup;
+  private EventLoopGroup workerGroup;
+  private ServerBootstrap bootstrap;
+  private Channel ch;
+  private final ChannelGroup accepted =
+  new DefaultChannelGroup(new DefaultEventExecutorGroup(5).next());

Review Comment:
   TBH, I don't know what's the right number for this as the previous 
implementation hidden the # of threads from the clients.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631627#comment-17631627
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

szilard-nemeth commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019048887


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -182,19 +184,29 @@ public class ShuffleHandler extends AuxiliaryService {
 
   public static final HttpResponseStatus TOO_MANY_REQ_STATUS =
   new HttpResponseStatus(429, "TOO MANY REQUESTS");
-  // This should kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT
+  // This should be kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT
   public static final long FETCH_RETRY_DELAY = 1000L;
   public static final String RETRY_AFTER_HEADER = "Retry-After";
+  static final String ENCODER_HANDLER_NAME = "encoder";
 
   private int port;
-  private ChannelFactory selector;
-  private final ChannelGroup accepted = new DefaultChannelGroup();
+  private EventLoopGroup bossGroup;
+  private EventLoopGroup workerGroup;
+  private ServerBootstrap bootstrap;
+  private Channel ch;
+  private final ChannelGroup accepted =
+  new DefaultChannelGroup(new DefaultEventExecutorGroup(5).next());

Review Comment:
   Yep, this is the number of threads, it can be checked easily with the 
parameters. I don't think there's any value to extract it to a constant as it's 
only being used once here.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631626#comment-17631626
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

szilard-nemeth commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019043683


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java:
##
@@ -72,19 +73,21 @@
   private static final String FETCH_RETRY_AFTER_HEADER = "Retry-After";
 
   protected final Reporter reporter;
-  private enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP,
+  @VisibleForTesting
+  public enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP,
 CONNECTION, WRONG_REDUCE}
-  
-  private final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors";
+
+  @VisibleForTesting
+  public final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors";

Review Comment:
   actually this is inconsistent in the whole class so I'd refrain to change 
the order of modifiers as I only changed private to public here.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631248#comment-17631248
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

9uapaw commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018349426


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java:
##
@@ -72,19 +73,21 @@
   private static final String FETCH_RETRY_AFTER_HEADER = "Retry-After";
 
   protected final Reporter reporter;
-  private enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP,
+  @VisibleForTesting
+  public enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP,
 CONNECTION, WRONG_REDUCE}
-  
-  private final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors";
+
+  @VisibleForTesting
+  public final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors";

Review Comment:
   I think lets keep it, it is not important.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631246#comment-17631246
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

K0K0V0K commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018348504


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java:
##
@@ -668,34 +1357,61 @@ protected ChannelFuture 
sendMapOutput(ChannelHandlerContext ctx,
   conns[i].connect();
 }
 
-//Ensure first connections are okay
-conns[0].getInputStream();
-int rc = conns[0].getResponseCode();
-Assert.assertEquals(HttpURLConnection.HTTP_OK, rc);
-
-conns[1].getInputStream();
-rc = conns[1].getResponseCode();
-Assert.assertEquals(HttpURLConnection.HTTP_OK, rc);
-
-// This connection should be closed because it to above the limit
-try {
-  rc = conns[2].getResponseCode();
-  Assert.assertEquals("Expected a too-many-requests response code",
-  ShuffleHandler.TOO_MANY_REQ_STATUS.getCode(), rc);
-  long backoff = Long.valueOf(
-  conns[2].getHeaderField(ShuffleHandler.RETRY_AFTER_HEADER));
-  Assert.assertTrue("The backoff value cannot be negative.", backoff > 0);
-  conns[2].getInputStream();
-  Assert.fail("Expected an IOException");
-} catch (IOException ioe) {
-  LOG.info("Expected - connection should not be open");
-} catch (NumberFormatException ne) {
-  Assert.fail("Expected a numerical value for RETRY_AFTER header field");
-} catch (Exception e) {
-  Assert.fail("Expected a IOException");
+Map> mapOfConnections = Maps.newHashMap();
+for (HttpURLConnection conn : conns) {
+  try {
+conn.getInputStream();
+  } catch (IOException ioe) {
+LOG.info("Expected - connection should not be open");
+  } catch (NumberFormatException ne) {
+fail("Expected a numerical value for RETRY_AFTER header field");
+  } catch (Exception e) {
+fail("Expected a IOException");
+  }
+  int statusCode = conn.getResponseCode();
+  LOG.debug("Connection status code: {}", statusCode);
+  mapOfConnections.putIfAbsent(statusCode, new ArrayList<>());
+  List connectionList = 
mapOfConnections.get(statusCode);
+  connectionList.add(conn);
 }
+
+assertEquals(String.format("Expected only %s and %s response",
+OK_STATUS, ShuffleHandler.TOO_MANY_REQ_STATUS),
+Sets.newHashSet(
+HttpURLConnection.HTTP_OK,
+ShuffleHandler.TOO_MANY_REQ_STATUS.code()),
+mapOfConnections.keySet());
 
-shuffleHandler.stop(); 
+List successfulConnections =
+mapOfConnections.get(HttpURLConnection.HTTP_OK);
+assertEquals(String.format("Expected exactly %d requests " +
+"with %s response", maxAllowedConnections, OK_STATUS),
+maxAllowedConnections, successfulConnections.size());
+
+//Ensure exactly one connection is HTTP 429 (TOO MANY REQUESTS)
+List closedConnections =
+mapOfConnections.get(ShuffleHandler.TOO_MANY_REQ_STATUS.code());
+assertEquals(String.format("Expected exactly %d %s response",
+notAcceptedConnections, ShuffleHandler.TOO_MANY_REQ_STATUS),
+notAcceptedConnections, closedConnections.size());
+
+// This connection should be closed because it is above the maximum limit
+HttpURLConnection conn = closedConnections.get(0);
+assertEquals(String.format("Expected a %s response",
+ShuffleHandler.TOO_MANY_REQ_STATUS),
+ShuffleHandler.TOO_MANY_REQ_STATUS.code(), conn.getResponseCode());
+long backoff = Long.parseLong(
+conn.getHeaderField(ShuffleHandler.RETRY_AFTER_HEADER));
+assertTrue("The backoff value cannot be negative.", backoff > 0);
+
+shuffleHandler.stop();
+
+//It's okay to get a ClosedChannelException.
+//All other kinds of exceptions means something went wrong
+assertEquals("Should have no caught exceptions",
+Collections.emptyList(), failures.stream()
+.filter(f -> !(f instanceof ClosedChannelException))
+.collect(toList()));

Review Comment:
   So on second try:
   
   Maybe here we can call a close method on the elements of the http connection 
lists,
   cause some of them will be still open even if this test is finished





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631247#comment-17631247
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

9uapaw commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018349080


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -182,19 +184,29 @@ public class ShuffleHandler extends AuxiliaryService {
 
   public static final HttpResponseStatus TOO_MANY_REQ_STATUS =
   new HttpResponseStatus(429, "TOO MANY REQUESTS");
-  // This should kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT
+  // This should be kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT
   public static final long FETCH_RETRY_DELAY = 1000L;
   public static final String RETRY_AFTER_HEADER = "Retry-After";
+  static final String ENCODER_HANDLER_NAME = "encoder";
 
   private int port;
-  private ChannelFactory selector;
-  private final ChannelGroup accepted = new DefaultChannelGroup();
+  private EventLoopGroup bossGroup;
+  private EventLoopGroup workerGroup;
+  private ServerBootstrap bootstrap;
+  private Channel ch;
+  private final ChannelGroup accepted =
+  new DefaultChannelGroup(new DefaultEventExecutorGroup(5).next());

Review Comment:
   This is just the number of threads the executor will have. Though it might 
be extracted for clear intentions.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631245#comment-17631245
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

9uapaw commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018347131


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -291,36 +302,86 @@ public void operationComplete(ChannelFuture future) 
throws Exception {
 }
   }
 
+  static class NettyChannelHelper {
+static ChannelFuture writeToChannel(Channel ch, Object obj) {
+  LOG.debug("Writing {} to channel: {}", obj.getClass().getSimpleName(), 
ch.id());
+  return ch.writeAndFlush(obj);
+}
+
+static ChannelFuture writeToChannelAndClose(Channel ch, Object obj) {
+  return writeToChannel(ch, obj).addListener(ChannelFutureListener.CLOSE);
+}
+
+static ChannelFuture writeToChannelAndAddLastHttpContent(Channel ch, 
HttpResponse obj) {
+  writeToChannel(ch, obj);
+  return writeLastHttpContentToChannel(ch);
+}
+
+static ChannelFuture writeLastHttpContentToChannel(Channel ch) {
+  LOG.debug("Writing LastHttpContent, channel id: {}", ch.id());
+  return ch.writeAndFlush(LastHttpContent.EMPTY_LAST_CONTENT);
+}
+
+static ChannelFuture closeChannel(Channel ch) {
+  LOG.debug("Closing channel, channel id: {}", ch.id());
+  return ch.close();
+}
+
+static void closeChannels(ChannelGroup channelGroup) {
+  channelGroup.close().awaitUninterruptibly(10, TimeUnit.SECONDS);
+}
+
+public static ChannelFuture closeAsIdle(Channel channel, int timeout) {
+  LOG.debug("Closing channel as writer was idle for {} seconds", timeout);
+  return closeChannel(channel);
+}
+
+public static void channelActive(Channel ch) {
+  LOG.debug("Executing channelActive, channel id: {}", ch.id());
+}
+
+public static void channelInactive(Channel channel) {
+  LOG.debug("Executing channelInactive, channel id: {}", channel.id());
+}

Review Comment:
   We need an other patch anyway, so we might as well fix these to be more 
consistent.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631244#comment-17631244
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

9uapaw commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018346414


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -904,65 +990,84 @@ private List splitMaps(List mapq) {
 }
 
 @Override
-public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) 
+public void channelActive(ChannelHandlerContext ctx)
 throws Exception {
-  super.channelOpen(ctx, evt);
-
-  if ((maxShuffleConnections > 0) && (accepted.size() >= 
maxShuffleConnections)) {
+  NettyChannelHelper.channelActive(ctx.channel());
+  int numConnections = activeConnections.incrementAndGet();
+  if ((maxShuffleConnections > 0) && (numConnections > 
maxShuffleConnections)) {
 LOG.info(String.format("Current number of shuffle connections (%d) is 
" + 
-"greater than or equal to the max allowed shuffle connections 
(%d)", 
+"greater than the max allowed shuffle connections (%d)",
 accepted.size(), maxShuffleConnections));
 
-Map headers = new HashMap(1);
+Map headers = new HashMap<>(1);
 // notify fetchers to backoff for a while before closing the connection
 // if the shuffle connection limit is hit. Fetchers are expected to
 // handle this notification gracefully, that is, not treating this as a
 // fetch failure.
 headers.put(RETRY_AFTER_HEADER, String.valueOf(FETCH_RETRY_DELAY));
 sendError(ctx, "", TOO_MANY_REQ_STATUS, headers);
-return;
+  } else {
+super.channelActive(ctx);
+accepted.add(ctx.channel());
+LOG.debug("Added channel: {}, channel id: {}. Accepted number of 
connections={}",
+ctx.channel(), ctx.channel().id(), activeConnections.get());
   }
-  accepted.add(evt.getChannel());
 }
 
 @Override
-public void messageReceived(ChannelHandlerContext ctx, MessageEvent evt)
+public void channelInactive(ChannelHandlerContext ctx) throws Exception {
+  NettyChannelHelper.channelInactive(ctx.channel());
+  super.channelInactive(ctx);
+  int noOfConnections = activeConnections.decrementAndGet();
+  LOG.debug("New value of Accepted number of connections={}", 
noOfConnections);
+}
+
+@Override
+public void channelRead(ChannelHandlerContext ctx, Object msg)
 throws Exception {
-  HttpRequest request = (HttpRequest) evt.getMessage();
-  if (request.getMethod() != GET) {
-  sendError(ctx, METHOD_NOT_ALLOWED);
-  return;
+  Channel channel = ctx.channel();
+  LOG.trace("Executing channelRead, channel id: {}", channel.id());
+  HttpRequest request = (HttpRequest) msg;
+  LOG.debug("Received HTTP request: {}, channel id: {}", request, 
channel.id());
+  if (request.method() != GET) {
+sendError(ctx, METHOD_NOT_ALLOWED);
+return;
   }
   // Check whether the shuffle version is compatible
-  if (!ShuffleHeader.DEFAULT_HTTP_HEADER_NAME.equals(
-  request.headers() != null ?
-  request.headers().get(ShuffleHeader.HTTP_HEADER_NAME) : null)
-  || !ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION.equals(
-  request.headers() != null ?
-  request.headers()
-  .get(ShuffleHeader.HTTP_HEADER_VERSION) : null)) {
+  String shuffleVersion = ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION;
+  String httpHeaderName = ShuffleHeader.HTTP_HEADER_NAME;

Review Comment:
   This is a valid concern, please fix it.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631229#comment-17631229
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

9uapaw commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018309405


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/resources/log4j.properties:
##
@@ -17,3 +17,5 @@ log4j.threshold=ALL
 log4j.appender.stdout=org.apache.log4j.ConsoleAppender
 log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
 log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2} 
(%F:%M(%L)) - %m%n
+log4j.logger.io.netty=DEBUG
+log4j.logger.org.apache.hadoop.mapred=DEBUG

Review Comment:
   +1 this is left here accidentally.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631196#comment-17631196
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

brumi1024 commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018265618


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -904,65 +990,84 @@ private List splitMaps(List mapq) {
 }
 
 @Override
-public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) 
+public void channelActive(ChannelHandlerContext ctx)
 throws Exception {
-  super.channelOpen(ctx, evt);
-
-  if ((maxShuffleConnections > 0) && (accepted.size() >= 
maxShuffleConnections)) {
+  NettyChannelHelper.channelActive(ctx.channel());
+  int numConnections = activeConnections.incrementAndGet();
+  if ((maxShuffleConnections > 0) && (numConnections > 
maxShuffleConnections)) {
 LOG.info(String.format("Current number of shuffle connections (%d) is 
" + 
-"greater than or equal to the max allowed shuffle connections 
(%d)", 
+"greater than the max allowed shuffle connections (%d)",
 accepted.size(), maxShuffleConnections));
 
-Map headers = new HashMap(1);
+Map headers = new HashMap<>(1);
 // notify fetchers to backoff for a while before closing the connection
 // if the shuffle connection limit is hit. Fetchers are expected to
 // handle this notification gracefully, that is, not treating this as a
 // fetch failure.
 headers.put(RETRY_AFTER_HEADER, String.valueOf(FETCH_RETRY_DELAY));
 sendError(ctx, "", TOO_MANY_REQ_STATUS, headers);
-return;
+  } else {
+super.channelActive(ctx);
+accepted.add(ctx.channel());
+LOG.debug("Added channel: {}, channel id: {}. Accepted number of 
connections={}",
+ctx.channel(), ctx.channel().id(), activeConnections.get());
   }
-  accepted.add(evt.getChannel());
 }
 
 @Override
-public void messageReceived(ChannelHandlerContext ctx, MessageEvent evt)
+public void channelInactive(ChannelHandlerContext ctx) throws Exception {
+  NettyChannelHelper.channelInactive(ctx.channel());
+  super.channelInactive(ctx);
+  int noOfConnections = activeConnections.decrementAndGet();
+  LOG.debug("New value of Accepted number of connections={}", 
noOfConnections);
+}
+
+@Override
+public void channelRead(ChannelHandlerContext ctx, Object msg)
 throws Exception {
-  HttpRequest request = (HttpRequest) evt.getMessage();
-  if (request.getMethod() != GET) {
-  sendError(ctx, METHOD_NOT_ALLOWED);
-  return;
+  Channel channel = ctx.channel();
+  LOG.trace("Executing channelRead, channel id: {}", channel.id());
+  HttpRequest request = (HttpRequest) msg;
+  LOG.debug("Received HTTP request: {}, channel id: {}", request, 
channel.id());
+  if (request.method() != GET) {
+sendError(ctx, METHOD_NOT_ALLOWED);
+return;
   }
   // Check whether the shuffle version is compatible
-  if (!ShuffleHeader.DEFAULT_HTTP_HEADER_NAME.equals(
-  request.headers() != null ?
-  request.headers().get(ShuffleHeader.HTTP_HEADER_NAME) : null)
-  || !ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION.equals(
-  request.headers() != null ?
-  request.headers()
-  .get(ShuffleHeader.HTTP_HEADER_VERSION) : null)) {
+  String shuffleVersion = ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION;
+  String httpHeaderName = ShuffleHeader.HTTP_HEADER_NAME;

Review Comment:
   +1 on the DEFAULT, however the if is needed because if the request.headers() 
returns a non-null value we need to check the version.



##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java:
##
@@ -72,19 +73,21 @@
   private static final String FETCH_RETRY_AFTER_HEADER = "Retry-After";
 
   protected final Reporter reporter;
-  private enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP,
+  @VisibleForTesting
+  public enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP,
 CONNECTION, WRONG_REDUCE}
-  
-  private final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors";
+
+  @VisibleForTesting
+  public final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors";

Review Comment:
   Changing this would mean that the others should be changed as well, and that 
could complicate the diff. Not sure if it's worth it.





> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631149#comment-17631149
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

K0K0V0K commented on code in PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018076397


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -182,19 +184,29 @@ public class ShuffleHandler extends AuxiliaryService {
 
   public static final HttpResponseStatus TOO_MANY_REQ_STATUS =
   new HttpResponseStatus(429, "TOO MANY REQUESTS");
-  // This should kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT
+  // This should be kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT
   public static final long FETCH_RETRY_DELAY = 1000L;
   public static final String RETRY_AFTER_HEADER = "Retry-After";
+  static final String ENCODER_HANDLER_NAME = "encoder";
 
   private int port;
-  private ChannelFactory selector;
-  private final ChannelGroup accepted = new DefaultChannelGroup();
+  private EventLoopGroup bossGroup;
+  private EventLoopGroup workerGroup;
+  private ServerBootstrap bootstrap;
+  private Channel ch;
+  private final ChannelGroup accepted =
+  new DefaultChannelGroup(new DefaultEventExecutorGroup(5).next());

Review Comment:
   Maybe there can be a line comment why we have to create 5 event executor



##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java:
##
@@ -72,19 +73,21 @@
   private static final String FETCH_RETRY_AFTER_HEADER = "Retry-After";
 
   protected final Reporter reporter;
-  private enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP,
+  @VisibleForTesting
+  public enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP,
 CONNECTION, WRONG_REDUCE}
-  
-  private final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors";
+
+  @VisibleForTesting
+  public final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors";

Review Comment:
   check style wont cry for public static final instead of public final static?



##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:
##
@@ -904,65 +990,84 @@ private List splitMaps(List mapq) {
 }
 
 @Override
-public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) 
+public void channelActive(ChannelHandlerContext ctx)
 throws Exception {
-  super.channelOpen(ctx, evt);
-
-  if ((maxShuffleConnections > 0) && (accepted.size() >= 
maxShuffleConnections)) {
+  NettyChannelHelper.channelActive(ctx.channel());
+  int numConnections = activeConnections.incrementAndGet();
+  if ((maxShuffleConnections > 0) && (numConnections > 
maxShuffleConnections)) {
 LOG.info(String.format("Current number of shuffle connections (%d) is 
" + 
-"greater than or equal to the max allowed shuffle connections 
(%d)", 
+"greater than the max allowed shuffle connections (%d)",
 accepted.size(), maxShuffleConnections));
 
-Map headers = new HashMap(1);
+Map headers = new HashMap<>(1);
 // notify fetchers to backoff for a while before closing the connection
 // if the shuffle connection limit is hit. Fetchers are expected to
 // handle this notification gracefully, that is, not treating this as a
 // fetch failure.
 headers.put(RETRY_AFTER_HEADER, String.valueOf(FETCH_RETRY_DELAY));
 sendError(ctx, "", TOO_MANY_REQ_STATUS, headers);
-return;
+  } else {
+super.channelActive(ctx);
+accepted.add(ctx.channel());
+LOG.debug("Added channel: {}, channel id: {}. Accepted number of 
connections={}",
+ctx.channel(), ctx.channel().id(), activeConnections.get());
   }
-  accepted.add(evt.getChannel());
 }
 
 @Override
-public void messageReceived(ChannelHandlerContext ctx, MessageEvent evt)
+public void channelInactive(ChannelHandlerContext ctx) throws Exception {
+  NettyChannelHelper.channelInactive(ctx.channel());
+  super.channelInactive(ctx);
+  int noOfConnections = activeConnections.decrementAndGet();
+  LOG.debug("New value of Accepted number of connections={}", 
noOfConnections);
+}
+
+@Override
+public void channelRead(ChannelHandlerContext ctx, Object msg)
 throws Exception {
-  HttpRequest request = (HttpRequest) evt.getMessage();
-  if (request.getMethod() != GET) {
-  sendError(ctx, METHOD_NOT_ALLOWED);
-  return;
+  Channel channel = ctx.channel();
+  LOG.trace("Executing channelRead, channel id: {}", channel.id());
+  

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-09 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631029#comment-17631029
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

Hi,
CC: [~gandras], [~shuzirra], [~weichiu]

Let me summarize what kind of testing I performed to make sure this change 
won't cause any regression.
The project that helped me very much with the testing is called 
[Hades|https://github.com/9uapaw/hades].
Kudos to [~gandras] for the initial work on the Hades project.
h1. TL;DR

*Hades was the framework I used to run my testcases.*
*All testcases are passed both with the trunk version of Hadoop (this is not 
surprising at all) and the deployed Hadoop version with my Netty upgrade patch.*
*See the attached test logs for details.*
*Also see the details below about what Hades is, how I tested, why I chose 
certain configurations for the testcases and many more..*
*Now I'm pretty confident that this patch won't break anything so I'm waiting 
for reviewers.*

h1. HADES IN GENERAL
h2. What is Hades?

Hades is a CLI tool, that shares a common interface between various Hadoop 
distributions. It is a collection of commands most frequently used by 
developers of Hadoop components.

Hades supports [Hadock|https://github.com/9uapaw/docker-hadoop-dev], [Cloduera 
Data Platform|https://www.cloudera.com/products/cloudera-data-platform.html] 
and standard upstream distribution.
h2. Basic features of Hades
 - Discover cluster: Stores where individual YARN / HDFS daemons are running.
 - Distribute files on certain nodes
 - Get config: Prints configuration of selected roles
 - Read logs of Hadoop roles
 - Restart: Restarting of certain roles
 - Run an application on the defined cluster
 - Status: Prints the status of the cluster
 - Update config: Update properties on a config file for selected roles
 - YARN specific commands
 - Run script: Runs user-defined custom scripts against the cluster.

h1. CLUSTER + HADES SETUP
h2. Run Hades with the Netty testing script against a cluster

First of all, I created a standard cluster and deployed Hadoop to the cluster.
Side note: Later on, all the installation that deploys Hadoop on the cluster 
could be part of Hades as well.

It's worth to be mentioned that I have a [PR with netty-related 
changes|https://github.com/9uapaw/hades/pull/6] against the Hades repo.
The branch of this PR is 
[this|https://github.com/szilard-nemeth/hades/tree/netty4-finish].

[Here are the 
instructions|https://github.com/szilard-nemeth/hades/blob/c16e95393ecf3e787e125c58d88ec2dc6a44b9e0/README.md#set-up-hades-on-a-cluster-and-run-the-netty-script]
 for how to set up and run Hades with the Netty testing script.
h1. THE NETTY TESTING SCRIPT

The Netty testing script [lives 
here|https://github.com/szilard-nemeth/hades/blob/netty4-finish/script/netty4.py].
As you can see on the code, quite a lot of work has been done to make sure the 
Netty 4 upgrade won't break anything and won't cause any regression as it is a 
crucial part of MapReduce.
h2. CONCEPTS
h3. Test context

Class: Netty4TestContext

The test context provides a way to encapsulate a base branch and a patch file 
(if any) applied on top of the base branch.
The context can enable or disable Maven compilation.
The context can also have certain ways to ensure that the compilation and the 
deployment of new jars were successful on the cluster.
Now, it can verify that certain logs are appearing in the daemon logs, making 
sure the deployment was okay.
The main purpose of the context is to compare it with results of other contexts.
For the Netty testing, it was evident that I need to make sure the trunk 
version and my version with the patch applied on top of trnuk works the same, 
e.g. there's no regression.
For this, I created the context.
h3. Testcase

Class: Netty4Testcase

In general, a testcase can have a name, a simple name, some config changes 
(dictionary of string keys, string values) and one MR application.
h3. Test config: Config options for running the tests

Class: Netty4TestConfig

These are the main config options for the Netty testing.
I won't go into too much details as I defined a ton of options along the way.
You can check all the config options 
[here|https://github.com/szilard-nemeth/hades/blob/c16e95393ecf3e787e125c58d88ec2dc6a44b9e0/script/netty4.py#L655-L687]
h3. Compiler

As mentioned above, Hades can compile Hadoop with Maven and replace the changed 
jars / Maven modules on the cluster.
This is particularly useful for the Netty testing as I was interested in 
whether the patch causes any issues so I had to compile Hadoop with my Netty 
patch, deploy the jars on the cluster and run all the tests and see all of them 
passing.
h2. TESTCASES

The testcases are defined with the help of the Netty4TestcasesBuilder. You can 
find all the testcases 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17629226#comment-17629226
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

hadoop-yetus commented on PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1304333241

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  3s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 4 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 35s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m  0s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 25s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  20m 57s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m  5s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   7m  1s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   5m 32s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   5m 49s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +0 :ok: |  spotbugs  |   0m 51s |  |  
branch/hadoop-client-modules/hadoop-client-runtime no spotbugs output file 
(spotbugsXml.xml)  |
   | +1 :green_heart: |  shadedclient  |  21m 14s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   6m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 38s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  22m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 58s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 58s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/1/artifact/out/blanks-eol.txt)
 |  The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   3m 58s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/1/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 21 new + 82 unchanged - 62 fixed = 103 total (was 
144)  |
   | +1 :green_heart: |  mvnsite  |   6m 50s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   5m 42s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   5m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  spotbugs  |   4m 41s | 
[/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/1/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client.html)
 |  hadoop-mapreduce-project/hadoop-mapreduce-client generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | -1 :x: |  spotbugs  |   1m 47s | 
[/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/1/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html)
 |  
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   | +0 :ok: |  spotbugs  |   1m  6s |  |  
hadoop-client-modules/hadoop-client-runtime has no data from spotbugs  |
   | +1 :green_heart: |  shadedclient  |  21m 25s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 167m 40s |  |  hadoop-mapreduce-client in the 
patch passed.  |
   | +1 :green_heart: |  unit  |   7m 41s |  |  hadoop-mapreduce-client-core in 
the patch 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17629001#comment-17629001
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

szilard-nemeth opened a new pull request, #3259:
URL: https://github.com/apache/hadoop/pull/3259

   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.)
   For more details, please see 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
   




> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628985#comment-17628985
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

szilard-nemeth closed pull request #3259: HADOOP-15327. Upgrade MR 
ShuffleHandler to use Netty4
URL: https://github.com/apache/hadoop/pull/3259




> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-09-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600654#comment-17600654
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

ashutoshcipher commented on PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1237792070

   Thanks @szilard-nemeth for working on this. Looks good to me. Looking 
forward for the final patch.




> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2022-08-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582976#comment-17582976
 ] 

ASF GitHub Bot commented on HADOOP-15327:
-

9uapaw commented on PR #3259:
URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1222360570

   Latest change looks good to me, the shading issue is gone now. Once spotbugs 
and checkstyle errors are fixed, we are good to go.




> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-10-31 Thread Jason Wen (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436592#comment-17436592
 ] 

Jason Wen commented on HADOOP-15327:


Hi [~snemeth], I can help to backport the fix to 3.2.x branch to see if it has 
same maven shading issue. If same issue exists I can help to fix the shading 
issue.

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-10-20 Thread Ayush Pal (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17431113#comment-17431113
 ] 

Ayush Pal commented on HADOOP-15327:


Hi [~snemeth] ,

I referred PR provided in this jira for netty upgrade for fixing Netty CVE 
vulnerability and did some fixes for maven shading issues. I am happy to 
contribute the patch to open source. let me know your thoughts.

Thanks

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-10-18 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17430075#comment-17430075
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

Hi [~zhenshan.wen] ,

I'm planning to fix the maven shading issue in the coming weeks, as soon as 
possible.

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-10-06 Thread Jason Wen (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425285#comment-17425285
 ] 

Jason Wen commented on HADOOP-15327:


What's the current status of the PR? Will the fix be included in any upcoming 
release?

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-08-04 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17393023#comment-17393023
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

Converted to a PR, no more patches will be uploaded to this jira.

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, 
> HADOOP-15327.005.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, 
> testfailure-testMapFileAccess-emptyresponse.zip, 
> testfailure-testReduceFromPartialMem.zip
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-07-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387909#comment-17387909
 ] 

Hadoop QA commented on HADOOP-15327:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 
41s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 4 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
13s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
17s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 
36s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
33s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
30s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
55s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
25m 48s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
28s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
10s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 40m 
45s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
28s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no 
spotbugs output file (spotbugsXml.xml) {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
36s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 25m 
13s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 25m 13s{color} 
| 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/216/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt{color}
 | {color:red} root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 1912 unchanged - 0 
fixed = 1913 total (was 1912) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
33s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 21m 33s{color} 
| 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/216/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt{color}
 | {color:red} root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with 
JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 1790 
unchanged - 0 fixed = 1791 total (was 1790) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-07-26 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387643#comment-17387643
 ] 

Hadoop QA commented on HADOOP-15327:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 27m  
2s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 4 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
28s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 
16s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 35m 
52s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 
49s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  5m 
10s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  6m 
31s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
31m 18s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
51s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
21s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 50m 
54s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
42s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no 
spotbugs output file (spotbugsXml.xml) {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
12s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 25m 
23s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 25m 23s{color} 
| 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/215/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt{color}
 | {color:red} root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 1912 unchanged - 0 
fixed = 1913 total (was 1912) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
34s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 21m 34s{color} 
| 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/215/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt{color}
 | {color:red} root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with 
JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 1790 
unchanged - 0 fixed = 1791 total (was 1790) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-29 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17371468#comment-17371468
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

Just uploaded a new patch: [^HADOOP-15327.005.patch]

I have been (almost) exclusively working on this since my last comment and 
there are a couple of things to add again.
 The last commit that was discussed is this: 
[https://github.com/szilard-nemeth/hadoop/commit/f149be8de28baafc64eed1c47e788f5beb215e62]
 Let me explain what've changed commit by commit. I will skip a bunch of 
trivial ones like code cleanup, added comments and the like.
 *I will cover the test failures surfaced by Jenkins build / unit test results:*
 - Build #1
 - Build #2

*1. TestShuffleHandler: Introduced InputStreamReadResult that stores response 
as string + total bytes read: 
[https://github.com/szilard-nemeth/hadoop/commit/a57de573c97fe12c9071dd3450df8f450bf075ea]*
 Here, I added a new class called 'InputStreamReadResult' that stores the bytes 
read (byte[]) and the number of bytes read from a response InputStream.
 This improves the way testcases can assert on these data.

*2. TestShuffleHandler: Use DEFAULT_PORT for all shuffle handler port configs: 
[https://github.com/szilard-nemeth/hadoop/commit/78b1166866c85cab6860407f8fe4a4ddc3168fae]*
 It was a common pitfall while debugging that the tests had to modified to use 
a certain fixed port. Here, I added a constant to store the port number so when 
I had to debug I only needed to change it in one single place.

*3. Create class: TestExecution: Configure proxy, keep alive connection 
timeout: 
[https://github.com/szilard-nemeth/hadoop/commit/fa5bb32ae4eb737077a165b3b1fba5069c982243]*
 In order to debug the HTTP responses, I found it convenient to add a helper 
class that is responsible for the following:
 - Configuring the HTTP connections, use a proxy when required
 - Increase the keepalive timeout when using DEBUG mode

TEST_EXECUTION is a static instance of TestExecution, initialized with a JUnit 
test setup method.
 There are 2 flags that control the behaviour of this object:
{code:java}
//Control test execution properties with these flags
private static final boolean DEBUG_MODE = true;
//If this is set to true and proxy server is not running, tests will fail!
private static final boolean USE_PROXY = false; 
{code}
The only difference on top of these is in the code of testcases: They create 
all HTTP connections with:
{code:java}
TEST_EXECUTION.openConnection(url)
{code}
*4. TestExecution: Configure port: 
[https://github.com/szilard-nemeth/hadoop/commit/4a5c035695be1099bff4a633cd605b9f8146d841]*
 One addition to 3. is to include the port used by ShuffleHandler in the 
TestExecution object. When using DEBUG mode, the port is fixed to a value, 
otherwise it is set to 0, meaning that the port will be dynamically chosen.

*5. Add logging response encoder to TestShuffleHandler.testMapFileAccess: 
[https://github.com/szilard-nemeth/hadoop/commit/64686b47d2fed4e923c1c9c0169a06aba3e339be]*
 While debugging TestShuffleHandler#testMapFileAccess, just realized that I 
forgot to add the LoggingHttpResponseEncoder to the pipeline. The most trivial 
way was to modify the pipeline when the channel is activated.

*6. TestShuffleHandler.testMapFileAccess: Modify to be able to run it locally + 
reproduce jenkins UT failure: 
[https://github.com/szilard-nemeth/hadoop/commit/bb0fcbbd7dcbe3fa7efd1b6a8c2eb8a9055c5ecd]*
 Here's where the fun begins. The problem with 
TestShuffleHandler#testMapFileAccess is that it requires the NativeIO module:
{code:java}
// This will run only in NativeIO is enabled as SecureIOUtils need it
assumeTrue(NativeIO.isAvailable());
{code}
I tried to compile the Hadoop Native libraries on my Mac according to these 
resources:
 - Native libraries: 
[https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/NativeLibraries.html]
 - Followed this guide: 
[https://dev.to/zejnilovic/building-hadoop-native-libraries-on-mac-in-2019-1iee]

Unfortunately, I still had compilation errors so I eventually gave up and 
tweaked the test to be able to run it locally. This wasn't such a complex 
thing, I don't think it's worth to go into the details, had to comment out some 
test code that used the Native library and that was all.
 From the Jenkins results I had this:
{code:java}
[INFO] --- maven-surefire-plugin:3.0.0-M1:test (default-test) @ 
hadoop-mapreduce-client-shuffle ---
[INFO] 
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.mapred.TestFadvisedFileRegion
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.493 s 
- in org.apache.hadoop.mapred.TestFadvisedFileRegion
[INFO] Running org.apache.hadoop.mapred.TestShuffleHandler
[ERROR] Tests run: 15, Failures: 1, Errors: 0, 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-23 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367912#comment-17367912
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

Hey [~weichiu],
Thanks for putting the exceprt here. This could be fixed in parallel, I would 
be glad if you could point me to the config that needs to be changed.
Currently, I'm working on the test issues produced by the build that ran 
against patch003: 
hadoop.mapred.TestReduceFetchFromPartialMem
hadoop.mapred.TestReduceFetch
There are jiras related to these tests but checked the logs and saw very 
suspicious things and it pointed me to a code defect.
I will upload a next patch soon along with explanation of what has been changed 
since patch004.
Hopefully, this can be the last one and I can finally start testing on a 
cluster. I wouldn't expect any production issues (fingers crossed) as test 
coverage is quite good and while I have been fixing the tests, I gained a lot 
of code knowledge, almost being familiar with the ShuffleHandler inside and out.

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log
>
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-22 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367842#comment-17367842
 ] 

Wei-Chiu Chuang commented on HADOOP-15327:
--

This is an excerpt of the error:

{noformat}
  Found in:
org.apache.hadoop:hadoop-client-minicluster:jar:3.4.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-client-runtime:jar:3.4.0-SNAPSHOT:compile
  Duplicate classes:

org/apache/hadoop/shaded/io/netty/handler/codec/mqtt/MqttSubscriptionOption$RetainedHandlingPolicy.class
org/apache/hadoop/shaded/io/netty/buffer/ByteBufProcessor$3.class

org/apache/hadoop/shaded/io/netty/handler/codec/http2/Http2Headers$PseudoHeaderName.class

org/apache/hadoop/shaded/io/netty/handler/codec/http2/StreamBufferingEncoder$Http2GoAwayException.class

org/apache/hadoop/shaded/io/netty/channel/epoll/AbstractEpollChannel$AbstractEpollUnsafe$3.class
org/apache/hadoop/shaded/io/netty/handler/codec/spdy/SpdySession.class
org/apache/hadoop/shaded/io/netty/channel/udt/UdtServerChannelConfig.class
org/apache/hadoop/shaded/io/netty/handler/codec/http2/Http2Flags.class
{noformat}

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, HADOOP-15327.004.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log
>
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-21 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366722#comment-17366722
 ] 

Hadoop QA commented on HADOOP-15327:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 22m 
29s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 3 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
51s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
50s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 
49s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
15s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
55s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
55s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
22m 45s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
58s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 32m 
12s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
29s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no 
spotbugs output file (spotbugsXml.xml) {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
12s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
59s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m 
59s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
12s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m 
12s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 56s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/203/artifact/out/diff-checkstyle-root.txt{color}
 | {color:orange} root: The patch generated 116 new + 83 unchanged - 7 fixed = 
199 total (was 90) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
53s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-15 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363928#comment-17363928
 ] 

Hadoop QA commented on HADOOP-15327:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
28s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 3 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
41s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
34s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 
39s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 
57s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
 7s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
5s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
23m 42s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
46s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
28s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 36m  
7s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
35s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no 
spotbugs output file (spotbugsXml.xml) {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
44s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
47s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 27m 
47s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 24m 
44s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 24m 
44s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
6m  1s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/200/artifact/out/diff-checkstyle-root.txt{color}
 | {color:orange} root: The patch generated 116 new + 83 unchanged - 7 fixed = 
199 total (was 90) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
26s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-12 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362456#comment-17362456
 ] 

Hadoop QA commented on HADOOP-15327:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
25s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 3 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
18s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
15s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 
39s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
20s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
 2s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
58s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
22m 43s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
12s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
0s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 32m 
17s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
28s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no 
spotbugs output file (spotbugsXml.xml) {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
11s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
57s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m 
57s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
19s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m 
19s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 57s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/199/artifact/out/diff-checkstyle-root.txt{color}
 | {color:orange} root: The patch generated 93 new + 83 unchanged - 7 fixed = 
176 total (was 90) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
55s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-12 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362368#comment-17362368
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

*Remaining TODO items that I can make progress with:*
 - Fix failing unit tests
 - Testing on cluster

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, 
> HADOOP-15327.003.patch, 
> getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log
>
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-12 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362367#comment-17362367
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

The latest patch contains commits from this branch: 
[https://github.com/szilard-nemeth/hadoop/commits/HADOOP-15327-snemeth]
 There are a couple of commits so I would approach this by explaning the 
reasons behind each change in the commits.
 Not all commits are listed, I left out a few trivial ones.
 Unfortunately, this task was a bit tricky as everytime I touched something in 
the test, I just found another bug or weird behaviour so it took a great deal 
of time to solve and discover everything.

*1. ShuffleHandler: ch.isOpen() --> ch.isActive(): 
[https://github.com/szilard-nemeth/hadoop/commit/e703adb57f66da8579baa26257ca9aaed2bf1db5]*
 This was already mentioned with my previous lenghtier comment.

*2. TestShuffleHandler: Fix mocking in testSendMapCount + replace ch.write() 
with ch.writeAndFlush(): 
[https://github.com/szilard-nemeth/hadoop/commit/07fbfee5cae85e8e374b53c303e794c19c620efc]*
 This is about 2 things:
 - Replacing channel.write calls with channel.writeAndFlush
 - Fixing bad mocking in 
org.apache.hadoop.mapred.TestShuffleHandler#testSendMapCount

*3. TestShuffleHandler.testMaxConnections: Rewrite test + production code: 
accepted connection handling: 
[https://github.com/szilard-nemeth/hadoop/commit/def0059982ef8f0e2f19d385b1a1fcdca8639f9d]*
 *Changes in production code:*
 - ShuffleHandler#channelActive added the channel to the channel group (field 
called 'accepted') before the if statement that enforces the maximum number of 
open connections. This was the old, wrong piece of code:
{code:java}
 super.channelActive(ctx);
  LOG.debug("accepted connections={}", accepted.size());

  if ((maxShuffleConnections > 0) && (accepted.size() >= 
maxShuffleConnections)) {
{code}

 - Also, counting the number of open channels with the channel group was 
unreliable so I introduced a new AtomicInteger field called 
'acceptedConnections' to track the open channels / connections.
 - There was another issue: When the channels were accepted, the counter of 
open channels was increased but when channels were inactivated I could not see 
any code that would have maintained (decremented) the value.
 This was mitigated by adding 
org.apache.hadoop.mapred.ShuffleHandler.Shuffle#channelInactive that logs the 
channel inactivated event and decreases the open connections counter:
{code:java}
@Override
public void channelInactive(ChannelHandlerContext ctx) throws Exception {
  super.channelInactive(ctx);
  acceptedConnections.decrementAndGet();
  LOG.debug("New value of Accepted number of connections={}",
  acceptedConnections.get());
}
{code}

*Changes in test code:*
 - org.apache.hadoop.mapred.TestShuffleHandler#testMaxConnections: Fixed the 
testcase, the issue was pointed out correctly by [~weichiu] : The connections 
are accepted in parallel so we should not rely on their order in the test. The 
way I rewritten this is that I introduced a map to group HttpURLConnection 
objects by their HTTP response code.
 Then I check if we only have 200 OK and 429 TOO MANY REQUESTS, and check if 
the number of 200 OK connections is 2 and there's only one unaccepted 
connection.

*4. increase netty version to 4.1.65.Final: 
[https://github.com/szilard-nemeth/hadoop/commit/4f4589063b579a93389b1e188c29bd895ae507fc]*
 This is a simple commit to increase the Netty version to the latest stable 4.x 
version.
 See this page: [https://netty.io/downloads.html]
 It states: "netty-4.1.65.Final.tar.gz ‐ 19-May-2021 (Stable, Recommended)"

*5. ShuffleHandler: Fix keepalive test + writing HTTP response properly to 
channel: 
[https://github.com/szilard-nemeth/hadoop/commit/1aad4eaace28cfff4a9a9152f7535d70cc6e3734]*
 This is where things get more interesting. There was a testcase called 
org.apache.hadoop.mapred.TestShuffleHandler#testKeepAlive that caught an issue 
that came up because Netty 4.x handles HTTP responses written to the same 
channel differently than Netty 3.x.
 See details below.

Production code changes:
 - Added some logs to be able to track what happened when utilizing HTTP 
Connection Keep-alive.
 - Added a ChannelOutboundHandlerAdapter that handles exceptions that happens 
during outbound message construction. This is by default not logged by Netty 
and I only found this trick to catch these events:
{code:java}
  pipeline.addLast("outboundExcHandler", new 
ChannelOutboundHandlerAdapter() {
@Override
public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise 
promise) throws Exception {
  promise.addListener(ChannelFutureListener.FIRE_EXCEPTION_ON_FAILURE);
  super.write(ctx, msg, promise);
}
  });
{code}
This solution is described here: 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-12 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362339#comment-17362339
 ] 

Hadoop QA commented on HADOOP-15327:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
22s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 3 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 12m 
32s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
24s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 
40s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
13s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
 1s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
59s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
22m 51s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
12s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
57s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 32m 
22s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
31s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no 
spotbugs output file (spotbugsXml.xml) {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
13s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m  
6s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 22m  
6s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
19s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m 
19s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 58s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/197/artifact/out/diff-checkstyle-root.txt{color}
 | {color:orange} root: The patch generated 95 new + 83 unchanged - 7 fixed = 
178 total (was 90) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
57s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-11 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362225#comment-17362225
 ] 

Hadoop QA commented on HADOOP-15327:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 22m  
4s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 3 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 12m 
46s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
 3s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 
34s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
17s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
56s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
56s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
22m 38s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
14s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
58s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 32m  
7s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
31s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no 
spotbugs output file (spotbugsXml.xml) {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
11s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m  
3s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 22m  3s{color} 
| 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/196/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt{color}
 | {color:red} root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 1981 unchanged - 0 
fixed = 1982 total (was 1981) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
19s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 19m 19s{color} 
| 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/196/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt{color}
 | {color:red} root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with 
JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 1858 
unchanged - 0 fixed = 1859 total (was 1858) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | 

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-11 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362029#comment-17362029
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

Thanks [~weichiu] for your help.
 Added a preliminary patch to kick-off Jenkins.
 Haven't touch shaded config so I'm expecting a Maven error from Jenkins.
 Referring back to your comment here: I'm quite a beginner with shading. Can 
you or anyone else help me out with this?

*Remaining TODO items:*
 - Testing on cluster
 - Adding explanation comment for the new code changes.
 So a more lengthy comment will follow this :)

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: HADOOP-15327.001.patch
>
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-08 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17359058#comment-17359058
 ] 

Wei-Chiu Chuang commented on HADOOP-15327:
--

Thanks...

1. Change category #5, closing channels with:

future.channel().closeFuture().awaitUninterruptibly()
Could you please tell me how did you find out to close a channel like this? As 
mentioned above, I couldn't find any resource to justify this.

Channel.close() in netty4 takes a Future parameter. 

But In fact, this is a bad practice according to the official doc:
https://netty.io/4.0/api/io/netty/channel/ChannelFuture.html
"Do not call await() inside ChannelHandler"


> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-07 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358861#comment-17358861
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

Let me list the differences introduced because of the migration from Netty 3.x 
to 4.x.
 There is a migration guide that mentions most (but not all) of the changes: 
[https://netty.io/wiki/new-and-noteworthy-in-4.0.html]
 Please note that the below code changes are based on Wei-Chiu's branch: 
[https://github.com/jojochuang/hadoop/commits/shuffle_handler_netty4]
h2. CHANGES IN ShuffleHandler
h3. *I will list the changes mostly from ShuffleHandler as it covers almost all 
type of changes in other classes as well.*

*In TestShuffleHandler, the test code was changed by any of the justifications 
listed down below.*
h3. Change category #1: General API changes / non-configuration getters:

Details: 
[https://netty.io/wiki/new-and-noteworthy-in-4.0.html#general-api-changes]
{quote}Non-configuration getters have no get- prefix anymore. (e.g. 
Channel.getRemoteAddress() → Channel.remoteAddress())
 Boolean properties are still prefixed with is- to avoid confusion (e.g. 
'empty' is both an adjective and a verb, so empty() can have two meanings.)
{quote}
I'm just listing all the changes without additional context (in which method 
they were changed) separated by three dots, as they are simply method renamings:
{code:java}
-future.getChannel().close();
+future.channel().closeFuture().awaitUninterruptibly();
...
...
-  ChannelPipeline pipeline = future.getChannel().getPipeline();
+  ChannelPipeline pipeline = future.channel().pipeline();
...
...
-port = ((InetSocketAddress)ch.getLocalAddress()).getPort();
+port = ((InetSocketAddress)ch.localAddress()).getPort();
...
...
-  if (e.getState() == IdleState.WRITER_IDLE && enabledTimeout) {
-e.getChannel().close();
+  if (e.state() == IdleState.WRITER_IDLE && enabledTimeout) {
+ctx.channel().close();
...
...
-  accepted.add(evt.getChannel());
+  accepted.add(ctx.channel());
...
...
-new QueryStringDecoder(request.getUri()).getParameters();
+new QueryStringDecoder(request.getUri()).parameters(); //getUri was 
not changed, see this later
...
...
-  Channel ch = evt.getChannel();
-  ChannelPipeline pipeline = ch.getPipeline();
+  Channel ch = ctx.channel();
+  ChannelPipeline pipeline = ch.pipeline();
...
...
-  reduceContext.getCtx().getChannel(),
+  reduceContext.getCtx().channel(),
...
...
-  if (ch.getPipeline().get(SslHandler.class) == null) {
+  if (ch.pipeline().get(SslHandler.class) == null) {
...
...
-  Channel ch = evt.getChannel();
-  ChannelPipeline pipeline = ch.getPipeline();
+  Channel ch = ctx.channel();
+  ChannelPipeline pipeline = ch.pipeline();
...
...
-  
ctx.getChannel().write(response).addListener(ChannelFutureListener.CLOSE);
+  ctx.channel().write(response).addListener(ChannelFutureListener.CLOSE);
...
...
-  Channel ch = e.getChannel();
-  Throwable cause = e.getCause();
+  Channel ch = ctx.channel();
{code}
h3. Change category #2: General API changes / Method signature changes.

*2.1: SimpleChannelUpstreamHandler was renamed to ChannelInboundHandlerAdapter.*
 
[https://netty.io/wiki/new-and-noteworthy-in-4.0.html#upstream--inbound-downstream--outbound]
{quote}The terms 'upstream' and 'downstream' were pretty confusing to 
beginners. 4.0 uses 'inbound' and 'outbound' wherever possible.
{quote}
{code:java}
-  class Shuffle extends SimpleChannelUpstreamHandler {
+  @ChannelHandler.Sharable
+  class Shuffle extends ChannelInboundHandlerAdapter {
{code}
*2.2: Simplifed channel state model: 
[https://netty.io/wiki/new-and-noteworthy-in-4.0.html#simplified-channel-state-model]*
{quote}channelOpen, channelBound, and channelConnected have been merged to 
channelActive. channelDisconnected, channelUnbound, and channelClosed have been 
merged to channelInactive. Likewise, Channel.isBound() and isConnected() have 
been merged to isActive().
{quote}
*2.2.1 Changes in class: Shuffle*
{code:java}
 @Override
-public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) 
+public void channelActive(ChannelHandlerContext ctx)
 throws Exception {
-  super.channelOpen(ctx, evt);
+  super.channelActive(ctx);
{code}
*2.2.2 Changes in 
org.apache.hadoop.mapred.ShuffleHandler.Shuffle#exceptionCaught:* 
 Quoting the change again:
{quote}channelOpen, channelBound, and channelConnected have been merged to 
channelActive. channelDisconnected, channelUnbound, and channelClosed have been 
merged to channelInactive. Likewise, Channel.isBound() and isConnected() have 
been merged to isActive().
{quote}
{code:java}
   LOG.error("Shuffle error: ", cause);
-  if (ch.isConnected()) {
-LOG.error("Shuffle error " + e);
+   

[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-04 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357627#comment-17357627
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

Thanks [~weichiu],
Will use skipShade temporarily then will check how to resolve the shading issue 
once all code issues are fixed and in place.

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-03 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356433#comment-17356433
 ] 

Wei-Chiu Chuang commented on HADOOP-15327:
--

Thanks for reporting the issue... I usually skip shading (-DskipShade) to build 
during development so didn't notice it. We probably need to update our shading 
rule to exclude netty (shading and relocating netty is a delicate art)

In addition to MR unit tests, if you can stand up a cluster, run the MR 
shuffler with the netty leak detection option (java 
-Dio.netty.leakDetection.level=advanced) enabled, we'll have more confidence in 
it. https://github.com/apache/hadoop/pull/2832#issuecomment-818602960

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-03 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356401#comment-17356401
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

Hmm, can't attach the file, probably it's a permission issue. I don't want to 
paste 2000+ lines here.
Uploaded the file to my personal Google Drive: 
https://drive.google.com/file/d/1-ovH8snqTS73oLNsxtwgrvaDVynOBIP7/view?usp=sharing

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Szilard Nemeth
>Priority: Major
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-06-03 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356377#comment-17356377
 ] 

Szilard Nemeth commented on HADOOP-15327:
-

Hi [~weichiu],

I took over this Jira as it was unassigned, I hope it's not a problem.

The plan is to continue the work based on your code changes until all the UT 
issues are fixed.

Did you have any plan for testing? Is it enough to test basic MR-based jobs 
with shuffling?

One more thing: The latest version on your branch 
([https://github.com/jojochuang/hadoop/commits/shuffle_handler_netty4)] 
produced Maven enforcer issues for me.

The command I'm using to build Hadoop (from its root): 
{code}
mvn clean install -Pdist -DskipTests  -Dmaven.javadoc.skip=true -e | tee 
/tmp/maven_out
{code}
Please see the attached output file. 

Did you encounter similar build issues? I can see the latest commit is form 
March 2021 so it's not that old so I don't assume the build system + enforcing 
rules were changed.
 

 

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Priority: Major
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4

2021-03-31 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312744#comment-17312744
 ] 

Wei-Chiu Chuang commented on HADOOP-15327:
--

Here's my WIP patch for this migration:

https://github.com/jojochuang/hadoop/commits/shuffle_handler_netty4

The TestShuffleHandler is having a few failures.

The testMaxConnections UT failure is apparently caused by either a bad test or 
a behavior change in netty. The test expects connections to be received 
sequentially but connections are actually received in parallel. No order can be 
assumed. 

Other UTs in TestShuffleHandler are caused by socket timeouts. The clients 
receive the header of the response from server, but not the content. I 
suspected it was caused by Channel.write() API where it doesn't automatically 
reflush after write in netty4. I switched to writeAndFlush() but problems still 
persist.

> Upgrade MR ShuffleHandler to use Netty4
> ---
>
> Key: HADOOP-15327
> URL: https://issues.apache.org/jira/browse/HADOOP-15327
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Wei-Chiu Chuang
>Priority: Major
>
> This way, we can remove the dependencies on the netty3 (jboss.netty)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org