[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783867#comment-17783867 ] Duo Zhang commented on HADOOP-15327: So do we have any plan on backporting this to branch-3.3? Netty 3 has a bunch of CVEs... > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636482#comment-17636482 ] ASF GitHub Bot commented on HADOOP-15327: - aajisaka closed pull request #3259: HADOOP-15327. Upgrade MR ShuffleHandler to use Netty4 URL: https://github.com/apache/hadoop/pull/3259 > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636481#comment-17636481 ] ASF GitHub Bot commented on HADOOP-15327: - aajisaka commented on PR #3259: URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1321546109 Closing this PR as it's already merged into trunk. Thank you. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636076#comment-17636076 ] ASF GitHub Bot commented on HADOOP-15327: - jojochuang commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1026831834 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -18,19 +18,20 @@ package org.apache.hadoop.mapred; +import static io.netty.buffer.Unpooled.wrappedBuffer; +import static io.netty.handler.codec.http.HttpHeaderNames.CONTENT_TYPE; +import static io.netty.handler.codec.http.HttpMethod.GET; +import static io.netty.handler.codec.http.HttpResponseStatus.BAD_REQUEST; +import static io.netty.handler.codec.http.HttpResponseStatus.FORBIDDEN; +import static io.netty.handler.codec.http.HttpResponseStatus.INTERNAL_SERVER_ERROR; +import static io.netty.handler.codec.http.HttpResponseStatus.METHOD_NOT_ALLOWED; +import static io.netty.handler.codec.http.HttpResponseStatus.NOT_FOUND; +import static io.netty.handler.codec.http.HttpResponseStatus.OK; +import static io.netty.handler.codec.http.HttpResponseStatus.UNAUTHORIZED; +import static io.netty.handler.codec.http.HttpVersion.HTTP_1_1; +import static org.apache.hadoop.mapred.ShuffleHandler.NettyChannelHelper.*; Review Comment: let's not use wildcard import ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java: ## @@ -315,9 +318,8 @@ protected void copyFromHost(MapHost host) throws IOException { return; } -if(LOG.isDebugEnabled()) { - LOG.debug("Fetcher " + id + " going to fetch from " + host + " for: " -+ maps); +if (LOG.isDebugEnabled()) { + LOG.debug("Fetcher " + id + " going to fetch from " + host + " for: " + maps); Review Comment: slf4j logger messages can be rewritten using parameterized logging format. But let's not worry about that now. This PR is already too big. ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -1322,48 +1438,45 @@ protected void sendError(ChannelHandlerContext ctx, String msg, for (Map.Entry header : headers.entrySet()) { response.headers().set(header.getKey(), header.getValue()); } - response.setContent( - ChannelBuffers.copiedBuffer(msg, CharsetUtil.UTF_8)); // Close the connection as soon as the error message is sent. - ctx.getChannel().write(response).addListener(ChannelFutureListener.CLOSE); + writeToChannelAndClose(ctx.channel(), response); } @Override -public void exceptionCaught(ChannelHandlerContext ctx, ExceptionEvent e) +public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) throws Exception { - Channel ch = e.getChannel(); - Throwable cause = e.getCause(); + Channel ch = ctx.channel(); if (cause instanceof TooLongFrameException) { +LOG.trace("TooLongFrameException, channel id: {}", ch.id()); sendError(ctx, BAD_REQUEST); return; } else if (cause instanceof IOException) { if (cause instanceof ClosedChannelException) { - LOG.debug("Ignoring closed channel error", cause); + LOG.debug("Ignoring closed channel error, channel id: " + ch.id(), cause); Review Comment: Use parameterized logging, otherwise wrap it in a LOG.isDebugEnabled() check. ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -904,65 +990,84 @@ private List splitMaps(List mapq) { } @Override -public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) +public void channelActive(ChannelHandlerContext ctx) throws Exception { - super.channelOpen(ctx, evt); - - if ((maxShuffleConnections > 0) && (accepted.size() >= maxShuffleConnections)) { + NettyChannelHelper.channelActive(ctx.channel()); + int numConnections = activeConnections.incrementAndGet(); + if ((maxShuffleConnections > 0) && (numConnections > maxShuffleConnections)) { Review Comment: Perhaps these channel bookeeping is no longer needed given that channel size is limited. ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -182,19 +184,29 @@ public class ShuffleHandler extends AuxiliaryService { public static final HttpResponseStatus TOO_MANY_REQ_STATUS = new HttpResponseStatus(429, "TOO MANY REQUESTS"); - // This should kept in sync with
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17632150#comment-17632150 ] ASF GitHub Bot commented on HADOOP-15327: - brumi1024 commented on PR #3259: URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1311363399 Merged to trunk. Thanks @szilard-nemeth for the patch and @shuzirra @9uapaw @K0K0V0K for the review! > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631957#comment-17631957 ] ASF GitHub Bot commented on HADOOP-15327: - hadoop-yetus commented on PR #3259: URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1311003083 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 4s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | xmllint | 0m 0s | | xmllint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 15m 38s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 27m 9s | | trunk passed | | +1 :green_heart: | compile | 25m 7s | | trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | compile | 21m 40s | | trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | checkstyle | 4m 18s | | trunk passed | | +1 :green_heart: | mvnsite | 6m 52s | | trunk passed | | +1 :green_heart: | javadoc | 5m 12s | | trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | javadoc | 4m 39s | | trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 | | +0 :ok: | spotbugs | 0m 55s | | branch/hadoop-client-modules/hadoop-client-runtime no spotbugs output file (spotbugsXml.xml) | | +1 :green_heart: | shadedclient | 21m 41s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 6m 35s | | the patch passed | | +1 :green_heart: | compile | 24m 26s | | the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | javac | 24m 26s | | the patch passed | | +1 :green_heart: | compile | 22m 4s | | the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | javac | 22m 4s | | the patch passed | | -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/2/artifact/out/blanks-eol.txt) | The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | -0 :warning: | checkstyle | 4m 1s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/2/artifact/out/results-checkstyle-root.txt) | root: The patch generated 21 new + 82 unchanged - 62 fixed = 103 total (was 144) | | +1 :green_heart: | mvnsite | 7m 8s | | the patch passed | | +1 :green_heart: | javadoc | 5m 38s | | the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | javadoc | 5m 58s | | the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 | | -1 :x: | spotbugs | 4m 45s | [/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/2/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client.html) | hadoop-mapreduce-project/hadoop-mapreduce-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | -1 :x: | spotbugs | 1m 44s | [/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/2/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html) | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +0 :ok: | spotbugs | 0m 56s | | hadoop-client-modules/hadoop-client-runtime has no data from spotbugs | | +1 :green_heart: | shadedclient | 21m 11s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 166m 16s | | hadoop-mapreduce-client in the patch passed. | | +1 :green_heart: | unit | 8m 10s | | hadoop-mapreduce-client-core in the patch
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631652#comment-17631652 ] ASF GitHub Bot commented on HADOOP-15327: - szilard-nemeth commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019105924 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/resources/log4j.properties: ## @@ -17,3 +17,5 @@ log4j.threshold=ALL log4j.appender.stdout=org.apache.log4j.ConsoleAppender log4j.appender.stdout.layout=org.apache.log4j.PatternLayout log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2} (%F:%M(%L)) - %m%n +log4j.logger.io.netty=DEBUG +log4j.logger.org.apache.hadoop.mapred=DEBUG Review Comment: Thanks, changed the level to INFO. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631651#comment-17631651 ] ASF GitHub Bot commented on HADOOP-15327: - szilard-nemeth commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019104663 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java: ## @@ -668,34 +1357,61 @@ protected ChannelFuture sendMapOutput(ChannelHandlerContext ctx, conns[i].connect(); } -//Ensure first connections are okay -conns[0].getInputStream(); -int rc = conns[0].getResponseCode(); -Assert.assertEquals(HttpURLConnection.HTTP_OK, rc); - -conns[1].getInputStream(); -rc = conns[1].getResponseCode(); -Assert.assertEquals(HttpURLConnection.HTTP_OK, rc); - -// This connection should be closed because it to above the limit -try { - rc = conns[2].getResponseCode(); - Assert.assertEquals("Expected a too-many-requests response code", - ShuffleHandler.TOO_MANY_REQ_STATUS.getCode(), rc); - long backoff = Long.valueOf( - conns[2].getHeaderField(ShuffleHandler.RETRY_AFTER_HEADER)); - Assert.assertTrue("The backoff value cannot be negative.", backoff > 0); - conns[2].getInputStream(); - Assert.fail("Expected an IOException"); -} catch (IOException ioe) { - LOG.info("Expected - connection should not be open"); -} catch (NumberFormatException ne) { - Assert.fail("Expected a numerical value for RETRY_AFTER header field"); -} catch (Exception e) { - Assert.fail("Expected a IOException"); +Map> mapOfConnections = Maps.newHashMap(); +for (HttpURLConnection conn : conns) { + try { +conn.getInputStream(); + } catch (IOException ioe) { +LOG.info("Expected - connection should not be open"); + } catch (NumberFormatException ne) { +fail("Expected a numerical value for RETRY_AFTER header field"); + } catch (Exception e) { +fail("Expected a IOException"); + } + int statusCode = conn.getResponseCode(); + LOG.debug("Connection status code: {}", statusCode); + mapOfConnections.putIfAbsent(statusCode, new ArrayList<>()); + List connectionList = mapOfConnections.get(statusCode); + connectionList.add(conn); } + +assertEquals(String.format("Expected only %s and %s response", +OK_STATUS, ShuffleHandler.TOO_MANY_REQ_STATUS), +Sets.newHashSet( +HttpURLConnection.HTTP_OK, +ShuffleHandler.TOO_MANY_REQ_STATUS.code()), +mapOfConnections.keySet()); -shuffleHandler.stop(); +List successfulConnections = +mapOfConnections.get(HttpURLConnection.HTTP_OK); +assertEquals(String.format("Expected exactly %d requests " + +"with %s response", maxAllowedConnections, OK_STATUS), +maxAllowedConnections, successfulConnections.size()); + +//Ensure exactly one connection is HTTP 429 (TOO MANY REQUESTS) +List closedConnections = +mapOfConnections.get(ShuffleHandler.TOO_MANY_REQ_STATUS.code()); +assertEquals(String.format("Expected exactly %d %s response", +notAcceptedConnections, ShuffleHandler.TOO_MANY_REQ_STATUS), +notAcceptedConnections, closedConnections.size()); + +// This connection should be closed because it is above the maximum limit +HttpURLConnection conn = closedConnections.get(0); +assertEquals(String.format("Expected a %s response", +ShuffleHandler.TOO_MANY_REQ_STATUS), +ShuffleHandler.TOO_MANY_REQ_STATUS.code(), conn.getResponseCode()); +long backoff = Long.parseLong( +conn.getHeaderField(ShuffleHandler.RETRY_AFTER_HEADER)); +assertTrue("The backoff value cannot be negative.", backoff > 0); + +shuffleHandler.stop(); + +//It's okay to get a ClosedChannelException. +//All other kinds of exceptions means something went wrong +assertEquals("Should have no caught exceptions", +Collections.emptyList(), failures.stream() +.filter(f -> !(f instanceof ClosedChannelException)) +.collect(toList())); Review Comment: Thanks for this comment. It's indeed a wrong thing that the InputStream is not closed. I'd suggest to create a separate jira to refactor this class, as it's currently in a horrible shape. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments:
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631643#comment-17631643 ] ASF GitHub Bot commented on HADOOP-15327: - szilard-nemeth commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019075009 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -904,65 +990,84 @@ private List splitMaps(List mapq) { } @Override -public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) +public void channelActive(ChannelHandlerContext ctx) throws Exception { - super.channelOpen(ctx, evt); - - if ((maxShuffleConnections > 0) && (accepted.size() >= maxShuffleConnections)) { + NettyChannelHelper.channelActive(ctx.channel()); + int numConnections = activeConnections.incrementAndGet(); + if ((maxShuffleConnections > 0) && (numConnections > maxShuffleConnections)) { LOG.info(String.format("Current number of shuffle connections (%d) is " + -"greater than or equal to the max allowed shuffle connections (%d)", +"greater than the max allowed shuffle connections (%d)", accepted.size(), maxShuffleConnections)); -Map headers = new HashMap(1); +Map headers = new HashMap<>(1); // notify fetchers to backoff for a while before closing the connection // if the shuffle connection limit is hit. Fetchers are expected to // handle this notification gracefully, that is, not treating this as a // fetch failure. headers.put(RETRY_AFTER_HEADER, String.valueOf(FETCH_RETRY_DELAY)); sendError(ctx, "", TOO_MANY_REQ_STATUS, headers); -return; + } else { +super.channelActive(ctx); +accepted.add(ctx.channel()); +LOG.debug("Added channel: {}, channel id: {}. Accepted number of connections={}", +ctx.channel(), ctx.channel().id(), activeConnections.get()); } - accepted.add(evt.getChannel()); } @Override -public void messageReceived(ChannelHandlerContext ctx, MessageEvent evt) +public void channelInactive(ChannelHandlerContext ctx) throws Exception { + NettyChannelHelper.channelInactive(ctx.channel()); + super.channelInactive(ctx); + int noOfConnections = activeConnections.decrementAndGet(); + LOG.debug("New value of Accepted number of connections={}", noOfConnections); +} + +@Override +public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception { - HttpRequest request = (HttpRequest) evt.getMessage(); - if (request.getMethod() != GET) { - sendError(ctx, METHOD_NOT_ALLOWED); - return; + Channel channel = ctx.channel(); + LOG.trace("Executing channelRead, channel id: {}", channel.id()); + HttpRequest request = (HttpRequest) msg; + LOG.debug("Received HTTP request: {}, channel id: {}", request, channel.id()); + if (request.method() != GET) { +sendError(ctx, METHOD_NOT_ALLOWED); +return; } // Check whether the shuffle version is compatible - if (!ShuffleHeader.DEFAULT_HTTP_HEADER_NAME.equals( - request.headers() != null ? - request.headers().get(ShuffleHeader.HTTP_HEADER_NAME) : null) - || !ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION.equals( - request.headers() != null ? - request.headers() - .get(ShuffleHeader.HTTP_HEADER_VERSION) : null)) { + String shuffleVersion = ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION; + String httpHeaderName = ShuffleHeader.HTTP_HEADER_NAME; Review Comment: Good catch, fixed. please check the new code. I agree with @brumi1024 , the request.headers null check is needed. Double-checked the original code: It sent error with "Incompatible shuffle request version" message either if: 1. no request.headers() was present (equals to null) 2. shuffleversion is not equal to ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION 3. header name is not equal to ShuffleHeader.HTTP_HEADER_NAME > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, >
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631630#comment-17631630 ] ASF GitHub Bot commented on HADOOP-15327: - szilard-nemeth commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019051278 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -291,36 +302,86 @@ public void operationComplete(ChannelFuture future) throws Exception { } } + static class NettyChannelHelper { +static ChannelFuture writeToChannel(Channel ch, Object obj) { + LOG.debug("Writing {} to channel: {}", obj.getClass().getSimpleName(), ch.id()); + return ch.writeAndFlush(obj); +} + +static ChannelFuture writeToChannelAndClose(Channel ch, Object obj) { + return writeToChannel(ch, obj).addListener(ChannelFutureListener.CLOSE); +} + +static ChannelFuture writeToChannelAndAddLastHttpContent(Channel ch, HttpResponse obj) { + writeToChannel(ch, obj); + return writeLastHttpContentToChannel(ch); +} + +static ChannelFuture writeLastHttpContentToChannel(Channel ch) { + LOG.debug("Writing LastHttpContent, channel id: {}", ch.id()); + return ch.writeAndFlush(LastHttpContent.EMPTY_LAST_CONTENT); +} + +static ChannelFuture closeChannel(Channel ch) { + LOG.debug("Closing channel, channel id: {}", ch.id()); + return ch.close(); +} + +static void closeChannels(ChannelGroup channelGroup) { + channelGroup.close().awaitUninterruptibly(10, TimeUnit.SECONDS); +} + +public static ChannelFuture closeAsIdle(Channel channel, int timeout) { + LOG.debug("Closing channel as writer was idle for {} seconds", timeout); + return closeChannel(channel); +} + +public static void channelActive(Channel ch) { + LOG.debug("Executing channelActive, channel id: {}", ch.id()); +} + +public static void channelInactive(Channel channel) { + LOG.debug("Executing channelInactive, channel id: {}", channel.id()); +} Review Comment: This is just using the same name as framework method called 'org.apache.hadoop.mapred.ShuffleHandler.Shuffle#channelActive / io.netty.channel.ChannelInboundHandlerAdapter#channelActive'. So I would keep the name as it is. channel renamed to ch as you suggested. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631628#comment-17631628 ] ASF GitHub Bot commented on HADOOP-15327: - szilard-nemeth commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019049338 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -182,19 +184,29 @@ public class ShuffleHandler extends AuxiliaryService { public static final HttpResponseStatus TOO_MANY_REQ_STATUS = new HttpResponseStatus(429, "TOO MANY REQUESTS"); - // This should kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT + // This should be kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT public static final long FETCH_RETRY_DELAY = 1000L; public static final String RETRY_AFTER_HEADER = "Retry-After"; + static final String ENCODER_HANDLER_NAME = "encoder"; private int port; - private ChannelFactory selector; - private final ChannelGroup accepted = new DefaultChannelGroup(); + private EventLoopGroup bossGroup; + private EventLoopGroup workerGroup; + private ServerBootstrap bootstrap; + private Channel ch; + private final ChannelGroup accepted = + new DefaultChannelGroup(new DefaultEventExecutorGroup(5).next()); Review Comment: TBH, I don't know what's the right number for this as the previous implementation hidden the # of threads from the clients. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631627#comment-17631627 ] ASF GitHub Bot commented on HADOOP-15327: - szilard-nemeth commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019048887 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -182,19 +184,29 @@ public class ShuffleHandler extends AuxiliaryService { public static final HttpResponseStatus TOO_MANY_REQ_STATUS = new HttpResponseStatus(429, "TOO MANY REQUESTS"); - // This should kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT + // This should be kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT public static final long FETCH_RETRY_DELAY = 1000L; public static final String RETRY_AFTER_HEADER = "Retry-After"; + static final String ENCODER_HANDLER_NAME = "encoder"; private int port; - private ChannelFactory selector; - private final ChannelGroup accepted = new DefaultChannelGroup(); + private EventLoopGroup bossGroup; + private EventLoopGroup workerGroup; + private ServerBootstrap bootstrap; + private Channel ch; + private final ChannelGroup accepted = + new DefaultChannelGroup(new DefaultEventExecutorGroup(5).next()); Review Comment: Yep, this is the number of threads, it can be checked easily with the parameters. I don't think there's any value to extract it to a constant as it's only being used once here. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631626#comment-17631626 ] ASF GitHub Bot commented on HADOOP-15327: - szilard-nemeth commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1019043683 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java: ## @@ -72,19 +73,21 @@ private static final String FETCH_RETRY_AFTER_HEADER = "Retry-After"; protected final Reporter reporter; - private enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP, + @VisibleForTesting + public enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP, CONNECTION, WRONG_REDUCE} - - private final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors"; + + @VisibleForTesting + public final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors"; Review Comment: actually this is inconsistent in the whole class so I'd refrain to change the order of modifiers as I only changed private to public here. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631248#comment-17631248 ] ASF GitHub Bot commented on HADOOP-15327: - 9uapaw commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018349426 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java: ## @@ -72,19 +73,21 @@ private static final String FETCH_RETRY_AFTER_HEADER = "Retry-After"; protected final Reporter reporter; - private enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP, + @VisibleForTesting + public enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP, CONNECTION, WRONG_REDUCE} - - private final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors"; + + @VisibleForTesting + public final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors"; Review Comment: I think lets keep it, it is not important. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631246#comment-17631246 ] ASF GitHub Bot commented on HADOOP-15327: - K0K0V0K commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018348504 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java: ## @@ -668,34 +1357,61 @@ protected ChannelFuture sendMapOutput(ChannelHandlerContext ctx, conns[i].connect(); } -//Ensure first connections are okay -conns[0].getInputStream(); -int rc = conns[0].getResponseCode(); -Assert.assertEquals(HttpURLConnection.HTTP_OK, rc); - -conns[1].getInputStream(); -rc = conns[1].getResponseCode(); -Assert.assertEquals(HttpURLConnection.HTTP_OK, rc); - -// This connection should be closed because it to above the limit -try { - rc = conns[2].getResponseCode(); - Assert.assertEquals("Expected a too-many-requests response code", - ShuffleHandler.TOO_MANY_REQ_STATUS.getCode(), rc); - long backoff = Long.valueOf( - conns[2].getHeaderField(ShuffleHandler.RETRY_AFTER_HEADER)); - Assert.assertTrue("The backoff value cannot be negative.", backoff > 0); - conns[2].getInputStream(); - Assert.fail("Expected an IOException"); -} catch (IOException ioe) { - LOG.info("Expected - connection should not be open"); -} catch (NumberFormatException ne) { - Assert.fail("Expected a numerical value for RETRY_AFTER header field"); -} catch (Exception e) { - Assert.fail("Expected a IOException"); +Map> mapOfConnections = Maps.newHashMap(); +for (HttpURLConnection conn : conns) { + try { +conn.getInputStream(); + } catch (IOException ioe) { +LOG.info("Expected - connection should not be open"); + } catch (NumberFormatException ne) { +fail("Expected a numerical value for RETRY_AFTER header field"); + } catch (Exception e) { +fail("Expected a IOException"); + } + int statusCode = conn.getResponseCode(); + LOG.debug("Connection status code: {}", statusCode); + mapOfConnections.putIfAbsent(statusCode, new ArrayList<>()); + List connectionList = mapOfConnections.get(statusCode); + connectionList.add(conn); } + +assertEquals(String.format("Expected only %s and %s response", +OK_STATUS, ShuffleHandler.TOO_MANY_REQ_STATUS), +Sets.newHashSet( +HttpURLConnection.HTTP_OK, +ShuffleHandler.TOO_MANY_REQ_STATUS.code()), +mapOfConnections.keySet()); -shuffleHandler.stop(); +List successfulConnections = +mapOfConnections.get(HttpURLConnection.HTTP_OK); +assertEquals(String.format("Expected exactly %d requests " + +"with %s response", maxAllowedConnections, OK_STATUS), +maxAllowedConnections, successfulConnections.size()); + +//Ensure exactly one connection is HTTP 429 (TOO MANY REQUESTS) +List closedConnections = +mapOfConnections.get(ShuffleHandler.TOO_MANY_REQ_STATUS.code()); +assertEquals(String.format("Expected exactly %d %s response", +notAcceptedConnections, ShuffleHandler.TOO_MANY_REQ_STATUS), +notAcceptedConnections, closedConnections.size()); + +// This connection should be closed because it is above the maximum limit +HttpURLConnection conn = closedConnections.get(0); +assertEquals(String.format("Expected a %s response", +ShuffleHandler.TOO_MANY_REQ_STATUS), +ShuffleHandler.TOO_MANY_REQ_STATUS.code(), conn.getResponseCode()); +long backoff = Long.parseLong( +conn.getHeaderField(ShuffleHandler.RETRY_AFTER_HEADER)); +assertTrue("The backoff value cannot be negative.", backoff > 0); + +shuffleHandler.stop(); + +//It's okay to get a ClosedChannelException. +//All other kinds of exceptions means something went wrong +assertEquals("Should have no caught exceptions", +Collections.emptyList(), failures.stream() +.filter(f -> !(f instanceof ClosedChannelException)) +.collect(toList())); Review Comment: So on second try: Maybe here we can call a close method on the elements of the http connection lists, cause some of them will be still open even if this test is finished > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch,
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631247#comment-17631247 ] ASF GitHub Bot commented on HADOOP-15327: - 9uapaw commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018349080 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -182,19 +184,29 @@ public class ShuffleHandler extends AuxiliaryService { public static final HttpResponseStatus TOO_MANY_REQ_STATUS = new HttpResponseStatus(429, "TOO MANY REQUESTS"); - // This should kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT + // This should be kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT public static final long FETCH_RETRY_DELAY = 1000L; public static final String RETRY_AFTER_HEADER = "Retry-After"; + static final String ENCODER_HANDLER_NAME = "encoder"; private int port; - private ChannelFactory selector; - private final ChannelGroup accepted = new DefaultChannelGroup(); + private EventLoopGroup bossGroup; + private EventLoopGroup workerGroup; + private ServerBootstrap bootstrap; + private Channel ch; + private final ChannelGroup accepted = + new DefaultChannelGroup(new DefaultEventExecutorGroup(5).next()); Review Comment: This is just the number of threads the executor will have. Though it might be extracted for clear intentions. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631245#comment-17631245 ] ASF GitHub Bot commented on HADOOP-15327: - 9uapaw commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018347131 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -291,36 +302,86 @@ public void operationComplete(ChannelFuture future) throws Exception { } } + static class NettyChannelHelper { +static ChannelFuture writeToChannel(Channel ch, Object obj) { + LOG.debug("Writing {} to channel: {}", obj.getClass().getSimpleName(), ch.id()); + return ch.writeAndFlush(obj); +} + +static ChannelFuture writeToChannelAndClose(Channel ch, Object obj) { + return writeToChannel(ch, obj).addListener(ChannelFutureListener.CLOSE); +} + +static ChannelFuture writeToChannelAndAddLastHttpContent(Channel ch, HttpResponse obj) { + writeToChannel(ch, obj); + return writeLastHttpContentToChannel(ch); +} + +static ChannelFuture writeLastHttpContentToChannel(Channel ch) { + LOG.debug("Writing LastHttpContent, channel id: {}", ch.id()); + return ch.writeAndFlush(LastHttpContent.EMPTY_LAST_CONTENT); +} + +static ChannelFuture closeChannel(Channel ch) { + LOG.debug("Closing channel, channel id: {}", ch.id()); + return ch.close(); +} + +static void closeChannels(ChannelGroup channelGroup) { + channelGroup.close().awaitUninterruptibly(10, TimeUnit.SECONDS); +} + +public static ChannelFuture closeAsIdle(Channel channel, int timeout) { + LOG.debug("Closing channel as writer was idle for {} seconds", timeout); + return closeChannel(channel); +} + +public static void channelActive(Channel ch) { + LOG.debug("Executing channelActive, channel id: {}", ch.id()); +} + +public static void channelInactive(Channel channel) { + LOG.debug("Executing channelInactive, channel id: {}", channel.id()); +} Review Comment: We need an other patch anyway, so we might as well fix these to be more consistent. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631244#comment-17631244 ] ASF GitHub Bot commented on HADOOP-15327: - 9uapaw commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018346414 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -904,65 +990,84 @@ private List splitMaps(List mapq) { } @Override -public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) +public void channelActive(ChannelHandlerContext ctx) throws Exception { - super.channelOpen(ctx, evt); - - if ((maxShuffleConnections > 0) && (accepted.size() >= maxShuffleConnections)) { + NettyChannelHelper.channelActive(ctx.channel()); + int numConnections = activeConnections.incrementAndGet(); + if ((maxShuffleConnections > 0) && (numConnections > maxShuffleConnections)) { LOG.info(String.format("Current number of shuffle connections (%d) is " + -"greater than or equal to the max allowed shuffle connections (%d)", +"greater than the max allowed shuffle connections (%d)", accepted.size(), maxShuffleConnections)); -Map headers = new HashMap(1); +Map headers = new HashMap<>(1); // notify fetchers to backoff for a while before closing the connection // if the shuffle connection limit is hit. Fetchers are expected to // handle this notification gracefully, that is, not treating this as a // fetch failure. headers.put(RETRY_AFTER_HEADER, String.valueOf(FETCH_RETRY_DELAY)); sendError(ctx, "", TOO_MANY_REQ_STATUS, headers); -return; + } else { +super.channelActive(ctx); +accepted.add(ctx.channel()); +LOG.debug("Added channel: {}, channel id: {}. Accepted number of connections={}", +ctx.channel(), ctx.channel().id(), activeConnections.get()); } - accepted.add(evt.getChannel()); } @Override -public void messageReceived(ChannelHandlerContext ctx, MessageEvent evt) +public void channelInactive(ChannelHandlerContext ctx) throws Exception { + NettyChannelHelper.channelInactive(ctx.channel()); + super.channelInactive(ctx); + int noOfConnections = activeConnections.decrementAndGet(); + LOG.debug("New value of Accepted number of connections={}", noOfConnections); +} + +@Override +public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception { - HttpRequest request = (HttpRequest) evt.getMessage(); - if (request.getMethod() != GET) { - sendError(ctx, METHOD_NOT_ALLOWED); - return; + Channel channel = ctx.channel(); + LOG.trace("Executing channelRead, channel id: {}", channel.id()); + HttpRequest request = (HttpRequest) msg; + LOG.debug("Received HTTP request: {}, channel id: {}", request, channel.id()); + if (request.method() != GET) { +sendError(ctx, METHOD_NOT_ALLOWED); +return; } // Check whether the shuffle version is compatible - if (!ShuffleHeader.DEFAULT_HTTP_HEADER_NAME.equals( - request.headers() != null ? - request.headers().get(ShuffleHeader.HTTP_HEADER_NAME) : null) - || !ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION.equals( - request.headers() != null ? - request.headers() - .get(ShuffleHeader.HTTP_HEADER_VERSION) : null)) { + String shuffleVersion = ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION; + String httpHeaderName = ShuffleHeader.HTTP_HEADER_NAME; Review Comment: This is a valid concern, please fix it. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail:
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631229#comment-17631229 ] ASF GitHub Bot commented on HADOOP-15327: - 9uapaw commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018309405 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/resources/log4j.properties: ## @@ -17,3 +17,5 @@ log4j.threshold=ALL log4j.appender.stdout=org.apache.log4j.ConsoleAppender log4j.appender.stdout.layout=org.apache.log4j.PatternLayout log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2} (%F:%M(%L)) - %m%n +log4j.logger.io.netty=DEBUG +log4j.logger.org.apache.hadoop.mapred=DEBUG Review Comment: +1 this is left here accidentally. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > hades-results-20221108.zip, testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631196#comment-17631196 ] ASF GitHub Bot commented on HADOOP-15327: - brumi1024 commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018265618 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -904,65 +990,84 @@ private List splitMaps(List mapq) { } @Override -public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) +public void channelActive(ChannelHandlerContext ctx) throws Exception { - super.channelOpen(ctx, evt); - - if ((maxShuffleConnections > 0) && (accepted.size() >= maxShuffleConnections)) { + NettyChannelHelper.channelActive(ctx.channel()); + int numConnections = activeConnections.incrementAndGet(); + if ((maxShuffleConnections > 0) && (numConnections > maxShuffleConnections)) { LOG.info(String.format("Current number of shuffle connections (%d) is " + -"greater than or equal to the max allowed shuffle connections (%d)", +"greater than the max allowed shuffle connections (%d)", accepted.size(), maxShuffleConnections)); -Map headers = new HashMap(1); +Map headers = new HashMap<>(1); // notify fetchers to backoff for a while before closing the connection // if the shuffle connection limit is hit. Fetchers are expected to // handle this notification gracefully, that is, not treating this as a // fetch failure. headers.put(RETRY_AFTER_HEADER, String.valueOf(FETCH_RETRY_DELAY)); sendError(ctx, "", TOO_MANY_REQ_STATUS, headers); -return; + } else { +super.channelActive(ctx); +accepted.add(ctx.channel()); +LOG.debug("Added channel: {}, channel id: {}. Accepted number of connections={}", +ctx.channel(), ctx.channel().id(), activeConnections.get()); } - accepted.add(evt.getChannel()); } @Override -public void messageReceived(ChannelHandlerContext ctx, MessageEvent evt) +public void channelInactive(ChannelHandlerContext ctx) throws Exception { + NettyChannelHelper.channelInactive(ctx.channel()); + super.channelInactive(ctx); + int noOfConnections = activeConnections.decrementAndGet(); + LOG.debug("New value of Accepted number of connections={}", noOfConnections); +} + +@Override +public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception { - HttpRequest request = (HttpRequest) evt.getMessage(); - if (request.getMethod() != GET) { - sendError(ctx, METHOD_NOT_ALLOWED); - return; + Channel channel = ctx.channel(); + LOG.trace("Executing channelRead, channel id: {}", channel.id()); + HttpRequest request = (HttpRequest) msg; + LOG.debug("Received HTTP request: {}, channel id: {}", request, channel.id()); + if (request.method() != GET) { +sendError(ctx, METHOD_NOT_ALLOWED); +return; } // Check whether the shuffle version is compatible - if (!ShuffleHeader.DEFAULT_HTTP_HEADER_NAME.equals( - request.headers() != null ? - request.headers().get(ShuffleHeader.HTTP_HEADER_NAME) : null) - || !ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION.equals( - request.headers() != null ? - request.headers() - .get(ShuffleHeader.HTTP_HEADER_VERSION) : null)) { + String shuffleVersion = ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION; + String httpHeaderName = ShuffleHeader.HTTP_HEADER_NAME; Review Comment: +1 on the DEFAULT, however the if is needed because if the request.headers() returns a non-null value we need to check the version. ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java: ## @@ -72,19 +73,21 @@ private static final String FETCH_RETRY_AFTER_HEADER = "Retry-After"; protected final Reporter reporter; - private enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP, + @VisibleForTesting + public enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP, CONNECTION, WRONG_REDUCE} - - private final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors"; + + @VisibleForTesting + public final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors"; Review Comment: Changing this would mean that the others should be changed as well, and that could complicate the diff. Not sure if it's worth it. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key:
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631149#comment-17631149 ] ASF GitHub Bot commented on HADOOP-15327: - K0K0V0K commented on code in PR #3259: URL: https://github.com/apache/hadoop/pull/3259#discussion_r1018076397 ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -182,19 +184,29 @@ public class ShuffleHandler extends AuxiliaryService { public static final HttpResponseStatus TOO_MANY_REQ_STATUS = new HttpResponseStatus(429, "TOO MANY REQUESTS"); - // This should kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT + // This should be kept in sync with Fetcher.FETCH_RETRY_DELAY_DEFAULT public static final long FETCH_RETRY_DELAY = 1000L; public static final String RETRY_AFTER_HEADER = "Retry-After"; + static final String ENCODER_HANDLER_NAME = "encoder"; private int port; - private ChannelFactory selector; - private final ChannelGroup accepted = new DefaultChannelGroup(); + private EventLoopGroup bossGroup; + private EventLoopGroup workerGroup; + private ServerBootstrap bootstrap; + private Channel ch; + private final ChannelGroup accepted = + new DefaultChannelGroup(new DefaultEventExecutorGroup(5).next()); Review Comment: Maybe there can be a line comment why we have to create 5 event executor ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java: ## @@ -72,19 +73,21 @@ private static final String FETCH_RETRY_AFTER_HEADER = "Retry-After"; protected final Reporter reporter; - private enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP, + @VisibleForTesting + public enum ShuffleErrors{IO_ERROR, WRONG_LENGTH, BAD_ID, WRONG_MAP, CONNECTION, WRONG_REDUCE} - - private final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors"; + + @VisibleForTesting + public final static String SHUFFLE_ERR_GRP_NAME = "Shuffle Errors"; Review Comment: check style wont cry for public static final instead of public final static? ## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java: ## @@ -904,65 +990,84 @@ private List splitMaps(List mapq) { } @Override -public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) +public void channelActive(ChannelHandlerContext ctx) throws Exception { - super.channelOpen(ctx, evt); - - if ((maxShuffleConnections > 0) && (accepted.size() >= maxShuffleConnections)) { + NettyChannelHelper.channelActive(ctx.channel()); + int numConnections = activeConnections.incrementAndGet(); + if ((maxShuffleConnections > 0) && (numConnections > maxShuffleConnections)) { LOG.info(String.format("Current number of shuffle connections (%d) is " + -"greater than or equal to the max allowed shuffle connections (%d)", +"greater than the max allowed shuffle connections (%d)", accepted.size(), maxShuffleConnections)); -Map headers = new HashMap(1); +Map headers = new HashMap<>(1); // notify fetchers to backoff for a while before closing the connection // if the shuffle connection limit is hit. Fetchers are expected to // handle this notification gracefully, that is, not treating this as a // fetch failure. headers.put(RETRY_AFTER_HEADER, String.valueOf(FETCH_RETRY_DELAY)); sendError(ctx, "", TOO_MANY_REQ_STATUS, headers); -return; + } else { +super.channelActive(ctx); +accepted.add(ctx.channel()); +LOG.debug("Added channel: {}, channel id: {}. Accepted number of connections={}", +ctx.channel(), ctx.channel().id(), activeConnections.get()); } - accepted.add(evt.getChannel()); } @Override -public void messageReceived(ChannelHandlerContext ctx, MessageEvent evt) +public void channelInactive(ChannelHandlerContext ctx) throws Exception { + NettyChannelHelper.channelInactive(ctx.channel()); + super.channelInactive(ctx); + int noOfConnections = activeConnections.decrementAndGet(); + LOG.debug("New value of Accepted number of connections={}", noOfConnections); +} + +@Override +public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception { - HttpRequest request = (HttpRequest) evt.getMessage(); - if (request.getMethod() != GET) { - sendError(ctx, METHOD_NOT_ALLOWED); - return; + Channel channel = ctx.channel(); + LOG.trace("Executing channelRead, channel id: {}", channel.id()); +
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631029#comment-17631029 ] Szilard Nemeth commented on HADOOP-15327: - Hi, CC: [~gandras], [~shuzirra], [~weichiu] Let me summarize what kind of testing I performed to make sure this change won't cause any regression. The project that helped me very much with the testing is called [Hades|https://github.com/9uapaw/hades]. Kudos to [~gandras] for the initial work on the Hades project. h1. TL;DR *Hades was the framework I used to run my testcases.* *All testcases are passed both with the trunk version of Hadoop (this is not surprising at all) and the deployed Hadoop version with my Netty upgrade patch.* *See the attached test logs for details.* *Also see the details below about what Hades is, how I tested, why I chose certain configurations for the testcases and many more..* *Now I'm pretty confident that this patch won't break anything so I'm waiting for reviewers.* h1. HADES IN GENERAL h2. What is Hades? Hades is a CLI tool, that shares a common interface between various Hadoop distributions. It is a collection of commands most frequently used by developers of Hadoop components. Hades supports [Hadock|https://github.com/9uapaw/docker-hadoop-dev], [Cloduera Data Platform|https://www.cloudera.com/products/cloudera-data-platform.html] and standard upstream distribution. h2. Basic features of Hades - Discover cluster: Stores where individual YARN / HDFS daemons are running. - Distribute files on certain nodes - Get config: Prints configuration of selected roles - Read logs of Hadoop roles - Restart: Restarting of certain roles - Run an application on the defined cluster - Status: Prints the status of the cluster - Update config: Update properties on a config file for selected roles - YARN specific commands - Run script: Runs user-defined custom scripts against the cluster. h1. CLUSTER + HADES SETUP h2. Run Hades with the Netty testing script against a cluster First of all, I created a standard cluster and deployed Hadoop to the cluster. Side note: Later on, all the installation that deploys Hadoop on the cluster could be part of Hades as well. It's worth to be mentioned that I have a [PR with netty-related changes|https://github.com/9uapaw/hades/pull/6] against the Hades repo. The branch of this PR is [this|https://github.com/szilard-nemeth/hades/tree/netty4-finish]. [Here are the instructions|https://github.com/szilard-nemeth/hades/blob/c16e95393ecf3e787e125c58d88ec2dc6a44b9e0/README.md#set-up-hades-on-a-cluster-and-run-the-netty-script] for how to set up and run Hades with the Netty testing script. h1. THE NETTY TESTING SCRIPT The Netty testing script [lives here|https://github.com/szilard-nemeth/hades/blob/netty4-finish/script/netty4.py]. As you can see on the code, quite a lot of work has been done to make sure the Netty 4 upgrade won't break anything and won't cause any regression as it is a crucial part of MapReduce. h2. CONCEPTS h3. Test context Class: Netty4TestContext The test context provides a way to encapsulate a base branch and a patch file (if any) applied on top of the base branch. The context can enable or disable Maven compilation. The context can also have certain ways to ensure that the compilation and the deployment of new jars were successful on the cluster. Now, it can verify that certain logs are appearing in the daemon logs, making sure the deployment was okay. The main purpose of the context is to compare it with results of other contexts. For the Netty testing, it was evident that I need to make sure the trunk version and my version with the patch applied on top of trnuk works the same, e.g. there's no regression. For this, I created the context. h3. Testcase Class: Netty4Testcase In general, a testcase can have a name, a simple name, some config changes (dictionary of string keys, string values) and one MR application. h3. Test config: Config options for running the tests Class: Netty4TestConfig These are the main config options for the Netty testing. I won't go into too much details as I defined a ton of options along the way. You can check all the config options [here|https://github.com/szilard-nemeth/hades/blob/c16e95393ecf3e787e125c58d88ec2dc6a44b9e0/script/netty4.py#L655-L687] h3. Compiler As mentioned above, Hades can compile Hadoop with Maven and replace the changed jars / Maven modules on the cluster. This is particularly useful for the Netty testing as I was interested in whether the patch causes any issues so I had to compile Hadoop with my Netty patch, deploy the jars on the cluster and run all the tests and see all of them passing. h2. TESTCASES The testcases are defined with the help of the Netty4TestcasesBuilder. You can find all the testcases
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17629226#comment-17629226 ] ASF GitHub Bot commented on HADOOP-15327: - hadoop-yetus commented on PR #3259: URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1304333241 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 3s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +0 :ok: | xmllint | 0m 1s | | xmllint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 15m 35s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 26m 0s | | trunk passed | | +1 :green_heart: | compile | 23m 25s | | trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | compile | 20m 57s | | trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | checkstyle | 4m 5s | | trunk passed | | +1 :green_heart: | mvnsite | 7m 1s | | trunk passed | | +1 :green_heart: | javadoc | 5m 32s | | trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | javadoc | 5m 49s | | trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 | | +0 :ok: | spotbugs | 0m 51s | | branch/hadoop-client-modules/hadoop-client-runtime no spotbugs output file (spotbugsXml.xml) | | +1 :green_heart: | shadedclient | 21m 14s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 32s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 6m 25s | | the patch passed | | +1 :green_heart: | compile | 22m 38s | | the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | javac | 22m 38s | | the patch passed | | +1 :green_heart: | compile | 20m 58s | | the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | javac | 20m 58s | | the patch passed | | -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/1/artifact/out/blanks-eol.txt) | The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | -0 :warning: | checkstyle | 3m 58s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/1/artifact/out/results-checkstyle-root.txt) | root: The patch generated 21 new + 82 unchanged - 62 fixed = 103 total (was 144) | | +1 :green_heart: | mvnsite | 6m 50s | | the patch passed | | +1 :green_heart: | javadoc | 5m 42s | | the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | javadoc | 5m 34s | | the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 | | -1 :x: | spotbugs | 4m 41s | [/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/1/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client.html) | hadoop-mapreduce-project/hadoop-mapreduce-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | -1 :x: | spotbugs | 1m 47s | [/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/1/artifact/out/new-spotbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html) | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +0 :ok: | spotbugs | 1m 6s | | hadoop-client-modules/hadoop-client-runtime has no data from spotbugs | | +1 :green_heart: | shadedclient | 21m 25s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 167m 40s | | hadoop-mapreduce-client in the patch passed. | | +1 :green_heart: | unit | 7m 41s | | hadoop-mapreduce-client-core in the patch
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17629001#comment-17629001 ] ASF GitHub Bot commented on HADOOP-15327: - szilard-nemeth opened a new pull request, #3259: URL: https://github.com/apache/hadoop/pull/3259 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.) For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628985#comment-17628985 ] ASF GitHub Bot commented on HADOOP-15327: - szilard-nemeth closed pull request #3259: HADOOP-15327. Upgrade MR ShuffleHandler to use Netty4 URL: https://github.com/apache/hadoop/pull/3259 > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600654#comment-17600654 ] ASF GitHub Bot commented on HADOOP-15327: - ashutoshcipher commented on PR #3259: URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1237792070 Thanks @szilard-nemeth for working on this. Looks good to me. Looking forward for the final patch. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582976#comment-17582976 ] ASF GitHub Bot commented on HADOOP-15327: - 9uapaw commented on PR #3259: URL: https://github.com/apache/hadoop/pull/3259#issuecomment-1222360570 Latest change looks good to me, the shading issue is gone now. Once spotbugs and checkstyle errors are fixed, we are good to go. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 11.5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436592#comment-17436592 ] Jason Wen commented on HADOOP-15327: Hi [~snemeth], I can help to backport the fix to 3.2.x branch to see if it has same maven shading issue. If same issue exists I can help to fix the shading issue. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 6h 50m > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17431113#comment-17431113 ] Ayush Pal commented on HADOOP-15327: Hi [~snemeth] , I referred PR provided in this jira for netty upgrade for fixing Netty CVE vulnerability and did some fixes for maven shading issues. I am happy to contribute the patch to open source. let me know your thoughts. Thanks > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 6h 10m > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17430075#comment-17430075 ] Szilard Nemeth commented on HADOOP-15327: - Hi [~zhenshan.wen] , I'm planning to fix the maven shading issue in the coming weeks, as soon as possible. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 5h 40m > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425285#comment-17425285 ] Jason Wen commented on HADOOP-15327: What's the current status of the PR? Will the fix be included in any upcoming release? > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 5h > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17393023#comment-17393023 ] Szilard Nemeth commented on HADOOP-15327: - Converted to a PR, no more patches will be uploaded to this jira. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, HADOOP-15327.005.patch, > HADOOP-15327.005.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log, > testfailure-testMapFileAccess-emptyresponse.zip, > testfailure-testReduceFromPartialMem.zip > > Time Spent: 20m > Remaining Estimate: 0h > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387909#comment-17387909 ] Hadoop QA commented on HADOOP-15327: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 41s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 13s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 17s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 33s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 30s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 55s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 25m 48s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 28s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 10s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 40m 45s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 28s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no spotbugs output file (spotbugsXml.xml) {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 36s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 25m 13s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 25m 13s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/216/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt{color} | {color:red} root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 1912 unchanged - 0 fixed = 1913 total (was 1912) {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 33s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 21m 33s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/216/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt{color} | {color:red} root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 1790 unchanged - 0 fixed = 1791 total (was 1790) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} |
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387643#comment-17387643 ] Hadoop QA commented on HADOOP-15327: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 27m 2s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 28s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 16s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 35m 52s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 49s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 5m 10s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 6m 31s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 31m 18s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 51s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 21s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 50m 54s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 42s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no spotbugs output file (spotbugsXml.xml) {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 28s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 12s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 25m 23s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 25m 23s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/215/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt{color} | {color:red} root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 1912 unchanged - 0 fixed = 1913 total (was 1912) {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 34s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 21m 34s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/215/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt{color} | {color:red} root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 1790 unchanged - 0 fixed = 1791 total (was 1790) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} |
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17371468#comment-17371468 ] Szilard Nemeth commented on HADOOP-15327: - Just uploaded a new patch: [^HADOOP-15327.005.patch] I have been (almost) exclusively working on this since my last comment and there are a couple of things to add again. The last commit that was discussed is this: [https://github.com/szilard-nemeth/hadoop/commit/f149be8de28baafc64eed1c47e788f5beb215e62] Let me explain what've changed commit by commit. I will skip a bunch of trivial ones like code cleanup, added comments and the like. *I will cover the test failures surfaced by Jenkins build / unit test results:* - Build #1 - Build #2 *1. TestShuffleHandler: Introduced InputStreamReadResult that stores response as string + total bytes read: [https://github.com/szilard-nemeth/hadoop/commit/a57de573c97fe12c9071dd3450df8f450bf075ea]* Here, I added a new class called 'InputStreamReadResult' that stores the bytes read (byte[]) and the number of bytes read from a response InputStream. This improves the way testcases can assert on these data. *2. TestShuffleHandler: Use DEFAULT_PORT for all shuffle handler port configs: [https://github.com/szilard-nemeth/hadoop/commit/78b1166866c85cab6860407f8fe4a4ddc3168fae]* It was a common pitfall while debugging that the tests had to modified to use a certain fixed port. Here, I added a constant to store the port number so when I had to debug I only needed to change it in one single place. *3. Create class: TestExecution: Configure proxy, keep alive connection timeout: [https://github.com/szilard-nemeth/hadoop/commit/fa5bb32ae4eb737077a165b3b1fba5069c982243]* In order to debug the HTTP responses, I found it convenient to add a helper class that is responsible for the following: - Configuring the HTTP connections, use a proxy when required - Increase the keepalive timeout when using DEBUG mode TEST_EXECUTION is a static instance of TestExecution, initialized with a JUnit test setup method. There are 2 flags that control the behaviour of this object: {code:java} //Control test execution properties with these flags private static final boolean DEBUG_MODE = true; //If this is set to true and proxy server is not running, tests will fail! private static final boolean USE_PROXY = false; {code} The only difference on top of these is in the code of testcases: They create all HTTP connections with: {code:java} TEST_EXECUTION.openConnection(url) {code} *4. TestExecution: Configure port: [https://github.com/szilard-nemeth/hadoop/commit/4a5c035695be1099bff4a633cd605b9f8146d841]* One addition to 3. is to include the port used by ShuffleHandler in the TestExecution object. When using DEBUG mode, the port is fixed to a value, otherwise it is set to 0, meaning that the port will be dynamically chosen. *5. Add logging response encoder to TestShuffleHandler.testMapFileAccess: [https://github.com/szilard-nemeth/hadoop/commit/64686b47d2fed4e923c1c9c0169a06aba3e339be]* While debugging TestShuffleHandler#testMapFileAccess, just realized that I forgot to add the LoggingHttpResponseEncoder to the pipeline. The most trivial way was to modify the pipeline when the channel is activated. *6. TestShuffleHandler.testMapFileAccess: Modify to be able to run it locally + reproduce jenkins UT failure: [https://github.com/szilard-nemeth/hadoop/commit/bb0fcbbd7dcbe3fa7efd1b6a8c2eb8a9055c5ecd]* Here's where the fun begins. The problem with TestShuffleHandler#testMapFileAccess is that it requires the NativeIO module: {code:java} // This will run only in NativeIO is enabled as SecureIOUtils need it assumeTrue(NativeIO.isAvailable()); {code} I tried to compile the Hadoop Native libraries on my Mac according to these resources: - Native libraries: [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/NativeLibraries.html] - Followed this guide: [https://dev.to/zejnilovic/building-hadoop-native-libraries-on-mac-in-2019-1iee] Unfortunately, I still had compilation errors so I eventually gave up and tweaked the test to be able to run it locally. This wasn't such a complex thing, I don't think it's worth to go into the details, had to comment out some test code that used the Native library and that was all. From the Jenkins results I had this: {code:java} [INFO] --- maven-surefire-plugin:3.0.0-M1:test (default-test) @ hadoop-mapreduce-client-shuffle --- [INFO] [INFO] --- [INFO] T E S T S [INFO] --- [INFO] Running org.apache.hadoop.mapred.TestFadvisedFileRegion [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.493 s - in org.apache.hadoop.mapred.TestFadvisedFileRegion [INFO] Running org.apache.hadoop.mapred.TestShuffleHandler [ERROR] Tests run: 15, Failures: 1, Errors: 0,
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367912#comment-17367912 ] Szilard Nemeth commented on HADOOP-15327: - Hey [~weichiu], Thanks for putting the exceprt here. This could be fixed in parallel, I would be glad if you could point me to the config that needs to be changed. Currently, I'm working on the test issues produced by the build that ran against patch003: hadoop.mapred.TestReduceFetchFromPartialMem hadoop.mapred.TestReduceFetch There are jiras related to these tests but checked the logs and saw very suspicious things and it pointed me to a code defect. I will upload a next patch soon along with explanation of what has been changed since patch004. Hopefully, this can be the last one and I can finally start testing on a cluster. I wouldn't expect any production issues (fingers crossed) as test coverage is quite good and while I have been fixing the tests, I gained a lot of code knowledge, almost being familiar with the ShuffleHandler inside and out. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log > > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367842#comment-17367842 ] Wei-Chiu Chuang commented on HADOOP-15327: -- This is an excerpt of the error: {noformat} Found in: org.apache.hadoop:hadoop-client-minicluster:jar:3.4.0-SNAPSHOT:compile org.apache.hadoop:hadoop-client-runtime:jar:3.4.0-SNAPSHOT:compile Duplicate classes: org/apache/hadoop/shaded/io/netty/handler/codec/mqtt/MqttSubscriptionOption$RetainedHandlingPolicy.class org/apache/hadoop/shaded/io/netty/buffer/ByteBufProcessor$3.class org/apache/hadoop/shaded/io/netty/handler/codec/http2/Http2Headers$PseudoHeaderName.class org/apache/hadoop/shaded/io/netty/handler/codec/http2/StreamBufferingEncoder$Http2GoAwayException.class org/apache/hadoop/shaded/io/netty/channel/epoll/AbstractEpollChannel$AbstractEpollUnsafe$3.class org/apache/hadoop/shaded/io/netty/handler/codec/spdy/SpdySession.class org/apache/hadoop/shaded/io/netty/channel/udt/UdtServerChannelConfig.class org/apache/hadoop/shaded/io/netty/handler/codec/http2/Http2Flags.class {noformat} > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, HADOOP-15327.004.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log > > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366722#comment-17366722 ] Hadoop QA commented on HADOOP-15327: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 22m 29s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 51s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 50s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 49s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 15s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 55s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 55s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 45s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 13s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 32m 12s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 29s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no spotbugs output file (spotbugsXml.xml) {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 12s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 59s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m 59s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 12s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m 12s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 56s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/203/artifact/out/diff-checkstyle-root.txt{color} | {color:orange} root: The patch generated 116 new + 83 unchanged - 7 fixed = 199 total (was 90) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 53s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} |
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363928#comment-17363928 ] Hadoop QA commented on HADOOP-15327: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 28s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 41s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 34s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 39s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 57s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 7s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 5s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 23m 42s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 46s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 28s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 36m 7s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 35s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no spotbugs output file (spotbugsXml.xml) {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 44s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 47s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 27m 47s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 24m 44s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 24m 44s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 6m 1s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/200/artifact/out/diff-checkstyle-root.txt{color} | {color:orange} root: The patch generated 116 new + 83 unchanged - 7 fixed = 199 total (was 90) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 26s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} |
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362456#comment-17362456 ] Hadoop QA commented on HADOOP-15327: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 25s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 18s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 15s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 39s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 20s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 2s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 58s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 43s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 12s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 32m 17s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 28s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no spotbugs output file (spotbugsXml.xml) {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 11s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 57s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m 57s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 19s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m 19s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 57s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/199/artifact/out/diff-checkstyle-root.txt{color} | {color:orange} root: The patch generated 93 new + 83 unchanged - 7 fixed = 176 total (was 90) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 55s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} |
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362368#comment-17362368 ] Szilard Nemeth commented on HADOOP-15327: - *Remaining TODO items that I can make progress with:* - Fix failing unit tests - Testing on cluster > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Attachments: HADOOP-15327.001.patch, HADOOP-15327.002.patch, > HADOOP-15327.003.patch, > getMapOutputInfo_BlockingOperationException_awaitUninterruptibly.log > > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362367#comment-17362367 ] Szilard Nemeth commented on HADOOP-15327: - The latest patch contains commits from this branch: [https://github.com/szilard-nemeth/hadoop/commits/HADOOP-15327-snemeth] There are a couple of commits so I would approach this by explaning the reasons behind each change in the commits. Not all commits are listed, I left out a few trivial ones. Unfortunately, this task was a bit tricky as everytime I touched something in the test, I just found another bug or weird behaviour so it took a great deal of time to solve and discover everything. *1. ShuffleHandler: ch.isOpen() --> ch.isActive(): [https://github.com/szilard-nemeth/hadoop/commit/e703adb57f66da8579baa26257ca9aaed2bf1db5]* This was already mentioned with my previous lenghtier comment. *2. TestShuffleHandler: Fix mocking in testSendMapCount + replace ch.write() with ch.writeAndFlush(): [https://github.com/szilard-nemeth/hadoop/commit/07fbfee5cae85e8e374b53c303e794c19c620efc]* This is about 2 things: - Replacing channel.write calls with channel.writeAndFlush - Fixing bad mocking in org.apache.hadoop.mapred.TestShuffleHandler#testSendMapCount *3. TestShuffleHandler.testMaxConnections: Rewrite test + production code: accepted connection handling: [https://github.com/szilard-nemeth/hadoop/commit/def0059982ef8f0e2f19d385b1a1fcdca8639f9d]* *Changes in production code:* - ShuffleHandler#channelActive added the channel to the channel group (field called 'accepted') before the if statement that enforces the maximum number of open connections. This was the old, wrong piece of code: {code:java} super.channelActive(ctx); LOG.debug("accepted connections={}", accepted.size()); if ((maxShuffleConnections > 0) && (accepted.size() >= maxShuffleConnections)) { {code} - Also, counting the number of open channels with the channel group was unreliable so I introduced a new AtomicInteger field called 'acceptedConnections' to track the open channels / connections. - There was another issue: When the channels were accepted, the counter of open channels was increased but when channels were inactivated I could not see any code that would have maintained (decremented) the value. This was mitigated by adding org.apache.hadoop.mapred.ShuffleHandler.Shuffle#channelInactive that logs the channel inactivated event and decreases the open connections counter: {code:java} @Override public void channelInactive(ChannelHandlerContext ctx) throws Exception { super.channelInactive(ctx); acceptedConnections.decrementAndGet(); LOG.debug("New value of Accepted number of connections={}", acceptedConnections.get()); } {code} *Changes in test code:* - org.apache.hadoop.mapred.TestShuffleHandler#testMaxConnections: Fixed the testcase, the issue was pointed out correctly by [~weichiu] : The connections are accepted in parallel so we should not rely on their order in the test. The way I rewritten this is that I introduced a map to group HttpURLConnection objects by their HTTP response code. Then I check if we only have 200 OK and 429 TOO MANY REQUESTS, and check if the number of 200 OK connections is 2 and there's only one unaccepted connection. *4. increase netty version to 4.1.65.Final: [https://github.com/szilard-nemeth/hadoop/commit/4f4589063b579a93389b1e188c29bd895ae507fc]* This is a simple commit to increase the Netty version to the latest stable 4.x version. See this page: [https://netty.io/downloads.html] It states: "netty-4.1.65.Final.tar.gz ‐ 19-May-2021 (Stable, Recommended)" *5. ShuffleHandler: Fix keepalive test + writing HTTP response properly to channel: [https://github.com/szilard-nemeth/hadoop/commit/1aad4eaace28cfff4a9a9152f7535d70cc6e3734]* This is where things get more interesting. There was a testcase called org.apache.hadoop.mapred.TestShuffleHandler#testKeepAlive that caught an issue that came up because Netty 4.x handles HTTP responses written to the same channel differently than Netty 3.x. See details below. Production code changes: - Added some logs to be able to track what happened when utilizing HTTP Connection Keep-alive. - Added a ChannelOutboundHandlerAdapter that handles exceptions that happens during outbound message construction. This is by default not logged by Netty and I only found this trick to catch these events: {code:java} pipeline.addLast("outboundExcHandler", new ChannelOutboundHandlerAdapter() { @Override public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) throws Exception { promise.addListener(ChannelFutureListener.FIRE_EXCEPTION_ON_FAILURE); super.write(ctx, msg, promise); } }); {code} This solution is described here:
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362339#comment-17362339 ] Hadoop QA commented on HADOOP-15327: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 22s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 12m 32s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 24s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 40s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 13s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 1s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 59s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 51s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 12s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 32m 22s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 31s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no spotbugs output file (spotbugsXml.xml) {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 13s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 6s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 22m 6s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 19s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m 19s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 58s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/197/artifact/out/diff-checkstyle-root.txt{color} | {color:orange} root: The patch generated 95 new + 83 unchanged - 7 fixed = 178 total (was 90) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 57s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} |
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362225#comment-17362225 ] Hadoop QA commented on HADOOP-15327: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 22m 4s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 12m 46s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 3s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 34s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 17s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 56s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 56s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 38s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 14s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 32m 7s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 31s{color} | {color:blue}{color} | {color:blue} branch/hadoop-project no spotbugs output file (spotbugsXml.xml) {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 11s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 3s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 22m 3s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/196/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt{color} | {color:red} root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 1981 unchanged - 0 fixed = 1982 total (was 1981) {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 19s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 19m 19s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/196/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt{color} | {color:red} root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 1858 unchanged - 0 fixed = 1859 total (was 1858) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} |
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362029#comment-17362029 ] Szilard Nemeth commented on HADOOP-15327: - Thanks [~weichiu] for your help. Added a preliminary patch to kick-off Jenkins. Haven't touch shaded config so I'm expecting a Maven error from Jenkins. Referring back to your comment here: I'm quite a beginner with shading. Can you or anyone else help me out with this? *Remaining TODO items:* - Testing on cluster - Adding explanation comment for the new code changes. So a more lengthy comment will follow this :) > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > Attachments: HADOOP-15327.001.patch > > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17359058#comment-17359058 ] Wei-Chiu Chuang commented on HADOOP-15327: -- Thanks... 1. Change category #5, closing channels with: future.channel().closeFuture().awaitUninterruptibly() Could you please tell me how did you find out to close a channel like this? As mentioned above, I couldn't find any resource to justify this. Channel.close() in netty4 takes a Future parameter. But In fact, this is a bad practice according to the official doc: https://netty.io/4.0/api/io/netty/channel/ChannelFuture.html "Do not call await() inside ChannelHandler" > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358861#comment-17358861 ] Szilard Nemeth commented on HADOOP-15327: - Let me list the differences introduced because of the migration from Netty 3.x to 4.x. There is a migration guide that mentions most (but not all) of the changes: [https://netty.io/wiki/new-and-noteworthy-in-4.0.html] Please note that the below code changes are based on Wei-Chiu's branch: [https://github.com/jojochuang/hadoop/commits/shuffle_handler_netty4] h2. CHANGES IN ShuffleHandler h3. *I will list the changes mostly from ShuffleHandler as it covers almost all type of changes in other classes as well.* *In TestShuffleHandler, the test code was changed by any of the justifications listed down below.* h3. Change category #1: General API changes / non-configuration getters: Details: [https://netty.io/wiki/new-and-noteworthy-in-4.0.html#general-api-changes] {quote}Non-configuration getters have no get- prefix anymore. (e.g. Channel.getRemoteAddress() → Channel.remoteAddress()) Boolean properties are still prefixed with is- to avoid confusion (e.g. 'empty' is both an adjective and a verb, so empty() can have two meanings.) {quote} I'm just listing all the changes without additional context (in which method they were changed) separated by three dots, as they are simply method renamings: {code:java} -future.getChannel().close(); +future.channel().closeFuture().awaitUninterruptibly(); ... ... - ChannelPipeline pipeline = future.getChannel().getPipeline(); + ChannelPipeline pipeline = future.channel().pipeline(); ... ... -port = ((InetSocketAddress)ch.getLocalAddress()).getPort(); +port = ((InetSocketAddress)ch.localAddress()).getPort(); ... ... - if (e.getState() == IdleState.WRITER_IDLE && enabledTimeout) { -e.getChannel().close(); + if (e.state() == IdleState.WRITER_IDLE && enabledTimeout) { +ctx.channel().close(); ... ... - accepted.add(evt.getChannel()); + accepted.add(ctx.channel()); ... ... -new QueryStringDecoder(request.getUri()).getParameters(); +new QueryStringDecoder(request.getUri()).parameters(); //getUri was not changed, see this later ... ... - Channel ch = evt.getChannel(); - ChannelPipeline pipeline = ch.getPipeline(); + Channel ch = ctx.channel(); + ChannelPipeline pipeline = ch.pipeline(); ... ... - reduceContext.getCtx().getChannel(), + reduceContext.getCtx().channel(), ... ... - if (ch.getPipeline().get(SslHandler.class) == null) { + if (ch.pipeline().get(SslHandler.class) == null) { ... ... - Channel ch = evt.getChannel(); - ChannelPipeline pipeline = ch.getPipeline(); + Channel ch = ctx.channel(); + ChannelPipeline pipeline = ch.pipeline(); ... ... - ctx.getChannel().write(response).addListener(ChannelFutureListener.CLOSE); + ctx.channel().write(response).addListener(ChannelFutureListener.CLOSE); ... ... - Channel ch = e.getChannel(); - Throwable cause = e.getCause(); + Channel ch = ctx.channel(); {code} h3. Change category #2: General API changes / Method signature changes. *2.1: SimpleChannelUpstreamHandler was renamed to ChannelInboundHandlerAdapter.* [https://netty.io/wiki/new-and-noteworthy-in-4.0.html#upstream--inbound-downstream--outbound] {quote}The terms 'upstream' and 'downstream' were pretty confusing to beginners. 4.0 uses 'inbound' and 'outbound' wherever possible. {quote} {code:java} - class Shuffle extends SimpleChannelUpstreamHandler { + @ChannelHandler.Sharable + class Shuffle extends ChannelInboundHandlerAdapter { {code} *2.2: Simplifed channel state model: [https://netty.io/wiki/new-and-noteworthy-in-4.0.html#simplified-channel-state-model]* {quote}channelOpen, channelBound, and channelConnected have been merged to channelActive. channelDisconnected, channelUnbound, and channelClosed have been merged to channelInactive. Likewise, Channel.isBound() and isConnected() have been merged to isActive(). {quote} *2.2.1 Changes in class: Shuffle* {code:java} @Override -public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent evt) +public void channelActive(ChannelHandlerContext ctx) throws Exception { - super.channelOpen(ctx, evt); + super.channelActive(ctx); {code} *2.2.2 Changes in org.apache.hadoop.mapred.ShuffleHandler.Shuffle#exceptionCaught:* Quoting the change again: {quote}channelOpen, channelBound, and channelConnected have been merged to channelActive. channelDisconnected, channelUnbound, and channelClosed have been merged to channelInactive. Likewise, Channel.isBound() and isConnected() have been merged to isActive(). {quote} {code:java} LOG.error("Shuffle error: ", cause); - if (ch.isConnected()) { -LOG.error("Shuffle error " + e); +
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357627#comment-17357627 ] Szilard Nemeth commented on HADOOP-15327: - Thanks [~weichiu], Will use skipShade temporarily then will check how to resolve the shading issue once all code issues are fixed and in place. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356433#comment-17356433 ] Wei-Chiu Chuang commented on HADOOP-15327: -- Thanks for reporting the issue... I usually skip shading (-DskipShade) to build during development so didn't notice it. We probably need to update our shading rule to exclude netty (shading and relocating netty is a delicate art) In addition to MR unit tests, if you can stand up a cluster, run the MR shuffler with the netty leak detection option (java -Dio.netty.leakDetection.level=advanced) enabled, we'll have more confidence in it. https://github.com/apache/hadoop/pull/2832#issuecomment-818602960 > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356401#comment-17356401 ] Szilard Nemeth commented on HADOOP-15327: - Hmm, can't attach the file, probably it's a permission issue. I don't want to paste 2000+ lines here. Uploaded the file to my personal Google Drive: https://drive.google.com/file/d/1-ovH8snqTS73oLNsxtwgrvaDVynOBIP7/view?usp=sharing > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Szilard Nemeth >Priority: Major > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356377#comment-17356377 ] Szilard Nemeth commented on HADOOP-15327: - Hi [~weichiu], I took over this Jira as it was unassigned, I hope it's not a problem. The plan is to continue the work based on your code changes until all the UT issues are fixed. Did you have any plan for testing? Is it enough to test basic MR-based jobs with shuffling? One more thing: The latest version on your branch ([https://github.com/jojochuang/hadoop/commits/shuffle_handler_netty4)] produced Maven enforcer issues for me. The command I'm using to build Hadoop (from its root): {code} mvn clean install -Pdist -DskipTests -Dmaven.javadoc.skip=true -e | tee /tmp/maven_out {code} Please see the attached output file. Did you encounter similar build issues? I can see the latest commit is form March 2021 so it's not that old so I don't assume the build system + enforcing rules were changed. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Priority: Major > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15327) Upgrade MR ShuffleHandler to use Netty4
[ https://issues.apache.org/jira/browse/HADOOP-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312744#comment-17312744 ] Wei-Chiu Chuang commented on HADOOP-15327: -- Here's my WIP patch for this migration: https://github.com/jojochuang/hadoop/commits/shuffle_handler_netty4 The TestShuffleHandler is having a few failures. The testMaxConnections UT failure is apparently caused by either a bad test or a behavior change in netty. The test expects connections to be received sequentially but connections are actually received in parallel. No order can be assumed. Other UTs in TestShuffleHandler are caused by socket timeouts. The clients receive the header of the response from server, but not the content. I suspected it was caused by Channel.write() API where it doesn't automatically reflush after write in netty4. I switched to writeAndFlush() but problems still persist. > Upgrade MR ShuffleHandler to use Netty4 > --- > > Key: HADOOP-15327 > URL: https://issues.apache.org/jira/browse/HADOOP-15327 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Wei-Chiu Chuang >Priority: Major > > This way, we can remove the dependencies on the netty3 (jboss.netty) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org