[ https://issues.apache.org/jira/browse/TEZ-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15590836#comment-15590836 ]
Bikas Saha commented on TEZ-3222: --------------------------------- {code} - return commonRouteMeta[sourceTaskIndex]; + return CompositeEventRouteMetadata.create(1, sourceTaskIndex, 0); {code} The removed code is looking up an array indexed by sourceTaskIndex while the new code is directly using the sourceTaskIndex. Is there a difference? Also, reusing the caching (as done earlier) may improve critical path CPU for object creation. Though for broadcast edge I am not sure if CDME is used as of now. {code} +message CompositeRoutedDataMovementEventProto { + optional int32 source_index = 1; + optional int32 target_index = 2; + optional int32 count = 3; + optional bytes user_payload = 4; + optional int32 version = 5; +}{code} Can we create a message for CompositeRouteMeta and use it vs expanding its contents. That way CompositeRouteMeta could evolve independently. {code} if (event instanceof DataMovementEvent) { numDmeEvents.incrementAndGet(); - processDataMovementEvent((DataMovementEvent)event); + DataMovementEvent dmEvent = (DataMovementEvent)event; + DataMovementEventPayloadProto shufflePayload; + try { + shufflePayload = DataMovementEventPayloadProto.parseFrom(ByteString.copyFrom(dmEvent.getUserPayload())); + } catch (InvalidProtocolBufferException e) { + throw new TezUncheckedException("Unable to parse DataMovementEvent payload", e); + } + BitSet emptyPartitionsBitSet = null; + if (shufflePayload.hasEmptyPartitions()) { + try { + byte[] emptyPartitions = TezCommonUtils.decompressByteStringToByteArray(shufflePayload.getEmptyPartitions(), inflater); {code} I dont think DME's dont have empty partition bitset since they dont have multi-partition data. Right? [~rajesh.balamohan] Rest looks good to me. +1 Thanks! > Reduce messaging overhead for auto-reduce parallelism case > ---------------------------------------------------------- > > Key: TEZ-3222 > URL: https://issues.apache.org/jira/browse/TEZ-3222 > Project: Apache Tez > Issue Type: Bug > Reporter: Jonathan Eagles > Assignee: Jonathan Eagles > Attachments: TEZ-3222.1.patch, TEZ-3222.2.patch, TEZ-3222.3.patch, > TEZ-3222.4.patch, TEZ-3222.5.patch, TEZ-3222.6.patch > > > A dag with 15k x 1000k vertex may auto-reduce to 15k x 1. And while the data > size is appropriate for 1 task attempt, this results in an increase in task > attempt message processing of 1000x. > This jira aims to reduce the message processing in the auto-reduced task > while keeping the amount of message processing in the AM the same or less. -- This message was sent by Atlassian JIRA (v6.3.4#6332)