[ https://issues.apache.org/jira/browse/TEZ-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246728#comment-14246728 ]
Prakash Ramachandran commented on TEZ-1852: ------------------------------------------- The OrderedPartitionedKVEdgeConfig will not pick up this config and hence wont be set in the userpayload. See TEZ-1587 for details. Since setFromConfiguration seemed to be unnecessary it was removed in TEZ-1614. would need to look at how to pass the conf values to the OrderedPartitionedKVEdgeConfig better. [~sseth] / [~bikassaha] any thoughts on how to proceed with this one? > WordCount in local mode hangs in shuffle phase > ---------------------------------------------- > > Key: TEZ-1852 > URL: https://issues.apache.org/jira/browse/TEZ-1852 > Project: Apache Tez > Issue Type: Bug > Reporter: Jeff Zhang > > WordCount hangs at the progress of 50%, it looks like something wrong in the > shuffle phase. [~pramachandran], could you take at look at this ? I remember > you had fixed this issue before. But now it comes out again, I verify that it > works in 0.5.1, but it doesn't work starting from 0.5.2. > here's the log in when I run wordcount > {code} > 2014-12-15 12:19:33,874 INFO [Initializer 1] > runtime.LogicalIOProcessorRuntimeTask > (LogicalIOProcessorRuntimeTask.java:call(434)) - Initialized Output with dest > edge: Output > 2014-12-15 12:19:33,874 INFO [TezChild] > runtime.LogicalIOProcessorRuntimeTask > (LogicalIOProcessorRuntimeTask.java:initialize(223)) - All initializers > finished > 2014-12-15 12:19:33,874 INFO [TezChild] resources.MemoryDistributor > (MemoryDistributor.java:makeInitialAllocations(124)) - Using Allocator class: > org.apache.tez.runtime.library.resources.WeightedScalingMemoryDistributor > 2014-12-15 12:19:33,874 INFO [TezChild] > resources.WeightedScalingMemoryDistributor > (WeightedScalingMemoryDistributor.java:getRequestTypeForClass(197)) - Falling > back to RequestType.OTHER for class: org.apache.tez.mapreduce.output.MROutput > 2014-12-15 12:19:33,874 INFO [TezChild] > resources.WeightedScalingMemoryDistributor > (WeightedScalingMemoryDistributor.java:initialProcessMemoryRequestContext(171)) > - ScaleFactor: 1, for type: OTHER > 2014-12-15 12:19:33,874 INFO [TezChild] > resources.WeightedScalingMemoryDistributor > (WeightedScalingMemoryDistributor.java:initialProcessMemoryRequestContext(171)) > - ScaleFactor: 12, for type: SORTED_MERGED_INPUT > 2014-12-15 12:19:33,875 INFO [TezChild] > resources.WeightedScalingMemoryDistributor > (WeightedScalingMemoryDistributor.java:computeReservedFraction(262)) - > InitialReservationFraction=0.3, AdditionalReservationFractionForIOs=0.03, > finalReserveFractionUsed=0.32999999999999996 > 2014-12-15 12:19:33,875 INFO [TezChild] > resources.WeightedScalingMemoryDistributor > (WeightedScalingMemoryDistributor.java:assignMemory(126)) - Scaling Requests. > NumRequests: 2, numScaledRequests: 13, TotalRequested: 859255552, > TotalRequestedScaled: 7.931589710769231E8, TotalJVMHeap: 954728448, > TotalAvailable: 639668060, TotalRequested/TotalJVMHeap:0.90 > 2014-12-15 12:19:33,875 INFO [TezChild] resources.MemoryDistributor > (MemoryDistributor.java:makeInitialAllocations(138)) - Informing: OUTPUT, > Output, org.apache.tez.mapreduce.output.MROutput: requested=0, allocated=0 > 2014-12-15 12:19:33,875 INFO [TezChild] resources.MemoryDistributor > (MemoryDistributor.java:makeInitialAllocations(138)) - Informing: INPUT, > Tokenizer, org.apache.tez.runtime.library.input.OrderedGroupedKVInput: > requested=859255552, allocated=639668060 > 2014-12-15 12:19:33,875 INFO [TezChild] > runtime.LogicalIOProcessorRuntimeTask > (LogicalIOProcessorRuntimeTask.java:initialize(245)) - Starting Inputs/Outputs > 2014-12-15 12:19:33,875 INFO [TezChild] > runtime.LogicalIOProcessorRuntimeTask > (LogicalIOProcessorRuntimeTask.java:initialize(258)) - Input: Tokenizer being > auto started by the framework. Subsequent instances will not be auto-started > 2014-12-15 12:19:33,875 INFO [Initializer 0] > runtime.LogicalIOProcessorRuntimeTask > (LogicalIOProcessorRuntimeTask.java:call(402)) - Starting Input with src > edge: Tokenizer > 2014-12-15 12:19:33,875 INFO [TezChild] > runtime.LogicalIOProcessorRuntimeTask > (LogicalIOProcessorRuntimeTask.java:initialize(279)) - Num IOs determined for > AutoStart: 1 > 2014-12-15 12:19:33,875 INFO [TezChild] > runtime.LogicalIOProcessorRuntimeTask > (LogicalIOProcessorRuntimeTask.java:initialize(281)) - Waiting for 1 IOs to > start > 2014-12-15 12:19:33,882 INFO [Initializer 0] common.TezRuntimeUtils > (TezRuntimeUtils.java:instantiateCombiner(65)) - No combiner specified via > tez.runtime.combiner.class. Combiner will not be used > 2014-12-15 12:19:33,883 INFO [Initializer 0] orderedgrouped.Shuffle > (Shuffle.java:<init>(179)) - Shuffle assigned with 1 inputs, codec: > NoneifileReadAhead: true > 2014-12-15 12:19:33,886 INFO [Initializer 0] orderedgrouped.ShuffleScheduler > (ShuffleScheduler.java:<init>(150)) - ShuffleScheduler running for > sourceVertex: Tokenizer with configuration: > maxFetchFailuresBeforeReporting=5, reportReadErrorImmediately=true, > maxFailedUniqueFetches=1, abortFailureLimit=30, maxMapRuntime=0 > 2014-12-15 12:19:33,887 INFO [Initializer 0] orderedgrouped.MergeManager > (MergeManager.java:<init>(221)) - InitialRequest: ShuffleMem=859255552, > postMergeMem=0, RuntimeTotalAvailable=639668060. Updated to: > ShuffleMem=639668060, postMergeMem: 0 > 2014-12-15 12:19:33,887 INFO [Initializer 0] orderedgrouped.MergeManager > (MergeManager.java:<init>(252)) - MergerManager: memoryLimit=639668060, > maxSingleShuffleLimit=159917008, mergeThreshold=575701184, ioSortFactor=100, > memToMemMergeOutputsThreshold=100 > 2014-12-15 12:19:33,890 INFO [Initializer 0] orderedgrouped.Shuffle > (Shuffle.java:<init>(226)) - Num fetchers being started: 1 > 2014-12-15 12:19:33,891 INFO [Initializer 0] > runtime.LogicalIOProcessorRuntimeTask > (LogicalIOProcessorRuntimeTask.java:call(404)) - Started Input with src edge: > Tokenizer > 2014-12-15 12:19:33,891 INFO [TezChild] > runtime.LogicalIOProcessorRuntimeTask > (LogicalIOProcessorRuntimeTask.java:initialize(294)) - AutoStartComplete > 2014-12-15 12:19:33,891 INFO [TezChild] task.TezTaskRunner > (TezTaskRunner.java:run(175)) - Running task, > taskAttemptId=attempt_1418617172198_0001_1_01_000000_0 > 2014-12-15 12:19:33,891 INFO > [TezTaskEventRouter[attempt_1418617172198_0001_1_01_000000_0]] > orderedgrouped.ShuffleInputEventHandlerOrderedGrouped > (ShuffleInputEventHandlerOrderedGrouped.java:processDataMovementEvent(83)) - > DME srcIdx: 0, targetIdx: 0, attemptNum: 0, payload: [host: jzhangMBPr.local, > port: 0, pathComponent: attempt_1418617172198_0001_1_00_000000_0_10003, > runDuration: 327458, ] > 2014-12-15 12:19:34,311 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - DAG: State: RUNNING Progress: 50% TotalTasks: > 2 Succeeded: 1 Running: 1 Failed: 0 Killed: 0 > 2014-12-15 12:19:34,312 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - VertexStatus: VertexName: Tokenizer > Progress: 100% TotalTasks: 1 Succeeded: 1 Running: 0 Failed: 0 Killed: 0 > 2014-12-15 12:19:34,313 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - VertexStatus: VertexName: Summation > Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 1 Failed: 0 Killed: 0 > 2014-12-15 12:19:39,330 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - DAG: State: RUNNING Progress: 50% TotalTasks: > 2 Succeeded: 1 Running: 1 Failed: 0 Killed: 0 > 2014-12-15 12:19:39,331 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - VertexStatus: VertexName: Tokenizer > Progress: 100% TotalTasks: 1 Succeeded: 1 Running: 0 Failed: 0 Killed: 0 > 2014-12-15 12:19:39,331 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - VertexStatus: VertexName: Summation > Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 1 Failed: 0 Killed: 0 > 2014-12-15 12:19:44,348 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - DAG: State: RUNNING Progress: 50% TotalTasks: > 2 Succeeded: 1 Running: 1 Failed: 0 Killed: 0 > 2014-12-15 12:19:44,349 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - VertexStatus: VertexName: Tokenizer > Progress: 100% TotalTasks: 1 Succeeded: 1 Running: 0 Failed: 0 Killed: 0 > 2014-12-15 12:19:44,350 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - VertexStatus: VertexName: Summation > Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 1 Failed: 0 Killed: 0 > 2014-12-15 12:19:49,365 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - DAG: State: RUNNING Progress: 50% TotalTasks: > 2 Succeeded: 1 Running: 1 Failed: 0 Killed: 0 > 2014-12-15 12:19:49,367 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - VertexStatus: VertexName: Tokenizer > Progress: 100% TotalTasks: 1 Succeeded: 1 Running: 0 Failed: 0 Killed: 0 > 2014-12-15 12:19:49,369 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - VertexStatus: VertexName: Summation > Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 1 Failed: 0 Killed: 0 > 2014-12-15 12:19:54,388 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - DAG: State: RUNNING Progress: 50% TotalTasks: > 2 Succeeded: 1 Running: 1 Failed: 0 Killed: 0 > 2014-12-15 12:19:54,390 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - VertexStatus: VertexName: Tokenizer > Progress: 100% TotalTasks: 1 Succeeded: 1 Running: 0 Failed: 0 Killed: 0 > 2014-12-15 12:19:54,392 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - VertexStatus: VertexName: Summation > Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 1 Failed: 0 Killed: 0 > 2014-12-15 12:19:59,406 INFO [main] client.DAGClientImpl > (DAGClientImpl.java:log(463)) - DAG: State: RUNNING Progress: 50% TotalTasks: > 2 Succeeded: 1 Running: 1 Failed: 0 Killed: 0 > {code} > You can reproduce this by adding these 3 lines of code in WordCount > {code} > tezConf.setBoolean(TezConfiguration.TEZ_LOCAL_MODE, true); > tezConf.set("fs.defaultFS", "file:///"); > > tezConf.setBoolean(TezRuntimeConfiguration.TEZ_RUNTIME_OPTIMIZE_LOCAL_FETCH, > true); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)