[jira] [Updated] (TEZ-3165) Allow Inputs/Outputs to be initialized serially, control processor initialization relative to Inputs/Outputs
[ https://issues.apache.org/jira/browse/TEZ-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3165:

    Attachment: TEZ-3165.4-branch-0.7.patch

> Allow Inputs/Outputs to be initialized serially, control processor initialization relative to Inputs/Outputs
>
> Key: TEZ-3165
> URL: https://issues.apache.org/jira/browse/TEZ-3165
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Attachments: TEZ-3165.1.patch, TEZ-3165.2.patch, TEZ-3165.3.patch, TEZ-3165.4-branch-0.7.patch, TEZ-3165.4.patch
>
> 2016-03-13 23:55:17,162 [INFO] [main] |runtime.LogicalIOProcessorRuntimeTask|: Initializing LogicalIOProcessorRuntimeTask with TaskSpec: DAGName : PigLatin:Script.pig-0_scope-0, VertexName: scope-203, VertexParallelism: 2707, TaskAttemptID:attempt_1, processorName=org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor, inputSpecListSize=1, outputSpecListSize=1, inputSpecList=[{{ sourceVertexName=scope-0, physicalEdgeCount=1, inputClassName=org.apache.tez.mapreduce.input.MRInput }}, ], outputSpecList=[{{ destinationVertexName=scope-28, physicalEdgeCount=0, outputClassName=org.apache.tez.mapreduce.output.MROutput }}, ]
> 2016-03-13 23:55:17,164 [INFO] [main] |resources.MemoryDistributor|: InitialMemoryDistributor (isEnabled=true) invoked with: numInputs=1, numOutputs=1, JVM.maxFree=1059061760, allocatorClassName=org.apache.tez.runtime.library.resources.WeightedScalingMemoryDistributor
> 2016-03-13 23:55:17,175 [INFO] [TezChild] |task.TezTaskRunner|: Initializing task, taskAttemptId=attempt_1
> 2016-03-13 23:55:17,182 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Routing events from heartbeat response to task, currentTaskAttemptId=attempt_1, eventCount=1 fromEventId=0 nextFromEventId=0
> 2016-03-13 23:55:17,212 [INFO] [I/O Setup 1 Initialize: {scope-28}] |Configuration.deprecation|: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
> 2016-03-13 23:55:17,214 [INFO] [I/O Setup 1 Initialize: {scope-28}] |Configuration.deprecation|: fs.default.name is deprecated. Instead, use fs.defaultFS
> 2016-03-13 23:55:17,223 [INFO] [I/O Setup 1 Initialize: {scope-28}] |counters.Limits|: Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=1000, COUNTER_NAME_MAX=128, MAX_COUNTERS=5000
> 2016-03-13 23:55:17,228 [INFO] [I/O Setup 0 Initialize: {scope-0}] |input.MRInput|: scope-0 using newmapreduce API=true, split via event=true, numPhysicalInputs=1
> 2016-03-13 23:55:17,233 [INFO] [I/O Setup 0 Initialize: {scope-0}] |input.MRInput|: Initialized MRInput: scope-0
> 2016-03-13 23:55:17,345 [INFO] [TezChild] |data.SchemaTupleBackend|: Key [pig.schematuple] was not set... will not generate code.
> 2016-03-13 23:55:17,400 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-03-13 23:55:17,400 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-03-13 23:55:17,400 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-03-13 23:55:17,400 [INFO] [TezChild] |task.TezTaskRunner|: Encounted an error while executing task: attempt_1
> java.lang.RuntimeException: could not instantiate 'com.twitter.elephantbird.pig.store.SequenceFileStorage' with arguments '[-c com.twitter.elephantbird.pig.util.TextConverter, -c com.twitter.elephantbird.pig.util.TextConverter]'
>     at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:766)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getStoreFunc(POStore.java:250)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:76)
>     at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigOutputFormatTez.getRecordWriter(PigOutputFormatTez.java:43)
>     at org.apache.tez.mapreduce.output.MROutput.initialize(MROutput.java:399)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable._callInternal(LogicalIOProcessorRuntimeTask.java:506)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:489)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:474)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at ja
[jira] [Updated] (TEZ-3165) Allow Inputs/Outputs to be initialized serially, control processor initialization relative to Inputs/Outputs
[ https://issues.apache.org/jira/browse/TEZ-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3165:

    Attachment: TEZ-3165.4.patch

Thanks for the review [~hitesh] and [~sseth]. Addressed comments and started the commit process.

> Allow Inputs/Outputs to be initialized serially, control processor initialization relative to Inputs/Outputs
>
> Key: TEZ-3165
> URL: https://issues.apache.org/jira/browse/TEZ-3165
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Attachments: TEZ-3165.1.patch, TEZ-3165.2.patch, TEZ-3165.3.patch, TEZ-3165.4.patch
[jira] [Updated] (TEZ-3165) Allow Inputs/Outputs to be initialized serially, control processor initialization relative to Inputs/Outputs
[ https://issues.apache.org/jira/browse/TEZ-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-3165:

    Summary: Allow Inputs/Outputs to be initialized serially, control processor initialization relative to Inputs/Outputs (was: Parallel initialization of inputs, outputs, and processor can cause NoSuchMethodException)

> Allow Inputs/Outputs to be initialized serially, control processor initialization relative to Inputs/Outputs
>
> Key: TEZ-3165
> URL: https://issues.apache.org/jira/browse/TEZ-3165
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Attachments: TEZ-3165.1.patch, TEZ-3165.2.patch, TEZ-3165.3.patch