[jira] [Created] (TEZ-1619) Upgrade Hadoop dependency to 2.5

2014-09-24 Thread Bikas Saha (JIRA)
Bikas Saha created TEZ-1619:
---

 Summary: Upgrade Hadoop dependency to 2.5
 Key: TEZ-1619
 URL: https://issues.apache.org/jira/browse/TEZ-1619
 Project: Apache Tez
  Issue Type: Task
Reporter: Bikas Saha
Assignee: Bikas Saha






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TEZ-1620) Wait for application finish before stopping MiniTezCluster

2014-09-24 Thread Jeff Zhang (JIRA)
Jeff Zhang created TEZ-1620:
---

 Summary: Wait for application finish before stopping MiniTezCluster
 Key: TEZ-1620
 URL: https://issues.apache.org/jira/browse/TEZ-1620
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang
Assignee: Jeff Zhang


Currently, we sleep for 10 seconds to wait for DAGAppMaster to finish; otherwise 
DAGAppMaster hangs while trying to connect to the RM to unregister. 
We should wait for all applications to finish before stopping MiniTezCluster. 
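
One way to replace a fixed sleep with a bounded wait is sketched below. This is only an illustration, not the MiniTezCluster implementation: the class `WaitForApps`, the method `waitUntil`, and the `allFinished` condition are hypothetical names, and the condition is a stand-in for "every submitted application has reached a terminal state".

```java
import java.util.function.BooleanSupplier;

class WaitForApps {
    /**
     * Polls the (hypothetical) allFinished condition until it holds or the
     * timeout elapses. Returns true if the condition held, false on timeout
     * or if the waiting thread was interrupted.
     */
    public static boolean waitUntil(BooleanSupplier allFinished, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!allFinished.getAsBoolean()) {
            if (System.nanoTime() >= deadline) {
                return false; // give up instead of hanging forever
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: pretend the applications finish after about 50 ms.
        BooleanSupplier done = () -> System.currentTimeMillis() - start >= 50;
        System.out.println("all finished: " + waitUntil(done, 1000, 10));
    }
}
```

Compared with a fixed 10-second sleep, a bounded poll returns as soon as the condition holds and still fails fast if it never does.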





[jira] [Updated] (TEZ-1568) Add system test for propagation of diagnostics for errors

2014-09-24 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-1568:

Attachment: Tez-1568-8.patch

 Add system test for propagation of diagnostics for errors
 -

 Key: TEZ-1568
 URL: https://issues.apache.org/jira/browse/TEZ-1568
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Attachments: TEZ-1568.patch, Tez-1568-2.patch, Tez-1568-3.patch, 
 Tez-1568-4.patch, Tez-1568-5.patch, Tez-1568-6.patch, Tez-1568-7.patch, 
 Tez-1568-8.patch


 Design a system test where exceptions come from Input, Output, Processor, 
 InputInitializer and VertexManagerPlugin.





[jira] [Commented] (TEZ-1568) Add system test for propagation of diagnostics for errors

2014-09-24 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145986#comment-14145986
 ] 

Jeff Zhang commented on TEZ-1568:
-

[~bikassaha] Attached a new patch with the comments refined.

I kept the 10-second sleep, because removing it causes DAGAppMaster to hang. 
I created [TEZ-1620|https://issues.apache.org/jira/browse/TEZ-1620] for this 
issue. 



[jira] [Commented] (TEZ-1608) TopK example

2014-09-24 Thread Krisztian Horvath (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146038#comment-14146038
 ] 

Krisztian Horvath commented on TEZ-1608:


I received a few suggested improvements, which I'm going to apply:
Also, for top K, can the sum task maintain a local top K and output only that 
much, so that the writer can pick the global top K from the local top Ks? That 
would reduce the data transfer quite a bit. We may then be able to use an 
UnorderedKVEdge instead of an OrderedPartitionedKVEdge, which would avoid the 
need to sort at the output and merge-sort at the input.

 TopK example
 

 Key: TEZ-1608
 URL: https://issues.apache.org/jira/browse/TEZ-1608
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: 0.5.0
Reporter: Janos Matyas
 Attachments: TEZ-1608-1.patch


 The goal of this sample is to find the top K elements of a dataset, while 
 guiding the reader through the basics of Tez (DAG creation, tokenizers, custom 
 comparators and parallelism). 
 An example use case for top K:
   Given a large data set in CSV format of user comments on a site, listed as 
 userid,postid,commentid,comment,timestamp, find the top K 
 commenters or the posts with the most comments. 





[jira] [Commented] (TEZ-1612) Pig on tez unit test intermittent hang

2014-09-24 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146265#comment-14146265
 ] 

Rajesh Balamohan commented on TEZ-1612:
---

Missed the fact that both the vertexmanagers are ShuffleVertexManagers in this 
case.  Yes, the fix in master wasn't made for ShuffleVertexManager.  Daniel, 
can you please post the logs with the latest run?

 Pig on tez unit test intermittent hang
 --

 Key: TEZ-1612
 URL: https://issues.apache.org/jira/browse/TEZ-1612
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Daniel Dai
Assignee: Bikas Saha
 Attachments: DAG1.png, syslog_dag_1411413615885_0001_1, 
 testfail1.log.tar.gz


 Several Pig unit tests hang intermittently. For example, 
 TestNewPlanImplicitSplit.testImplicitSplitInCoGroup, which is a DAG of 4 
 nodes:
 !DAG1.png!
 It uses auto-parallelism; vertex 106 changes parallelism from 2 to 1, and vertex 
 107 from 21 to 1.
 Log attached.





[jira] [Updated] (TEZ-1608) TopK example

2014-09-24 Thread Krisztian Horvath (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Horvath updated TEZ-1608:
---
Attachment: TEZ-1608-2.patch

 TopK example
 

 Key: TEZ-1608
 URL: https://issues.apache.org/jira/browse/TEZ-1608
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: 0.5.0
Reporter: Janos Matyas
 Attachments: TEZ-1608-1.patch, TEZ-1608-2.patch




[jira] [Commented] (TEZ-1608) TopK example

2014-09-24 Thread Krisztian Horvath (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146322#comment-14146322
 ] 

Krisztian Horvath commented on TEZ-1608:


In the second version every Sum processor maintains a local top K and only 
writes these values when the task finishes, so the writer only has to select 
the top K from the local top Ks. The implementation is a bit more complex than 
the previous one.
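
The local/global selection described above can be sketched with a bounded min-heap. This is a simplified illustration and not the code in the attached patch: the class `LocalTopK` and its methods are hypothetical, and it assumes each key is fully summed within a single task, so that a key's final count appears in exactly one local list.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class LocalTopK {
    /** Convenience factory for a (key, count) pair. */
    public static Map.Entry<String, Integer> entry(String key, int count) {
        return new AbstractMap.SimpleEntry<>(key, count);
    }

    /** Returns up to k entries with the largest counts, highest count first. */
    public static List<Map.Entry<String, Integer>> topK(
            Iterable<Map.Entry<String, Integer>> entries, int k) {
        List<Map.Entry<String, Integer>> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        Comparator<Map.Entry<String, Integer>> byCount =
                (a, b) -> Integer.compare(a.getValue(), b.getValue());
        // Min-heap on count: the root is the weakest candidate kept so far.
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(byCount);
        for (Map.Entry<String, Integer> e : entries) {
            heap.offer(e);
            if (heap.size() > k) {
                heap.poll(); // evict the current smallest, keeping at most k entries
            }
        }
        result.addAll(heap);
        result.sort(byCount.reversed());
        return result;
    }

    public static void main(String[] args) {
        int k = 2;
        // Each "sum task" emits only its local top K when it finishes.
        List<Map.Entry<String, Integer>> local1 =
                topK(Arrays.asList(entry("alice", 7), entry("bob", 3), entry("carol", 5)), k);
        List<Map.Entry<String, Integer>> local2 =
                topK(Arrays.asList(entry("dave", 9), entry("erin", 1)), k);
        // The writer selects the global top K from the local top Ks.
        List<Map.Entry<String, Integer>> merged = new ArrayList<>(local1);
        merged.addAll(local2);
        System.out.println(topK(merged, k));
    }
}
```

Each task forwards at most K records, so the writer sees at most K times the number of sum tasks records, and the order in which they arrive does not matter.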



[jira] [Commented] (TEZ-1568) Add system test for propagation of diagnostics for errors

2014-09-24 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146864#comment-14146864
 ] 

Bikas Saha commented on TEZ-1568:
-

Ok. I will commit this and we should remove it in TEZ-1620.



[jira] [Commented] (TEZ-1619) Upgrade Hadoop dependency to 2.5

2014-09-24 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146964#comment-14146964
 ] 

Jonathan Eagles commented on TEZ-1619:
--

+1. This change looks good. Let's ping the Pig and Hive teams to make 
sure we are in alignment.

 Upgrade Hadoop dependency to 2.5
 

 Key: TEZ-1619
 URL: https://issues.apache.org/jira/browse/TEZ-1619
 Project: Apache Tez
  Issue Type: Task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-1619.1.patch








[jira] [Commented] (TEZ-1612) Pig on tez unit test intermittent hang

2014-09-24 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147016#comment-14147016
 ] 

Daniel Dai commented on TEZ-1612:
-

What do you mean by the latest run? A run with master?



[jira] [Commented] (TEZ-1612) Pig on tez unit test intermittent hang

2014-09-24 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147072#comment-14147072
 ] 

Bikas Saha commented on TEZ-1612:
-

Yes, for the same test whose DAG picture you attached above.



[jira] [Created] (TEZ-1621) Actual error message not thrown on console, does appear in the YARN application log

2014-09-24 Thread Deepesh Khandelwal (JIRA)
Deepesh Khandelwal created TEZ-1621:
---

 Summary: Actual error message not thrown on console, does appear 
in the YARN application log
 Key: TEZ-1621
 URL: https://issues.apache.org/jira/browse/TEZ-1621
 Project: Apache Tez
  Issue Type: Bug
Reporter: Deepesh Khandelwal


While running an in-session testorderedwordcount example, the DAG failed with 
the following error on the console:
{noformat}
14/09/25 01:55:53 INFO examples.TestOrderedWordCount: DAG 1 diagnostics: 
[Vertex failed, vertexName=initialmap, vertexId=vertex_1411586515507_0110_1_00, 
diagnostics=[Task failed, taskId=task_1411586515507_0110_1_00_00, 
diagnostics=[TaskAttempt 0 failed, info=[Container 
container_1411586515507_0110_01_02 finished with diagnostics set to 
[Container failed. Exception from container-launch.
Container id: container_1411586515507_0110_01_02
Exit code: 255
Stack trace: ExitCodeException exitCode=255:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:290)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
This wasn't very helpful; the root cause is in the application log:
{noformat}
2014-09-25 01:55:41,246 ERROR [TezChild] org.apache.tez.runtime.task.TezTaskRunner: Exception of type Error. Exiting now
java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeVerifyChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;J)V
    at org.apache.hadoop.util.NativeCrc32.nativeVerifyChunkedSums(Native Method)
    at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:57)
    at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:291)
    at org.apache.hadoop.hdfs.BlockReaderLocal.fillBuffer(BlockReaderLocal.java:344)
    at org.apache.hadoop.hdfs.BlockReaderLocal.fillDataBuf(BlockReaderLocal.java:444)
    at org.apache.hadoop.hdfs.BlockReaderLocal.readWithBounceBuffer(BlockReaderLocal.java:575)
    at org.apache.hadoop.hdfs.BlockReaderLocal.read(BlockReaderLocal.java:539)
    at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:683)
    at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:739)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:796)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:837)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:149)
    at org.apache.hadoop.mapreduce.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.nextKeyValue(TezGroupedSplitsInputFormat.java:167)
    at org.apache.tez.mapreduce.lib.MRReaderMapReduce.next(MRReaderMapReduce.java:116)
    at org.apache.tez.mapreduce.processor.map.MapProcessor$NewRecordReader.nextKeyValue(MapProcessor.java:266)
    at org.apache.tez.mapreduce.hadoop.mapreduce.MapContextImpl.nextKeyValue(MapContextImpl.java:81)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.tez.mapreduce.processor.map.MapProcessor.runNewMapper(MapProcessor.java:237)
    at org.apache.tez.mapreduce.processor.map.MapProcessor.run(MapProcessor.java:124)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at 

[jira] [Updated] (TEZ-1621) Actual error message not thrown on console, does appear in the YARN application log

2014-09-24 Thread Deepesh Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepesh Khandelwal updated TEZ-1621:

Attachment: console.txt
app_logs.txt

 Actual error message not thrown on console, does appear in the YARN 
 application log
 ---

 Key: TEZ-1621
 URL: https://issues.apache.org/jira/browse/TEZ-1621
 Project: Apache Tez
  Issue Type: Bug
Reporter: Deepesh Khandelwal
 Attachments: app_logs.txt, console.txt



[jira] [Updated] (TEZ-1621) Actual error message not thrown on console, does appear in the YARN application log

2014-09-24 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-1621:

Issue Type: Sub-task  (was: Bug)
Parent: TEZ-1240

 Actual error message not thrown on console, does appear in the YARN 
 application log
 ---

 Key: TEZ-1621
 URL: https://issues.apache.org/jira/browse/TEZ-1621
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Deepesh Khandelwal
 Attachments: app_logs.txt, console.txt



[jira] [Commented] (TEZ-1621) Actual error message not thrown on console, does appear in the YARN application log

2014-09-24 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147364#comment-14147364
 ] 

Bikas Saha commented on TEZ-1621:
-

Not sure why we treat that separately and shut down.

 Actual error message not thrown on console, does appear in the YARN 
 application log
 ---

 Key: TEZ-1621
 URL: https://issues.apache.org/jira/browse/TEZ-1621
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Deepesh Khandelwal
 Attachments: app_logs.txt, console.txt



[jira] [Updated] (TEZ-1612) Pig on tez unit test intermittent hang

2014-09-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated TEZ-1612:

Attachment: runwithmaster.tar.gz

Attached the log directory from a run with master. The test completes successfully.

 Pig on tez unit test intermittent hang
 --

 Key: TEZ-1612
 URL: https://issues.apache.org/jira/browse/TEZ-1612
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Daniel Dai
Assignee: Bikas Saha
 Attachments: DAG1.png, runwithmaster.tar.gz, 
 syslog_dag_1411413615885_0001_1, testfail1.log.tar.gz




[jira] [Comment Edited] (TEZ-1621) Actual error message not thrown on console, does appear in the YARN application log

2014-09-24 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147296#comment-14147296
 ] 

Jeff Zhang edited comment on TEZ-1621 at 9/25/14 5:38 AM:
--

An Error is one kind of exception that causes the TezChild container to shut 
down. We should report the error to the AM before shutting down TezChild.

{code}
  } else if (cause instanceof Error) {
    LOG.error("Exception of type Error. Exiting now", cause);
    ExitUtil.terminate(-1, cause);
  } else {
{code}
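
A minimal sketch of the ordering suggested above (report first, then terminate) follows. It is not the Tez implementation: `ErrorReporter` and `Terminator` are hypothetical interfaces introduced only for illustration, standing in for whatever channel reports failures to the AM and for the process-exit call, so that the sketch runs without a cluster and without exiting the JVM.

```java
import java.util.ArrayList;
import java.util.List;

class ReportThenExit {
    /** Hypothetical: stands in for the channel that reports a fatal error to the AM. */
    interface ErrorReporter {
        void reportFatal(Throwable cause) throws Exception;
    }

    /** Hypothetical: stands in for the call that terminates the container process. */
    interface Terminator {
        void terminate(int status, Throwable cause);
    }

    public static void handleFailure(Throwable cause, ErrorReporter reporter, Terminator terminator) {
        if (cause instanceof Error) {
            try {
                // Report before exiting, so the root cause can reach the client diagnostics.
                reporter.reportFatal(cause);
            } catch (Exception reportFailure) {
                // Best effort only: a failed report must not prevent shutdown.
                cause.addSuppressed(reportFailure);
            }
            terminator.terminate(-1, cause);
        }
        // Non-Error failures are assumed to be handled elsewhere and are ignored here.
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        handleFailure(new UnsatisfiedLinkError("nativeVerifyChunkedSums"),
                c -> events.add("reported " + c.getClass().getSimpleName()),
                (status, c) -> events.add("terminated with status " + status));
        System.out.println(events);
    }
}
```

The report is wrapped in its own try block because the process is already in a bad state; if reporting fails, the container should still exit.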


was (Author: zjffdu):
It is one kind of Exception that cause the TezChild Container shutdown. We 
should report the error to task before shutting down TezChild

{code}
  } else if (cause instanceof Error) {
    LOG.error("Exception of type Error. Exiting now", cause);
    ExitUtil.terminate(-1, cause);
  } else {
{code}

 Actual error message not thrown on console, does appear in the YARN 
 application log
 ---

 Key: TEZ-1621
 URL: https://issues.apache.org/jira/browse/TEZ-1621
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Deepesh Khandelwal
 Attachments: app_logs.txt, console.txt


 While running an in-session testorderedwordcount example, the DAG failed with 
 the following error on the console:
 {noformat}
 14/09/25 01:55:53 INFO examples.TestOrderedWordCount: DAG 1 diagnostics: 
 [Vertex failed, vertexName=initialmap, 
 vertexId=vertex_1411586515507_0110_1_00, diagnostics=[Task failed, 
 taskId=task_1411586515507_0110_1_00_00, diagnostics=[TaskAttempt 0 
 failed, info=[Container container_1411586515507_0110_01_02 finished with 
 diagnostics set to [Container failed. Exception from container-launch.
 Container id: container_1411586515507_0110_01_02
 Exit code: 255
 Stack trace: ExitCodeException exitCode=255:
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
 at org.apache.hadoop.util.Shell.run(Shell.java:455)
 at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
 at 
 org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:290)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 This wasn't very helpful; the root cause is in the application log:
 {noformat}
 2014-09-25 01:55:41,246 ERROR [TezChild] 
 org.apache.tez.runtime.task.TezTaskRunner: Exception of type Error. Exiting 
 now
 java.lang.UnsatisfiedLinkError: 
 org.apache.hadoop.util.NativeCrc32.nativeVerifyChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;J)V
 at org.apache.hadoop.util.NativeCrc32.nativeVerifyChunkedSums(Native 
 Method)
 at 
 org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:57)
 at 
 org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:291)
 at 
 org.apache.hadoop.hdfs.BlockReaderLocal.fillBuffer(BlockReaderLocal.java:344)
 at 
 org.apache.hadoop.hdfs.BlockReaderLocal.fillDataBuf(BlockReaderLocal.java:444)
 at 
 org.apache.hadoop.hdfs.BlockReaderLocal.readWithBounceBuffer(BlockReaderLocal.java:575)
 at 
 org.apache.hadoop.hdfs.BlockReaderLocal.read(BlockReaderLocal.java:539)
 at 
 org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:683)
 at 
 org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:739)
 at 
 org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:796)
 at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:837)
 at java.io.DataInputStream.read(DataInputStream.java:100)
 at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
 at 
 org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
 at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
 at 
 org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:149)
 at 
 org.apache.hadoop.mapreduce.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.nextKeyValue(TezGroupedSplitsInputFormat.java:167)
 at 
 org.apache.tez.mapreduce.lib.MRReaderMapReduce.next(MRReaderMapReduce.java:116)
 at 
 

[jira] [Created] (TEZ-1622) Implement a tez jar equivalent script to avoid the complexities of hadoop jar

2014-09-24 Thread Gopal V (JIRA)
Gopal V created TEZ-1622:


 Summary: Implement a tez jar equivalent script to avoid the 
complexities of hadoop jar
 Key: TEZ-1622
 URL: https://issues.apache.org/jira/browse/TEZ-1622
 Project: Apache Tez
  Issue Type: Bug
Reporter: Gopal V


Currently, the only way to run a Tez job by hand is to set up multiple 
parameters like HADOOP_CLASSPATH and then run hadoop jar {{main-class}}.

This is inconvenient and complex - find an easier way.


