[jira] [Commented] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows

2015-04-21 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504507#comment-14504507
 ] 

Jeff Zhang commented on TEZ-2341:
-

Thanks [~bikassaha]. I checked the log again and it looks like an environment 
issue: winutils.exe is not installed.

{code}
2015-04-17 07:23:38,932 ERROR [IPC Server handler 0 on 55747] util.WindowsBasedProcessTree (WindowsBasedProcessTree.java:getAllProcessInfoFromShell(84)) - ExitCodeException exitCode=2: PrintTaskProcessList error (2): The system cannot find the file specified.
TaskExit: error (2): The system cannot find the file specified.
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.util.WindowsBasedProcessTree.getAllProcessInfoFromShell(WindowsBasedProcessTree.java:81)
    at org.apache.hadoop.yarn.util.WindowsBasedProcessTree.updateProcessTree(WindowsBasedProcessTree.java:125)
    at org.apache.tez.dag.app.DAGAppMaster.getAMCPUTime(DAGAppMaster.java:347)
    at org.apache.tez.dag.app.DAGAppMaster.access$2800(DAGAppMaster.java:190)
    at org.apache.tez.dag.app.DAGAppMaster$RunningAppContext.getCumulativeCPUTime(DAGAppMaster.java:1428)
    at org.apache.tez.dag.app.dag.impl.DAGImpl.<init>(DAGImpl.java:527)
    at org.apache.tez.dag.app.DAGAppMaster.createDAG(DAGAppMaster.java:820)
    at org.apache.tez.dag.app.DAGAppMaster.createDAG(DAGAppMaster.java:798)
    at org.apache.tez.dag.app.DAGAppMaster.startDAG(DAGAppMaster.java:2030)
    at org.apache.tez.dag.app.DAGAppMaster.submitDAGToAppMaster(DAGAppMaster.java:1147)
    at org.apache.tez.dag.api.client.DAGClientHandler.submitDAG(DAGClientHandler.java:118)
    at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.submitDAG(DAGClientAMProtocolBlockingPBServerImpl.java:163)
    at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7471)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
{code}
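
The "cannot find the file specified" error in the log above is consistent with Hadoop being unable to launch winutils.exe. As a rough pre-flight check, the sketch below looks for the binary under `bin` in a Hadoop home directory. It is plain Java and does not call Hadoop itself; the class `WinutilsCheck` and its `locate` method are hypothetical names, and it assumes the home directory is taken from the `hadoop.home.dir` system property or, failing that, the `HADOOP_HOME` environment variable.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of Hadoop or Tez.
final class WinutilsCheck {

    // Returns the path to winutils.exe if it exists under <hadoopHome>/bin.
    static Optional<Path> locate(String hadoopHome) {
        if (hadoopHome == null || hadoopHome.isEmpty()) {
            return Optional.empty();
        }
        Path candidate = Paths.get(hadoopHome, "bin", "winutils.exe");
        return Files.isRegularFile(candidate) ? Optional.of(candidate) : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumption: system property first, environment variable as fallback.
        String home = System.getProperty("hadoop.home.dir", System.getenv("HADOOP_HOME"));
        System.out.println(locate(home)
                .map(p -> "winutils.exe found at " + p)
                .orElse("winutils.exe not found; Windows process-tree calls will likely fail"));
    }
}
```

A test could call `WinutilsCheck.locate(...)` up front and skip itself on Windows hosts where the binary is missing, rather than failing on an assertion later.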

> TestMockDAGAppMaster.testBasicCounters fails on windows
> ---
>
> Key: TEZ-2341
> URL: https://issues.apache.org/jira/browse/TEZ-2341
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Minor
> Attachments: TEZ-2341-1.patch
>
>
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2322) Succeeded count wrong for Pig on Tez job, decreased 380 => 181

2015-04-21 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504560#comment-14504560
 ] 

Jeff Zhang commented on TEZ-2322:
-

[~harisekhon] I checked the log and found that the DAG eventually succeeded. 
The DAG status reported during recovery may be incorrect. This can confuse 
users, and we do need to improve it. 

> Succeeded count wrong for Pig on Tez job, decreased 380 => 181
> --
>
> Key: TEZ-2322
> URL: https://issues.apache.org/jira/browse/TEZ-2322
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Priority: Minor
> Attachments: attempt1_syslog_dag_1427546104095_0146_1, 
> attempt2_syslog, attempt2_syslog_dag_1427546104095_0146_1, 
> attempt2_syslog_dag_1427546104095_0146_1_post
>
>
> During a Pig on Tez job the number of succeeded tasks dropped from 380 => 181 
> as shown below:
> {code}
> 2015-04-15 15:09:56,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 
> Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics=
> 2015-04-15 15:10:16,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 
> Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics=
> 2015-04-15 15:10:36,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 
> Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics=
> 2015-04-15 15:10:56,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 
> 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics=
> 2015-04-15 15:11:16,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 
> 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics=
> 2015-04-15 15:11:36,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 182 Running: 723 Failed: 
> 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics=
> 2015-04-15 15:11:56,993 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 184 Running: 721 Failed: 
> 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics=
> 2015-04-15 15:12:16,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 186 Running: 719 Failed: 
> 0 
> {code}
> Now this may be because the tasks failed, some certainly did due to space 
> exceptions having checked the logs, but surely once a task has finished 
> successfully and is marked as succeeded it cannot then later be removed from 
> the succeeded count? Perhaps the succeeded counter is incremented too early 
> before the task results are really saved?
> KilledTaskAttempts jumped from 16 => 89 at the same time, but even this 
> doesn't account for the large drop in number of succeeded tasks.
> There was also a noticeable jump in Running tasks from 58 => 724 at the same 
> time which is suspicious, I'm pretty sure there was no contending job to 
> finish and release so much more resource to this Tez job, so it's also 
> unclear how the running count could have jumped up so significantly given the 
> cluster hardware resources have been the same throughout.
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon
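
To make the jump described in the report easier to quantify, here is a minimal sketch that parses two "DAG Status" lines and prints the change in each counter. The class `DagProgressDelta` is a hypothetical helper, not part of Pig or Tez, and it assumes each wrapped log entry has first been joined back into a single line.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the "DAG Status" lines quoted above.
final class DagProgressDelta {

    // Matches counters such as "Succeeded: 380" or "KilledTaskAttempts: 16".
    private static final Pattern COUNTER = Pattern.compile("([A-Za-z]+): (\\d+)");

    // Extracts every "Name: number" pair from one status line.
    static Map<String, Integer> parse(String statusLine) {
        Map<String, Integer> counters = new LinkedHashMap<>();
        Matcher m = COUNTER.matcher(statusLine);
        while (m.find()) {
            counters.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return counters;
    }

    // Difference (after - before) for every counter present in both lines.
    static Map<String, Integer> delta(String before, String after) {
        Map<String, Integer> a = parse(before);
        Map<String, Integer> b = parse(after);
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer later = b.get(e.getKey());
            if (later != null) {
                out.put(e.getKey(), later - e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Counter values copied from the 15:10:36 and 15:10:56 entries.
        String before = "progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 "
                + "Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics=";
        String after = "progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 0 "
                + "Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics=";
        System.out.println(delta(before, after));
    }
}
```

For those two entries the deltas are Succeeded -199, Running +666 and KilledTaskAttempts +73. Note also that 181 + 724 = 905, so after the jump every task is counted as either succeeded or running.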





[jira] [Updated] (TEZ-2340) TestRecoveryParser fails

2015-04-21 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2340:

Summary: TestRecoveryParser fails  (was: TestRecoveryParser fails on 
windows)

> TestRecoveryParser fails
> 
>
> Key: TEZ-2340
> URL: https://issues.apache.org/jira/browse/TEZ-2340
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>
> Stacktrace
> {code}
> java.io.IOException: Not supported
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285)
>   at 
> org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138)
> {code}
> Standard Output
> {code}
> 2015-04-17 07:23:55,672 WARN  [main] fs.FileUtil 
> (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
> [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]:
>  it still exists.
> 2015-04-17 07:23:55,674 WARN  [main] fs.FileUtil 
> (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
> [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]:
>  it still exists.
> 2015-04-17 07:23:55,703 INFO  [Thread-5] impl.TestDAGImpl 
> (TestDAGImpl.java:createTestDAGPlan(446)) - Setting up dag plan
> 2015-04-17 07:23:55,722 INFO  [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:serviceInit(109)) - Initializing RecoveryService
> 2015-04-17 07:23:55,723 INFO  [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:serviceStart(127)) - Starting RecoveryService
> 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:handle(314)) - Error handling summary event, 
> eventType=DAG_SUBMITTED
> java.io.IOException: Not supported
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285)
>   at 
> org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:601)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:handle(318)) - Adding a flag to ensure next AM attempt 
> does not start up, 
> flagFile=target/org.apache.tez.dag.app.TestRecoveryParser-tmpDir/recovery/1/RecoveryFatalErrorOccurred
> 2015-04-17 07:23:55,725 ERROR [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:handle(323)) - Recovery failure occurred. Skipping all 
> events
> 2015-04-17 07:23:55,756 ERROR [RecoveryEventHandlingThread] 
> recovery.RecoveryService (RecoveryService.java:run(146)) - Recovery failure 
> occurred. Stopping recovery thread. Current eventQueueSize=0
> {code}





[jira] [Resolved] (TEZ-2295) Allow to set vertex level info

2015-04-21 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang resolved TEZ-2295.
-
Resolution: Invalid

One could add history text to the vertex's processor to carry vertex-level info.

> Allow to set vertex level info
> --
>
> Key: TEZ-2295
> URL: https://issues.apache.org/jira/browse/TEZ-2295
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>
> Also need to add doc here http://tez.apache.org/tez_ui_user_data.html





[jira] [Commented] (TEZ-1776) TA_CONTAINER_TERMINATING event should not always fail the task attempt

2015-04-21 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504610#comment-14504610
 ] 

Siddharth Seth commented on TEZ-1776:
-

There's a separate transition for NODE_FAILURES, which should reach the 
TaskAttempt before CONTAINER_TERMINATING messages generated by the Container 
state machine. Those put TaskAttempts into a KILLED state.

Are there other scenarios where you're seeing tasks fail when they should be 
killed?
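
The ordering argument above can be illustrated with a deliberately simplified state model. This is a sketch under stated assumptions, not Tez's actual TaskAttempt state machine: the class `AttemptStateSketch`, its enums, and the rule that terminal states ignore later events are all hypothetical.

```java
// Simplified, hypothetical model; this is not Tez's TaskAttemptImpl.
final class AttemptStateSketch {

    enum State { RUNNING, KILLED, FAILED }

    enum Event { NODE_FAILED, CONTAINER_TERMINATING }

    // Returns the next state for an attempt given its current state and an event.
    static State next(State current, Event event) {
        if (current != State.RUNNING) {
            // Terminal states ignore later events, so an attempt already
            // KILLED by a node failure is not re-marked as FAILED.
            return current;
        }
        switch (event) {
            case NODE_FAILED:
                return State.KILLED;
            case CONTAINER_TERMINATING:
                return State.FAILED;
            default:
                throw new IllegalArgumentException("Unknown event: " + event);
        }
    }

    public static void main(String[] args) {
        // Node failure arrives first, container termination second.
        State s = next(State.RUNNING, Event.NODE_FAILED);
        s = next(s, Event.CONTAINER_TERMINATING);
        System.out.println(s);
    }
}
```

In this model the outcome depends entirely on which event arrives first, which is why the question of other scenarios (where the termination message reaches the attempt without a preceding node-failure event) matters.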

> TA_CONTAINER_TERMINATING event should not always fail the task attempt
> --
>
> Key: TEZ-1776
> URL: https://issues.apache.org/jira/browse/TEZ-1776
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Priority: Blocker
> Fix For: 0.7.0
>
>
> This is sometimes sent when the node fails or for other non-task-related 
> container failures. In those cases the attempt should transition to killed instead of 
> failed.





[jira] [Updated] (TEZ-2340) TestRecoveryParser fails

2015-04-21 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2340:

Attachment: TEZ-2340-1.patch

> TestRecoveryParser fails
> 
>
> Key: TEZ-2340
> URL: https://issues.apache.org/jira/browse/TEZ-2340
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: TEZ-2340-1.patch
>
>





[jira] [Updated] (TEZ-2348) EOF exception during UnorderedKVReader.next()

2015-04-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated TEZ-2348:

Attachment: _tez_session_dir.tgz

I ran the query through the debugger and found a directory 
/tmp/hive//_tez_session_dir/. Would that have the right files? I copied 
the contents right after hitting the error and am attaching them here.

> EOF exception during UnorderedKVReader.next()
> -
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
>Reporter: Jason Dere
> Attachments: _tez_session_dir.tgz
>
>
> {noformat}
> Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. 
> Completed reading 516605
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
>   ... 13 more
> Caused by: java.io.IOException: Reached EOF. Completed reading 516605
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727)
>   at 
> org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151)
>   at 
> org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232)
>   ... 15 more
> {noformat}





[jira] [Commented] (TEZ-2322) Succeeded count wrong for Pig on Tez job, decreased 380 => 181

2015-04-21 Thread Hari Sekhon (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504694#comment-14504694
 ] 

Hari Sekhon commented on TEZ-2322:
--

Hitesh Shah, the yarn logs command failed originally; otherwise I would have 
supplied that output.

Jeff Zhang, I did note that the job succeeded in the end. This is just a JIRA 
to record that the counts were wrong, hence I've labelled it as minor priority 
to fix.

> Succeeded count wrong for Pig on Tez job, decreased 380 => 181
> --
>
> Key: TEZ-2322
> URL: https://issues.apache.org/jira/browse/TEZ-2322
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Priority: Minor
> Attachments: attempt1_syslog_dag_1427546104095_0146_1, 
> attempt2_syslog, attempt2_syslog_dag_1427546104095_0146_1, 
> attempt2_syslog_dag_1427546104095_0146_1_post
>
>





[jira] [Commented] (TEZ-2344) TEZ-UI: Equip basic-ember-table's cell level loading for all use cases in all DAGs table

2015-04-21 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504729#comment-14504729
 ] 

Sreenath Somarajapuram commented on TEZ-2344:
-

All templates except basic-cell support bound values and are capable of 
displaying dynamic data. It's just that the getCellContent function must be 
equipped to delegate changes.


> TEZ-UI: Equip basic-ember-table's cell level loading for all use cases in all 
> DAGs table
> 
>
> Key: TEZ-2344
> URL: https://issues.apache.org/jira/browse/TEZ-2344
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2344.1.patch
>
>
> 1. Must handle promises, objects and primitive data types.
> 2. Must be generic
> 3. Display waiting animation or Not Available! messages when required.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2340) TestRecoveryParser fails

2015-04-21 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504739#comment-14504739
 ] 

Jeff Zhang commented on TEZ-2340:
-

The root cause of the test failure is that all the test cases use the same 
recovery directory, so the delete operation may fail because the previous test 
case may not have closed its file stream. I have attached a patch that uses a 
different recovery path for each test case.
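
As an illustration of the approach described above, the following sketch gives each test case its own freshly created directory. It is not taken from the attached patch, which may do this differently; the class `PerTestRecoveryDir` and the second test name are hypothetical, and plain `java.nio.file` is used instead of Hadoop's FileSystem API to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the attached patch may do this differently.
final class PerTestRecoveryDir {

    // Creates a fresh directory under baseDir whose name starts with the test name.
    static Path create(Path baseDir, String testName) throws IOException {
        Files.createDirectories(baseDir);
        return Files.createTempDirectory(baseDir, testName + "-");
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("recovery-base");
        Path first = create(base, "testSkipAllOtherEvents_1");
        Path second = create(base, "anotherTest"); // hypothetical test name
        // Each call yields a distinct directory, so no test has to delete
        // files that a previous test may still hold open.
        System.out.println(first);
        System.out.println(second);
    }
}
```

With separate directories, a stream leaked by one test can no longer block the setup of the next one, although the leak itself is still worth fixing.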

> TestRecoveryParser fails
> 
>
> Key: TEZ-2340
> URL: https://issues.apache.org/jira/browse/TEZ-2340
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: TEZ-2340-1.patch
>
>





[jira] [Comment Edited] (TEZ-2340) TestRecoveryParser fails

2015-04-21 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504739#comment-14504739
 ] 

Jeff Zhang edited comment on TEZ-2340 at 4/21/15 10:34 AM:
---

The root cause of the test failure is that all the test cases use the same 
recovery directory, so the delete operation may fail because the previous test 
case may not have closed its file stream. I have attached a patch that uses a 
different recovery path for each test case.

{code}
2015-04-17 07:23:55,672 WARN  [main] fs.FileUtil 
(FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
[D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]:
 it still exists.
2015-04-17 07:23:55,674 WARN  [main] fs.FileUtil 
(FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
[D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]:
 it still exists.
{code}
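
The warnings above are what one would expect if, as suggested, an earlier test left a stream open: on Windows a file with an open handle generally cannot be deleted. A minimal sketch of the complementary fix, closing the stream deterministically before cleanup, is shown below. The class `CloseBeforeDelete` and the file contents are illustrative assumptions, and plain `java.nio.file` stands in for Hadoop's FileSystem API.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example; file names and contents are illustrative only.
final class CloseBeforeDelete {

    // Writes a small file, closes the stream, then removes the file and its directory.
    static boolean writeThenCleanUp(Path dir) throws IOException {
        Path summary = dir.resolve("summary");
        // try-with-resources closes the stream even if write() throws.
        try (OutputStream out = Files.newOutputStream(summary)) {
            out.write("DAG_SUBMITTED".getBytes(StandardCharsets.UTF_8));
        }
        // The handle opened above has been released at this point.
        Files.delete(summary);
        Files.delete(dir);
        return Files.notExists(dir);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("recovery");
        System.out.println(writeThenCleanUp(dir));
    }
}
```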


was (Author: zjffdu):
The root cause of the test failure is that all the test cases use the same 
recovery directory, so the delete operation may fail because the previous test 
case may not have closed its file stream. I have attached a patch that uses a 
different recovery path for each test case.

> TestRecoveryParser fails
> 
>
> Key: TEZ-2340
> URL: https://issues.apache.org/jira/browse/TEZ-2340
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: TEZ-2340-1.patch
>
>
> Stacktrace
> {code}
> java.io.IOException: Not supported
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285)
>   at 
> org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138)
> {code}
> Standard Output
> {code}
> 2015-04-17 07:23:55,672 WARN  [main] fs.FileUtil 
> (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
> [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]:
>  it still exists.
> 2015-04-17 07:23:55,674 WARN  [main] fs.FileUtil 
> (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
> [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]:
>  it still exists.
> 2015-04-17 07:23:55,703 INFO  [Thread-5] impl.TestDAGImpl 
> (TestDAGImpl.java:createTestDAGPlan(446)) - Setting up dag plan
> 2015-04-17 07:23:55,722 INFO  [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:serviceInit(109)) - Initializing RecoveryService
> 2015-04-17 07:23:55,723 INFO  [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:serviceStart(127)) - Starting RecoveryService
> 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:handle(314)) - Error handling summary event, 
> eventType=DAG_SUBMITTED
> java.io.IOException: Not supported
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285)
>   at 
> org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:601)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:handle(318)) - Adding a flag to ensure next AM attempt 
> does not start up, 
> flagFile=target/org.apache.tez.dag.app.TestRecoveryParser-tmpDir/recovery/1/RecoveryFatalErrorOccurred
> 2015-04-17 07:23:55,725 ERROR [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:handle(323)) - Recove

[jira] [Comment Edited] (TEZ-2340) TestRecoveryParser fails

2015-04-21 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504739#comment-14504739
 ] 

Jeff Zhang edited comment on TEZ-2340 at 4/21/15 10:34 AM:
---

The root cause of the test failure is that all the test cases use the same 
directory for recovery, so the delete operation may fail because the previous 
test case may not have closed the file stream. Attached a patch that uses a 
different recovery path for each test case.

{code}
2015-04-17 07:23:55,672 WARN  [main] fs.FileUtil 
(FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
[D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]:
 it still exists.
2015-04-17 07:23:55,674 WARN  [main] fs.FileUtil 
(FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
[D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]:
 it still exists.
{code}
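
The idea can be sketched roughly as below. This is only an illustration under 
assumptions, not the content of TEZ-2340-1.patch: the RecoveryDirs helper and 
the second test name are hypothetical, and only java.nio.file calls are used.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not taken from TEZ-2340-1.patch.
final class RecoveryDirs {
    private RecoveryDirs() {}

    /** Returns a recovery directory unique to the given test case, creating it if needed. */
    static Path forTest(Path baseDir, String testName) throws IOException {
        // One sub-directory per test name, so a stream left open by one test
        // cannot block the cleanup of another test's recovery files.
        Path dir = baseDir.resolve("recovery").resolve(testName);
        return Files.createDirectories(dir);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("TestRecoveryParser-tmpDir");
        // The second test name is made up for the example.
        System.out.println(forTest(base, "testSkipAllOtherEvents_1"));
        System.out.println(forTest(base, "testSkipAllOtherEvents_2"));
    }
}
```

With separate directories, a delete that fails on Windows for one test case 
should no longer affect the files another test case appends to.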


was (Author: zjffdu):
The root cause of the test failure is that all the testcases use the directory 
for recovery so that the delete operation may fails because the last test case 
may not close the file stream. Attach the patch to use different recovery path 
for each test case.

{code}
2015-04-17 07:23:55,672 WARN  [main] fs.FileUtil 
(FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
[D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]:
 it still exists.
2015-04-17 07:23:55,674 WARN  [main] fs.FileUtil 
(FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
[D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]:
 it still exists.
{code}

> TestRecoveryParser fails
> 
>
> Key: TEZ-2340
> URL: https://issues.apache.org/jira/browse/TEZ-2340
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: TEZ-2340-1.patch
>
>

[jira] [Commented] (TEZ-2308) Add set/get of record counts in task/vertex statistics

2015-04-21 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504741#comment-14504741
 ] 

Rajesh Balamohan commented on TEZ-2308:
---

lgtm. +1. 

> Add set/get of record counts in task/vertex statistics
> --
>
> Key: TEZ-2308
> URL: https://issues.apache.org/jira/browse/TEZ-2308
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2308.1.patch
>
>
> In addition to data size, getting record count would be useful. /cc [~rohini]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2344) Tez UI: Equip basic-ember-table's cell level loading for all use cases in all DAGs table

2015-04-21 Thread Prakash Ramachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prakash Ramachandran updated TEZ-2344:
--
Summary: Tez UI: Equip basic-ember-table's cell level loading for all use 
cases in all DAGs table  (was: TEZ-UI: Equip basic-ember-table's cell level 
loading for all use cases in all DAGs table)

> Tez UI: Equip basic-ember-table's cell level loading for all use cases in all 
> DAGs table
> 
>
> Key: TEZ-2344
> URL: https://issues.apache.org/jira/browse/TEZ-2344
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2344.1.patch
>
>
> 1. Must handle promises, objects and primitive data types.
> 2. Must be generic
> 3. Display waiting animation or Not Available! messages when required.





[jira] [Commented] (TEZ-2338) Tez job failed due to AM Container-Launch failure at windows

2015-04-21 Thread Kaveen Raajan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504776#comment-14504776
 ] 

Kaveen Raajan commented on TEZ-2338:


Hi [~hitesh]
Thanks for the update :), we tried by adding this in our yarn-site.xml.
{code:xml}
<property>
  <name>yarn.nodemanager.delete.debug-delay-sec</name>
  <value>1200</value>
</property>
{code}
We noticed one thing while running the *launch-container.cmd* located in the 
hadoop _\tmp\..\appcache_ location: a *.dll* could not be loaded when running 
MapReduce on the Windows platform, i.e. an MSVCR100.dll message box was shown 
while handling the Tez job.
*Error Message:*
{quote}"The program can't start because MSVCR100.dll is missing from your 
computer. Try reinstalling the program to fix this issue"{quote}
But we had installed framework-4.5 on that NM node, and MSVCR100.dll is also 
present at C:\Windows\System32\. Even so, we faced the same issue.
*Fix we tried:*
We then downloaded the dll-file fixer 
[download|http://download.dll-files.com/fixer/filest/dff_fdp2-msvcr100.exe] and 
reinstalled the MSVCR100.dll file on the NM machine.
After that, the MapReduce program we tried was submitted as a Tez job and 
completed successfully, and no issue occurred.

Is this a proper fix for the above exception, and what is the reason for this 
exception?
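
For what it is worth, the exit code -1073741515 in the container-launch 
failure quoted below is the signed form of 0xC0000135, which Windows defines as 
STATUS_DLL_NOT_FOUND. That would be consistent with the missing-DLL dialog, 
though it does not by itself say which DLL failed to load. A minimal Java 
sketch of the conversion (the class and method names are made up for 
illustration):

```java
// Sketch: reinterpret the signed exit code logged by YARN as an unsigned
// 32-bit NTSTATUS-style hex value. Names here are hypothetical.
final class ExitCodeHex {
    private ExitCodeHex() {}

    static String toNtStatusHex(int exitCode) {
        // %X formats an int as its unsigned two's-complement hex representation.
        return String.format("0x%08X", exitCode);
    }

    public static void main(String[] args) {
        System.out.println(toNtStatusHex(-1073741515)); // prints 0xC0000135
    }
}
```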

> Tez job failed due to AM Container-Launch failure at windows
> 
>
> Key: TEZ-2338
> URL: https://issues.apache.org/jira/browse/TEZ-2338
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Windows server 2012 and Windows-8
> Hadoop-2.5.2
> Java-1.7
>Reporter: Kaveen Raajan
>
> I successfully built Tez-0.6.0 against Hadoop-2.5.2.
> Then I configured Tez-0.6.0 as described at http://tez.apache.org/install.html
> I moved the Tez lib package to an HDFS location and updated my tez-site.xml:
> {code:xml}
> <property>
>   <name>tez.lib.uris</name>
>   <value>${fs.default.name}/apps/Tez/,${fs.default.name}/apps/Tez/lib/</value>
> </property>
> {code}
> After that I tried the sample test for Tez:
> _hadoop jar tez-examples-0.6.0.jar orderedwordcount  _
> But I get the following error while running this command.
> *Note:* I'm using a Hadoop High Availability setup.
> {code}
> Running OrderedWordCount
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/C:/Hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/C:/Tez/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.6.0, revision=${buildNumber}, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ]
> 15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: application_1429073725727_0005
> 15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> 15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: hdfs://HACluster/apps/Tez/,hdfs://HACluster/apps/Tez/lib/
> 15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/staging doesn't exist and is created
> 15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory hdfs://HACluster/tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't exist and is created
> 15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1429073725727_0005, dagName=OrderedWordCount
> 15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application application_1429073725727_0005
> 15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: http://MASTER_NN1:8088/proxy/application_1429073725727_0005/
> 15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running
> 15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED
> OrderedWordCount failed with diagnostics: [Application application_1429073725727_0005 failed 2 times due to AM Container for appattempt_1429073725727_0005_02 exited with exitCode: -1073741515 due to: Exception from container-launch: ExitCodeException exitCode=-1073741515:
> ExitCodeException exitCode=-1073741515:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
> at org.apache.hadoop.util.Shell.run(Shell.java:455)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>

[jira] [Commented] (TEZ-2340) TestRecoveryParser fails

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504786#comment-14504786
 ] 

TezQA commented on TEZ-2340:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12726829/TEZ-2340-1.patch
  against master revision decb419.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/501//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/501//console

This message is automatically generated.

> TestRecoveryParser fails
> 
>
> Key: TEZ-2340
> URL: https://issues.apache.org/jira/browse/TEZ-2340
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: TEZ-2340-1.patch
>
>

Success: TEZ-2340 PreCommit Build #501

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2340
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/501/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2765 lines...]
[INFO] Final Memory: 70M/948M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12726829/TEZ-2340-1.patch
  against master revision decb419.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/501//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/501//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
4d7861aaf92c41cac0d6379b502fa4770e4c5275 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #499
Archived 44 artifacts
Archive block size is 32768
Received 26 blocks and 1907931 bytes
Compression is 30.9%
Took 1.3 sec
Description set: TEZ-2340
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2346) TEZ-UI: Load other info / counter data on demand

2015-04-21 Thread Prakash Ramachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504809#comment-14504809
 ] 

Prakash Ramachandran commented on TEZ-2346:
---

[~Sreenath]
bq. In the above scenario am trying to filter based on status and applicationId. 
As both of them are available in primaryfilters, am not sure why we are 
dependent on otherinfo.

ATS allows only one primaryfilter, so in the Tez UI, if more than one filter is 
specified as primary, the additional filters are moved to secondaryfilter 
(status, dagname etc. are present in both, and timeline checks both otherinfo 
and primaryfilter for the same; see getFilterProperties in paginated_content.js 
for how the UI sets the filters). ATS filters first by primary and then by 
secondary.

Also, the status is updated in primaryFilters only after the DAG finishes.

Regarding the exception: I believe the following code causes the issue (since 
no otherinfo is specified in the fields, entity.getOtherInfo() will be null and 
the get will cause an NPE).

{code:title=LeveldbTimelineStore.java}
if (fields.contains(Field.OTHER_INFO)) {
  otherInfo = true;
} else {
  entity.setOtherInfo(null);
}

...
...
public void setOtherInfo(Map otherInfo) {
  if (otherInfo != null && !(otherInfo instanceof HashMap)) {
    this.otherInfo = new HashMap(otherInfo);
  } else {
    this.otherInfo = (HashMap) otherInfo;
  }
}
{code}

{code:title=LeveldbTimelineStore.java}
if (secondaryFilters != null) {
  for (NameValuePair filter : secondaryFilters) {
    Object v = entity.getOtherInfo().get(filter.getName());
{code}
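
To make the suspected failure mode concrete, here is a small self-contained 
sketch. It is not the actual LeveldbTimelineStore code: the Entity class is a 
simplified stand-in and lookup() is a hypothetical null-safe variant, shown 
only to illustrate why a null otherInfo makes the secondary filter lookup 
throw.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for illustration; these are not the real YARN classes.
final class SecondaryFilterSketch {
    static final class Entity {
        private HashMap<String, Object> otherInfo;

        void setOtherInfo(Map<String, Object> info) {
            // Like the quoted setter, a null argument leaves the field null.
            this.otherInfo = (info == null) ? null : new HashMap<>(info);
        }

        Map<String, Object> getOtherInfo() {
            return otherInfo;
        }
    }

    /** Hypothetical null-safe lookup: returns null when otherInfo was not loaded. */
    static Object lookup(Entity entity, String filterName) {
        Map<String, Object> otherInfo = entity.getOtherInfo();
        return (otherInfo == null) ? null : otherInfo.get(filterName);
    }

    public static void main(String[] args) {
        Entity entity = new Entity();
        entity.setOtherInfo(null); // mirrors the else-branch quoted above
        try {
            entity.getOtherInfo().get("status"); // unguarded access
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NPE");
        }
        System.out.println(lookup(entity, "status")); // prints null
    }
}
```

Whether a guard like this or always loading otherInfo when secondary filters 
are present is the right fix is a question for the timeline store itself.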


> TEZ-UI: Load other info / counter data on demand
> 
>
> Key: TEZ-2346
> URL: https://issues.apache.org/jira/browse/TEZ-2346
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: Screen-Shot-2015-04-21-at-1.56.28-AM.jpg, 
> TEZ-2346.wip.1.patch
>
>






[jira] [Commented] (TEZ-2345) TEZ-UI: Enable cell level loading in all DAGs table

2015-04-21 Thread Prakash Ramachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504837#comment-14504837
 ] 

Prakash Ramachandran commented on TEZ-2345:
---

[~sreenathmenon]
A few general comments:
* When no data is available, the table should not be hidden; instead it 
should be shown with no rows (e.g. search for running DAGs when none are 
running).


I also ran into a couple of errors with the patch; they mostly look like a race 
condition.
* The error below was caused by a store.find('appDetail', appId) call with appId 
being null. This looks like a serialization issue.
bq. Uncaught Error: Assertion Failed: You may not pass `undefined` as id to the 
store's find method
ember.js:3722 Ember.assert
ember-data.js:10457 Ember.Object.extend.find
combined-scripts.js:5613 getCellContent





> TEZ-UI: Enable cell level loading in all DAGs table
> ---
>
> Key: TEZ-2345
> URL: https://issues.apache.org/jira/browse/TEZ-2345
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2345.1.patch
>
>
> - Enable cell level loading in all DAGs table using basic-ember-table 
> component.
> - Re-arrange UI element into make it similar to other tables.





[jira] [Comment Edited] (TEZ-2345) TEZ-UI: Enable cell level loading in all DAGs table

2015-04-21 Thread Prakash Ramachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504837#comment-14504837
 ] 

Prakash Ramachandran edited comment on TEZ-2345 at 4/21/15 11:59 AM:
-

[~Sreenath]
A few general comments:
* When no data is available, the table should not be hidden; instead it 
should be shown with no rows (e.g. search for running DAGs when none are 
running).


I also ran into a couple of errors with the patch; they mostly look like a race 
condition.
* The error below was caused by a store.find('appDetail', appId) call with appId 
being null. This looks like a serialization issue.
bq. Uncaught Error: Assertion Failed: You may not pass `undefined` as id to the 
store's find method
ember.js:3722 Ember.assert
ember-data.js:10457 Ember.Object.extend.find
combined-scripts.js:5613 getCellContent






was (Author: pramachandran):
[~sreenathmenon]
few general ones
* when no data is available , the table should not be hidden, and instead 
should be shown with no rows (ex. search for running dags when none are 
running).


also ran into a couple of errors with the patch mostly looks like a race 
condition.
* the below was caused by a store.find('appDetail', appId), and appId being 
null. this looks like a serialization issue
bq. Uncaught Error: Assertion Failed: You may not pass `undefined` as id to the 
store's find method
ember.js:3722 Ember.assert
ember-data.js:10457 Ember.Object.extend.find
combined-scripts.js:5613 getCellContent





> TEZ-UI: Enable cell level loading in all DAGs table
> ---
>
> Key: TEZ-2345
> URL: https://issues.apache.org/jira/browse/TEZ-2345
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2345.1.patch
>
>
> - Enable cell level loading in all DAGs table using basic-ember-table 
> component.
> - Re-arrange UI element into make it similar to other tables.





[jira] [Assigned] (TEZ-2325) Route status update event directly to the attempt

2015-04-21 Thread Prakash Ramachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prakash Ramachandran reassigned TEZ-2325:
-

Assignee: Prakash Ramachandran

> Route status update event directly to the attempt 
> --
>
> Key: TEZ-2325
> URL: https://issues.apache.org/jira/browse/TEZ-2325
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Prakash Ramachandran
>
> Today, all events from the attempt heartbeat are routed to the vertex. Then 
> the vertex routes status update events (if any) to the attempt. This is 
> unnecessary and potentially creates out-of-order scenarios. We could route 
> the status update events directly to attempts.





[jira] [Commented] (TEZ-2322) Succeeded count wrong for Pig on Tez job, decreased 380 => 181

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505200#comment-14505200
 ] 

Hitesh Shah commented on TEZ-2322:
--

[~harisekhon] Based on the fact that the AM did crash and recover, the 
succeeded task count can go down between the 2 attempts. The reason for this is 
that we do not checkpoint/sync the state for each task completion but only at 
certain points (for performance reasons, as AM crashes are rare). For the most 
part, most tasks are recovered, but in certain situations some tasks end up 
getting re-run if the recovery/state log had a lag. 

I think the succeeded count going down is fine, but for the tasks that could 
not be recovered in the second attempt, the failed attempt count should have 
been increased accordingly. 
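
As a toy illustration of that lag (not Tez code; the class, the method names, 
and the fixed sync interval of 100 completions are all assumptions made up for 
the example, since the real sync points are not count-based in any way stated 
here):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; names and the count-based sync policy are hypothetical.
final class RecoveryLagSketch {
    private final List<String> durable = new ArrayList<>();  // survives an AM crash
    private final List<String> buffered = new ArrayList<>(); // lost on an AM crash
    private final int syncEvery;

    RecoveryLagSketch(int syncEvery) {
        this.syncEvery = syncEvery;
    }

    void taskSucceeded(String taskId) {
        buffered.add(taskId);
        if (buffered.size() >= syncEvery) {
            // Simulated sync point: buffered completions become durable.
            durable.addAll(buffered);
            buffered.clear();
        }
    }

    int succeededBeforeCrash() {
        return durable.size() + buffered.size();
    }

    int succeededAfterRecovery() {
        // Only completions that reached the durable log are recovered.
        return durable.size();
    }

    public static void main(String[] args) {
        RecoveryLagSketch log = new RecoveryLagSketch(100);
        for (int i = 0; i < 380; i++) {
            log.taskSucceeded("task_" + i);
        }
        System.out.println(log.succeededBeforeCrash());   // prints 380
        System.out.println(log.succeededAfterRecovery()); // prints 300
    }
}
```

The tasks whose completion was still buffered at crash time are the ones that 
would get re-run in the next attempt.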

> Succeeded count wrong for Pig on Tez job, decreased 380 => 181
> --
>
> Key: TEZ-2322
> URL: https://issues.apache.org/jira/browse/TEZ-2322
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Priority: Minor
> Attachments: attempt1_syslog_dag_1427546104095_0146_1, 
> attempt2_syslog, attempt2_syslog_dag_1427546104095_0146_1, 
> attempt2_syslog_dag_1427546104095_0146_1_post
>
>
> During a Pig on Tez job the number of succeeded tasks dropped from 380 => 181 
> as shown below:
> {code}
> 2015-04-15 15:09:56,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 
> Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics=
> 2015-04-15 15:10:16,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 
> Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics=
> 2015-04-15 15:10:36,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 
> Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics=
> 2015-04-15 15:10:56,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 
> 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics=
> 2015-04-15 15:11:16,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 
> 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics=
> 2015-04-15 15:11:36,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 182 Running: 723 Failed: 
> 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics=
> 2015-04-15 15:11:56,993 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 184 Running: 721 Failed: 
> 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics=
> 2015-04-15 15:12:16,992 [Timer-0] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: 
> status=RUNNING, progress=TotalTasks: 905 Succeeded: 186 Running: 719 Failed: 
> 0 
> {code}
> Now this may be because the tasks failed, some certainly did due to space 
> exceptions having checked the logs, but surely once a task has finished 
> successfully and is marked as succeeded it cannot then later be removed from 
> the succeeded count? Perhaps the succeeded counter is incremented too early 
> before the task results are really saved?
> KilledTaskAttempts jumped from 16 => 89 at the same time, but even this 
> doesn't account for the large drop in number of succeeded tasks.
> There was also a noticeable jump in Running tasks from 58 => 724 at the same 
> time, which is suspicious. I'm pretty sure there was no contending job to 
> finish and release so much more resource to this Tez job, so it's also 
> unclear how the running count could have jumped up so significantly, given 
> the cluster hardware resources have been the same throughout.
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon





[jira] [Commented] (TEZ-2346) TEZ-UI: Load other info / counter data on demand

2015-04-21 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505243#comment-14505243
 ] 

Zhijie Shen commented on TEZ-2346:
--

bq. I believe the following code causes the issue

Right, I think so too.

> TEZ-UI: Load other info / counter data on demand
> 
>
> Key: TEZ-2346
> URL: https://issues.apache.org/jira/browse/TEZ-2346
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: Screen-Shot-2015-04-21-at-1.56.28-AM.jpg, 
> TEZ-2346.wip.1.patch
>
>






[jira] [Commented] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505400#comment-14505400
 ] 

Hitesh Shah commented on TEZ-2341:
--

Should we close this out as wont-fix if this is an env issue? 

> TestMockDAGAppMaster.testBasicCounters fails on windows
> ---
>
> Key: TEZ-2341
> URL: https://issues.apache.org/jira/browse/TEZ-2341
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Minor
> Attachments: TEZ-2341-1.patch
>
>
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323)
> {code}





[jira] [Commented] (TEZ-2292) Add e2e test for error reporting when vertex manager invokes plugin APIs

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505414#comment-14505414
 ] 

Hitesh Shah commented on TEZ-2292:
--

If a user plugin is invoked, we should be catching Exception in any case and 
not just TezException unless the framework layer is re-wrapping the user code 
exception. 
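
A minimal sketch of the point above, assuming a hypothetical `UserPlugin` interface and a hypothetical `FrameworkException` standing in for a checked framework exception such as TezException (neither is the real Tez API). It shows why catching only the framework's checked exception is not enough: user code can also throw unchecked exceptions, and catching `Exception` at the call site covers both.

```java
public class PluginInvocationSketch {
    /** Hypothetical stand-in for a user-supplied plugin callback. */
    interface UserPlugin {
        void onVertexStarted() throws Exception;
    }

    /** Hypothetical stand-in for a checked framework exception. */
    static class FrameworkException extends Exception {
        FrameworkException(String message) {
            super(message);
        }
    }

    /**
     * Invokes user code and converts any failure into a diagnostic string.
     * Returns null when the plugin call completes normally.
     */
    static String invokeSafely(UserPlugin plugin) {
        try {
            plugin.onVertexStarted();
            return null;
        } catch (Exception e) {
            // Covers the checked framework exception and unchecked exceptions alike.
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeSafely(() -> { throw new IllegalStateException("bad plugin state"); }));
        System.out.println(invokeSafely(() -> { throw new FrameworkException("reconfigure failed"); }));
        System.out.println(invokeSafely(() -> { }));
    }
}
```

In a real framework the diagnostic would presumably be used to fail the vertex rather than being returned as a string; the string return is only there to keep the sketch self-contained.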

> Add e2e test for error reporting when vertex manager invokes plugin APIs
> 
>
> Key: TEZ-2292
> URL: https://issues.apache.org/jira/browse/TEZ-2292
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: TEZ-2292.1.patch
>
>
> If the Vertex Manager has an error or cannot apply a required reconfiguration 
> then it should be allowed to fail the vertex.





[jira] [Commented] (TEZ-2292) Add e2e test for error reporting when vertex manager invokes plugin APIs

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505423#comment-14505423
 ] 

Hitesh Shah commented on TEZ-2292:
--

bq. Further, if users catch and swallow these exceptions then it may lead to 
the state machines being left behind in an invalid state and weird errors down 
the line. This is probably why originally the exceptions were all unchecked.

In some cases, this may be true but it all depends on whether the user plugin 
can do anything to recover from its error. In any case, a plugin could be 
catching all RuntimeExceptions too which leads us back to the same problem. The 
issue holds whether there is a checked or an unchecked exception. I am not sure 
we can do much if there is a "bad" vertex manager plugin. 



  

> Add e2e test for error reporting when vertex manager invokes plugin APIs
> 
>
> Key: TEZ-2292
> URL: https://issues.apache.org/jira/browse/TEZ-2292
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: TEZ-2292.1.patch
>
>
> If the Vertex Manager has an error or cannot apply a required reconfiguration 
> then it should be allowed to fail the vertex.





[jira] [Commented] (TEZ-2292) Add e2e test for error reporting when vertex manager invokes plugin APIs

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505426#comment-14505426
 ] 

Hitesh Shah commented on TEZ-2292:
--

For this patch, minor comment: 

{code}
} catch (IOException e) {
  e.printStackTrace();
}
{code}
   - maybe re-throw this back as a runtime exception instead of ignoring it? 

+1 once the above is fixed. 
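
A minimal sketch of the suggested change, assuming a hypothetical `IoAction` helper interface and `runOrFail` method (the surrounding code in the patch is not shown in this message). Instead of printing the stack trace and continuing, the `IOException` is wrapped in a `RuntimeException` so the failure propagates with the original exception kept as its cause.

```java
import java.io.IOException;

public class RethrowSketch {
    /** Hypothetical operation that may fail with a checked IOException. */
    interface IoAction {
        void run() throws IOException;
    }

    /** Runs the action and rethrows any IOException as an unchecked exception. */
    static void runOrFail(IoAction action) {
        try {
            action.run();
        } catch (IOException e) {
            // Keep the original exception as the cause so the stack trace is not lost.
            throw new RuntimeException("I/O failure", e);
        }
    }

    public static void main(String[] args) {
        try {
            runOrFail(() -> { throw new IOException("disk full"); });
        } catch (RuntimeException e) {
            System.out.println("Failed with cause: " + e.getCause());
        }
    }
}
```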




> Add e2e test for error reporting when vertex manager invokes plugin APIs
> 
>
> Key: TEZ-2292
> URL: https://issues.apache.org/jira/browse/TEZ-2292
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: TEZ-2292.1.patch
>
>
> If the Vertex Manager has an error or cannot apply a required reconfiguration 
> then it should be allowed to fail the vertex.





[jira] [Created] (TEZ-2351) Remove GroupByOrderbyMRRTest example from tez-tests

2015-04-21 Thread Hitesh Shah (JIRA)
Hitesh Shah created TEZ-2351:


 Summary: Remove GroupByOrderbyMRRTest example from tez-tests
 Key: TEZ-2351
 URL: https://issues.apache.org/jira/browse/TEZ-2351
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Priority: Minor


Not really used in any tests and it is just maintenance overhead at this point. 





[jira] [Comment Edited] (TEZ-2292) Add e2e test for error reporting when vertex manager invokes plugin APIs

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505414#comment-14505414
 ] 

Hitesh Shah edited comment on TEZ-2292 at 4/21/15 6:14 PM:
---

If a user plugin is invoked, we should be catching Exception in any case and 
not just TezException unless the framework layer is re-wrapping the user code 
exception. 

In this scenario, it is the framework code that is throwing an exception back 
to the user code. In that case, we should be declaring a throws clause 
on the reconfigure API (though adding a new exception to an existing API is 
incompatible except for certain cases). 


was (Author: hitesh):
If a user plugin is invoked, we should be catching Exception in any case and 
not just TezException unless the framework layer is re-wrapping the user code 
exception. 

> Add e2e test for error reporting when vertex manager invokes plugin APIs
> 
>
> Key: TEZ-2292
> URL: https://issues.apache.org/jira/browse/TEZ-2292
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: TEZ-2292.1.patch
>
>
> If the Vertex Manager has an error or cannot apply a required reconfiguration 
> then it should be allowed to fail the vertex.





[jira] [Assigned] (TEZ-2351) Remove GroupByOrderbyMRRTest example from tez-tests

2015-04-21 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reassigned TEZ-2351:


Assignee: Hitesh Shah

> Remove GroupByOrderbyMRRTest example from tez-tests
> ---
>
> Key: TEZ-2351
> URL: https://issues.apache.org/jira/browse/TEZ-2351
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
>Priority: Minor
>
> Not really used in any tests and it is just maintenance overhead at this 
> point. 





[jira] [Updated] (TEZ-2351) Remove GroupByOrderbyMRRTest example from tez-tests

2015-04-21 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2351:
-
Target Version/s: 0.7.0  (was: 0.8.0)

> Remove GroupByOrderbyMRRTest example from tez-tests
> ---
>
> Key: TEZ-2351
> URL: https://issues.apache.org/jira/browse/TEZ-2351
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
>Priority: Minor
>
> Not really used in any tests and it is just maintenance overhead at this 
> point. 





[jira] [Updated] (TEZ-2345) TEZ-UI: Enable cell level loading in all DAGs table

2015-04-21 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2345:

Attachment: TEZ-2345.2.patch

Thanks [~pramachandran].
Please find a fresh patch with the comments addressed.
1. The table is now also displayed when no data is available.
2. Added checks to handle the race condition.

> TEZ-UI: Enable cell level loading in all DAGs table
> ---
>
> Key: TEZ-2345
> URL: https://issues.apache.org/jira/browse/TEZ-2345
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2345.1.patch, TEZ-2345.2.patch
>
>
> - Enable cell level loading in all DAGs table using basic-ember-table 
> component.
> - Re-arrange UI elements to make them similar to other tables.





Failed: TEZ-2345 PreCommit Build #502

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2345
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/502/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2776 lines...]



{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12726958/TEZ-2345.2.patch
  against master revision 87aac12.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/502//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/502//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
1a9380019c582d9713d1ca0493b71dac418297e3 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #501
Archived 44 artifacts
Archive block size is 32768
Received 8 blocks and 2501843 bytes
Compression is 9.5%
Took 1.4 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2345) TEZ-UI: Enable cell level loading in all DAGs table

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505601#comment-14505601
 ] 

TezQA commented on TEZ-2345:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12726958/TEZ-2345.2.patch
  against master revision 87aac12.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/502//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/502//console

This message is automatically generated.

> TEZ-UI: Enable cell level loading in all DAGs table
> ---
>
> Key: TEZ-2345
> URL: https://issues.apache.org/jira/browse/TEZ-2345
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2345.1.patch, TEZ-2345.2.patch
>
>
> - Enable cell level loading in all DAGs table using basic-ember-table 
> component.
> - Re-arrange UI elements to make them similar to other tables.





[jira] [Commented] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows

2015-04-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505655#comment-14505655
 ] 

Bikas Saha commented on TEZ-2341:
-

On reflection, it's probably OK to ignore this in the test, since we are running 
it in a simulation as part of the unit test, which will not have winutils 
deployed, so we cannot use winutils. Alternatively, we could configure 
TezMxBeanResourceCalculator as the resource calculator plugin class, but 
the project dependencies would probably need to be tweaked for that.

> TestMockDAGAppMaster.testBasicCounters fails on windows
> ---
>
> Key: TEZ-2341
> URL: https://issues.apache.org/jira/browse/TEZ-2341
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Minor
> Attachments: TEZ-2341-1.patch
>
>
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323)
> {code}





[jira] [Commented] (TEZ-2338) Tez job failed due to AM Container-Launch failure at windows

2015-04-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505662#comment-14505662
 ] 

Bikas Saha commented on TEZ-2338:
-

Unfortunately, this is not the correct forum for this question as we are not 
the experts on Windows Hadoop installation issues. You can send a summary of 
your problem and your workaround/fix to u...@hadoop.apache.org and some windows 
experts there may be able to answer. Since this is not related to Tez, could 
you please close this jira? Thanks!

> Tez job failed due to AM Container-Launch failure at windows
> 
>
> Key: TEZ-2338
> URL: https://issues.apache.org/jira/browse/TEZ-2338
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Windows server 2012 and Windows-8
> Hadoop-2.5.2
> Java-1.7
>Reporter: Kaveen Raajan
>
> I successfully built Tez-0.6.0 against Hadoop-2.5.2.
> Then I configured Tez-0.6.0 as described in http://tez.apache.org/install.html
> I moved the Tez lib package to an HDFS location and updated my tez-site.xml:
> {code:xml}
> <property>
>   <name>tez.lib.uris</name>
>   <value>${fs.default.name}/apps/Tez/,${fs.default.name}/apps/Tez/lib/</value>
> </property>
> {code}
> After that I tried the sample test for Tez:
> _hadoop jar tez-examples-0.6.0.jar orderedwordcount  _
> But I got the following error while running this command.
> *Note:* I'm using a Hadoop High Availability setup.
> {code}
> Running OrderedWordCount
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/C:/Hadoop/
> share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBind
> er.class]
> SLF4J: Found binding in [jar:file:/C:/Tez/lib
> /slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ 
> component=tez-api
> , version=0.6.0, revision=${buildNumber}, 
> SCM-URL=scm:git:https://git-wip-us.apa
> che.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ]
> 15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: 
> app
> lication_1429073725727_0005
> 15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is 
> deprecated.
>  Instead, use fs.defaultFS
> 15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from 
> conf
> iguration: hdfs://HACluster/apps/Tez/,hdfs://HACluster/apps/Tez/lib/
> 15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/sta
> ging doesn't exist and is created
> 15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory 
> hdfs://HACluster
> /tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't ex
> ist and is created
> 15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, 
> applicationId=a
> pplication_1429073725727_0005, dagName=OrderedWordCount
> 15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application 
> application_14
> 29073725727_0005
> 15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: 
> http://MASTER_NN1:8088/proxy/application_1429073725727_0005/
> 15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running
> 15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED
> OrderedWordCount failed with diagnostics: [Application 
> application_1429073725727
> _0005 failed 2 times due to AM Container for 
> appattempt_1429073725727_0005_0
> 2 exited with  exitCode: -1073741515 due to: Exception from container-launch: 
> Ex
> itCodeException exitCode=-1073741515:
> ExitCodeException exitCode=-1073741515:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
> at org.apache.hadoop.util.Shell.run(Shell.java:455)
> at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:
> 702)
> at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.la
> unchContainer(DefaultContainerExecutor.java:195)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.C
> ontainerLaunch.call(ContainerLaunch.java:300)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.C
> ontainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.
> java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor
> .java:615)
> at java.lang.Thread.run(Thread.java:744)
> 1 file(s) moved.
> Container exited with a non-zero exit code -1073741515
> .Failing this attempt.. Failing the application.]
> {code}
> Looking at the ResourceManager log:
> 

[jira] [Commented] (TEZ-1776) TA_CONTAINER_TERMINATING event should not always fail the task attempt

2015-04-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505680#comment-14505680
 ] 

Bikas Saha commented on TEZ-1776:
-

IMO a test case will show what the sequence of events really is. If there is an 
issue, we need a fix; if there isn't one, then we need to fix the current 
transition logic for these events.

> TA_CONTAINER_TERMINATING event should not always fail the task attempt
> --
>
> Key: TEZ-1776
> URL: https://issues.apache.org/jira/browse/TEZ-1776
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Priority: Blocker
> Fix For: 0.7.0
>
>
> This is sometimes sent when the node fails or for other non-task-related 
> container failures. In those cases the attempt should transition to killed 
> instead of failed.





[jira] [Reopened] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows

2015-04-21 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reopened TEZ-2341:
--

> TestMockDAGAppMaster.testBasicCounters fails on windows
> ---
>
> Key: TEZ-2341
> URL: https://issues.apache.org/jira/browse/TEZ-2341
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Minor
> Attachments: TEZ-2341-1.patch
>
>
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323)
> {code}





[jira] [Comment Edited] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505708#comment-14505708
 ] 

Hitesh Shah edited comment on TEZ-2341 at 4/21/15 8:40 PM:
---

The patch probably needs to change to check "windows" and not just "win". The 
package dependency change is non-trivial, so it is better to change the test to 
ignore the check.


was (Author: hitesh):
The patch probably needs to change to check "windows" and not just "win". 

> TestMockDAGAppMaster.testBasicCounters fails on windows
> ---
>
> Key: TEZ-2341
> URL: https://issues.apache.org/jira/browse/TEZ-2341
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Minor
> Attachments: TEZ-2341-1.patch
>
>
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323)
> {code}





[jira] [Commented] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505708#comment-14505708
 ] 

Hitesh Shah commented on TEZ-2341:
--

The patch probably needs to change to check "windows" and not just "win". 

> TestMockDAGAppMaster.testBasicCounters fails on windows
> ---
>
> Key: TEZ-2341
> URL: https://issues.apache.org/jira/browse/TEZ-2341
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Minor
> Attachments: TEZ-2341-1.patch
>
>
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323)
> {code}





[jira] [Commented] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows

2015-04-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505726#comment-14505726
 ] 

Bikas Saha commented on TEZ-2341:
-

Perhaps just allow for "Linux" so we don't have to do this for every new OS.
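
A minimal sketch of the OS checks being discussed, assuming the test decides what to verify from an OS name string such as the `os.name` system property (the actual patch is not shown in this thread, and all method names here are hypothetical). A substring match on "win" also matches unrelated names such as "Darwin", whereas matching "windows", or allow-listing "linux" as suggested above, avoids that.

```java
import java.util.Locale;

public class OsCheckSketch {
    /** Loose check: any name containing "win" counts as Windows. */
    static boolean looseWindowsCheck(String osName) {
        return osName.toLowerCase(Locale.ROOT).contains("win");
    }

    /** Stricter check: the name must start with "windows". */
    static boolean strictWindowsCheck(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Allow-list alternative: only run the OS-specific assertion on Linux. */
    static boolean isLinux(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("linux");
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        System.out.println(osName + " -> strict windows check: " + strictWindowsCheck(osName)
            + ", linux: " + isLinux(osName));
    }
}
```

The allow-list variant trades coverage for simplicity: the assertion is skipped on every platform other than Linux, so no new exclusion is needed when another OS is added.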

> TestMockDAGAppMaster.testBasicCounters fails on windows
> ---
>
> Key: TEZ-2341
> URL: https://issues.apache.org/jira/browse/TEZ-2341
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Minor
> Attachments: TEZ-2341-1.patch
>
>
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323)
> {code}





[jira] [Updated] (TEZ-2292) Add e2e test for error reporting when vertex manager invokes plugin APIs

2015-04-21 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2292:

Attachment: TEZ-2292.2.patch

Attaching patch that fixes the review comments and also removes TezException 
from the signature. For now that seems like the prudent thing to do given that 
reconfiguration failure is almost never an optional event. Later when we 
have/support cases where reconfiguration failure is ok then we can create a 
maybeReconfigureVertex() that allows for it - per offline discussion with 
[~hitesh].

> Add e2e test for error reporting when vertex manager invokes plugin APIs
> 
>
> Key: TEZ-2292
> URL: https://issues.apache.org/jira/browse/TEZ-2292
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: TEZ-2292.1.patch, TEZ-2292.2.patch
>
>
> If the Vertex Manager has an error or cannot apply a required reconfiguration 
> then it should be allowed to fail the vertex.





[jira] [Updated] (TEZ-2330) Create reconfigureVertex() API for input based initialization

2015-04-21 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2330:

Attachment: TEZ-2330.2.patch

Per discussion in TEZ-2292 removing TezException from the signature. Thanks for 
the review. Will wait for another clean Jenkins run before committing.

> Create reconfigureVertex() API for input based initialization 
> --
>
> Key: TEZ-2330
> URL: https://issues.apache.org/jira/browse/TEZ-2330
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch
>
>
> TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change 
> parallelism of a vertex. Adding a variant to do the same for input 
> initialization based parallelism change would allow us to deprecate the older 
> overloaded setParallelism() API.





[jira] [Commented] (TEZ-2330) Create reconfigureVertex() API for input based initialization

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505835#comment-14505835
 ] 

TezQA commented on TEZ-2330:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12726998/TEZ-2330.2.patch
  against master revision f46997a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/504//console

This message is automatically generated.

> Create reconfigureVertex() API for input based initialization 
> --
>
> Key: TEZ-2330
> URL: https://issues.apache.org/jira/browse/TEZ-2330
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch
>
>
> TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change 
> parallelism of a vertex. Adding a variant to do the same for input 
> initialization based parallelism change would allow us to deprecate the older 
> overloaded setParallelism() API.





Failed: TEZ-2330 PreCommit Build #504

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2330
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/504/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 90 lines...]


==
==
Determining number of patched javac warnings.
==
==


/home/jenkins/tools/maven/latest/bin/mvn clean test -DskipTests -Ptest-patch > 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/patchJavacWarnings.txt
 2>&1




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12726998/TEZ-2330.2.patch
  against master revision f46997a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/504//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
64ac4c08483002d810ed05b12d11d82f6c3d1def logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #501
Archived 3 artifacts
Archive block size is 32768
Received 0 blocks and 791005 bytes
Compression is 0.0%
Took 0.59 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Updated] (TEZ-2330) Create reconfigureVertex() API for input based initialization

2015-04-21 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2330:

Attachment: (was: TEZ-2330.2.patch)

> Create reconfigureVertex() API for input based initialization 
> --
>
> Key: TEZ-2330
> URL: https://issues.apache.org/jira/browse/TEZ-2330
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2330.1.patch
>
>
> TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change 
> parallelism of a vertex. Adding a variant to do the same for input 
> initialization based parallelism change would allow us to deprecate the older 
> overloaded setParallelism() API.





[jira] [Updated] (TEZ-2330) Create reconfigureVertex() API for input based initialization

2015-04-21 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2330:

Attachment: TEZ-2330.2.patch

Removing bad diff file. Attaching verified diff file.

> Create reconfigureVertex() API for input based initialization 
> --
>
> Key: TEZ-2330
> URL: https://issues.apache.org/jira/browse/TEZ-2330
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch
>
>
> TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change 
> parallelism of a vertex. Adding a variant to do the same for input 
> initialization based parallelism change would allow us to deprecate the older 
> overloaded setParallelism() API.





[jira] [Created] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class

2015-04-21 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-2352:
---

 Summary: Move getTaskStatistics into the RuntimeTask class
 Key: TEZ-2352
 URL: https://issues.apache.org/jira/browse/TEZ-2352
 Project: Apache Tez
  Issue Type: Task
Reporter: Siddharth Seth
Assignee: Siddharth Seth








[jira] [Updated] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class

2015-04-21 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2352:

Attachment: TEZ-2352.1.txt

Simple patch to move this into RuntimeTask, so that TaskReporter relies on 
RuntimeTask instead of the actual implementation.

> Move getTaskStatistics into the RuntimeTask class
> -
>
> Key: TEZ-2352
> URL: https://issues.apache.org/jira/browse/TEZ-2352
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2352.1.txt
>
>






[jira] [Commented] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class

2015-04-21 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505863#comment-14505863
 ] 

Siddharth Seth commented on TEZ-2352:
-

[~bikassaha], [~rajesh.balamohan] - review please.

> Move getTaskStatistics into the RuntimeTask class
> -
>
> Key: TEZ-2352
> URL: https://issues.apache.org/jira/browse/TEZ-2352
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2352.1.txt
>
>






Success: TEZ-2292 PreCommit Build #503

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2292
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/503/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2774 lines...]
[INFO] Final Memory: 70M/945M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12726994/TEZ-2292.2.patch
  against master revision f46997a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/503//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/503//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
acef431a5ac778003f2d56428bf296a71b911af9 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #501
Archived 44 artifacts
Archive block size is 32768
Received 4 blocks and 2624892 bytes
Compression is 4.8%
Took 0.64 sec
Description set: TEZ-2292
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2292) Add e2e test for error reporting when vertex manager invokes plugin APIs

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505900#comment-14505900
 ] 

TezQA commented on TEZ-2292:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12726994/TEZ-2292.2.patch
  against master revision f46997a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/503//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/503//console

This message is automatically generated.

> Add e2e test for error reporting when vertex manager invokes plugin APIs
> 
>
> Key: TEZ-2292
> URL: https://issues.apache.org/jira/browse/TEZ-2292
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: TEZ-2292.1.patch, TEZ-2292.2.patch
>
>
> If the Vertex Manager has an error or cannot apply a required reconfiguration 
> then it should be allowed to fail the vertex.





Failed: TEZ-2330 PreCommit Build #505

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2330
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/505/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2772 lines...]



{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727003/TEZ-2330.2.patch
  against master revision f46997a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 175 javac 
compiler warnings (more than the master's current 174 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/505//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/505//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/505//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/505//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
b2149840de3a95d0490ab745f0e9a98e621eada0 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #503
Archived 45 artifacts
Archive block size is 32768
Received 4 blocks and 2633450 bytes
Compression is 4.7%
Took 1.4 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2330) Create reconfigureVertex() API for input based initialization

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505935#comment-14505935
 ] 

TezQA commented on TEZ-2330:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727003/TEZ-2330.2.patch
  against master revision f46997a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 175 javac 
compiler warnings (more than the master's current 174 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/505//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/505//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/505//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/505//console

This message is automatically generated.

> Create reconfigureVertex() API for input based initialization 
> --
>
> Key: TEZ-2330
> URL: https://issues.apache.org/jira/browse/TEZ-2330
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch
>
>
> TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change 
> parallelism of a vertex. Adding a variant to do the same for input 
> initialization based parallelism change would allow us to deprecate the older 
> overloaded setParallelism() API.





Failed: TEZ-2352 PreCommit Build #506

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2352
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/506/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2771 lines...]



{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727004/TEZ-2352.1.txt
  against master revision f46997a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/506//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/506//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
0fedefee7360c15ba6deeb223129c2c8c138573a logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #503
Archived 44 artifacts
Archive block size is 32768
Received 8 blocks and 2495961 bytes
Compression is 9.5%
Took 0.58 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Created] (TEZ-2353) Javadoc warnings in master

2015-04-21 Thread Hitesh Shah (JIRA)
Hitesh Shah created TEZ-2353:


 Summary: Javadoc warnings in master 
 Key: TEZ-2353
 URL: https://issues.apache.org/jira/browse/TEZ-2353
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Priority: Minor


[WARNING] 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/tez-examples/src/main/java/org/apache/tez/examples/SortMergeJoinExample.java:144:
 warning - @return tag has no arguments.
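
For context, javadoc emits this warning when an @return tag has no description after it. A minimal sketch of the corrected form follows; the class and method are hypothetical and are not the code at SortMergeJoinExample.java:144.

```java
// Hypothetical illustration; names are not taken from SortMergeJoinExample.
final class ReturnTagExample {

  // A bare "@return" with no text after it is what triggers
  // "warning - @return tag has no arguments." The tag below is the fixed form.

  /**
   * Counts the whitespace-separated tokens in a line.
   *
   * @param line the input line; must not be null
   * @return the number of whitespace-separated tokens in {@code line}
   */
  static int countTokens(String line) {
    String trimmed = line.trim();
    return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
  }
}
```

If the method's return value needs no explanation beyond the summary sentence, the usual fix is still to give the tag a short description rather than leave it empty.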







[jira] [Commented] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505948#comment-14505948
 ] 

TezQA commented on TEZ-2352:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727004/TEZ-2352.1.txt
  against master revision f46997a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/506//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/506//console

This message is automatically generated.

> Move getTaskStatistics into the RuntimeTask class
> -
>
> Key: TEZ-2352
> URL: https://issues.apache.org/jira/browse/TEZ-2352
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2352.1.txt
>
>






[jira] [Created] (TEZ-2354) Code does not use TezRuntimeConfiguration#TEZ_RUNTIME_IO_FILE_BUFFER_SIZE

2015-04-21 Thread Hitesh Shah (JIRA)
Hitesh Shah created TEZ-2354:


 Summary: Code does not use 
TezRuntimeConfiguration#TEZ_RUNTIME_IO_FILE_BUFFER_SIZE
 Key: TEZ-2354
 URL: https://issues.apache.org/jira/browse/TEZ-2354
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Siddharth Seth


Multiple occurrences exist for:

{code}
conf.getInt("io.file.buffer.size", 
TezRuntimeConfiguration.TEZ_RUNTIME_IFILE_BUFFER_SIZE_DEFAULT);
{code}
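
A rough sketch of why the hard-coded key matters, assuming the intent is that the lookup should go through the constant named in the summary. java.util.Properties stands in for the real configuration object, and the key string and default value below are placeholders, not the real TezRuntimeConfiguration values.

```java
import java.util.Properties;

// Stand-in for the real configuration classes; all values are placeholders.
final class BufferSizeLookup {
  // Hypothetical stand-ins for the TezRuntimeConfiguration fields named in this issue.
  static final String TEZ_RUNTIME_IO_FILE_BUFFER_SIZE = "example.runtime.io.file.buffer.size";
  static final int TEZ_RUNTIME_IFILE_BUFFER_SIZE_DEFAULT = 4096;

  // Pattern reported in the issue: the raw key string is hard-coded at the call site.
  static int readWithLiteralKey(Properties conf) {
    return parse(conf.getProperty("io.file.buffer.size"));
  }

  // Pattern the summary asks for: look the value up through the named constant.
  static int readWithRuntimeKey(Properties conf) {
    return parse(conf.getProperty(TEZ_RUNTIME_IO_FILE_BUFFER_SIZE));
  }

  // Falls back to the default when the key is absent.
  private static int parse(String value) {
    return value == null ? TEZ_RUNTIME_IFILE_BUFFER_SIZE_DEFAULT : Integer.parseInt(value.trim());
  }
}
```

In this sketch, a value set under the runtime key is ignored by readWithLiteralKey, which quietly falls back to the default.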






[jira] [Created] (TEZ-2355) TezRuntimeConfiguration inconsistencies in field names

2015-04-21 Thread Hitesh Shah (JIRA)
Hitesh Shah created TEZ-2355:


 Summary: TezRuntimeConfiguration inconsistencies in field names
 Key: TEZ-2355
 URL: https://issues.apache.org/jira/browse/TEZ-2355
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Rajesh Balamohan


TEZ_RUNTIME_INPUT_BUFFER_PERCENT_DEFAULT compared to 
TEZ_RUNTIME_INPUT_POST_MERGE_BUFFER_PERCENT

TEZ_RUNTIME_SHUFFLE_STALLED_COPY_TIMEOUT_DEFAULT compared to 
TEZ_RUNTIME_SHUFFLE_CONNECT_TIMEOUT

Given that this is a public API, we will need to deprecate the inconsistent 
names rather than remove them.
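
One way this could look, assuming the inconsistency is that a default's name does not share the stem of the key it belongs to. The class, field names and values below are hypothetical and are not the real TezRuntimeConfiguration fields.

```java
// Hypothetical stand-in; field names and values are illustrative only.
final class ExampleRuntimeConfiguration {
  static final String EXAMPLE_POST_MERGE_BUFFER_PERCENT = "example.post-merge.buffer.percent";

  // New, consistently named default: same stem as the key it belongs to.
  static final float EXAMPLE_POST_MERGE_BUFFER_PERCENT_DEFAULT = 0.0f;

  /**
   * Old name, retained so that existing callers keep compiling.
   *
   * @deprecated use {@link #EXAMPLE_POST_MERGE_BUFFER_PERCENT_DEFAULT} instead.
   */
  @Deprecated
  static final float EXAMPLE_BUFFER_PERCENT_DEFAULT = EXAMPLE_POST_MERGE_BUFFER_PERCENT_DEFAULT;
}
```

Aliasing the old constant to the new one keeps a single source of truth for the value while callers migrate.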





[jira] [Commented] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class

2015-04-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506025#comment-14506025
 ] 

Bikas Saha commented on TEZ-2352:
-

Shouldn't the member variable also move up into RuntimeTask? I don't think it 
should be duplicated in a derivation of RuntimeTask. Similar to counters.

Though when I was making the change, I think I hit some issue with having this 
in RuntimeTask, which I cannot recall now, so I moved it to LIORuntimeTask. If 
this works (including moving the taskStatistics member variable up to 
RuntimeTask), then perhaps I had misunderstood something else that could have 
led to the issue.

@Override in LIORuntimeTask
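
A simplified sketch of the shape being discussed, with the statistics field and its getter held in the base class so that a reporter depends only on the base type. All class names below are hypothetical stand-ins, not the actual Tez classes or the attached patch.

```java
// Simplified, hypothetical stand-ins for the classes discussed in this thread.
final class TaskStatisticsSketch {
  private int entries;

  void addEntry() {
    entries++;
  }

  int size() {
    return entries;
  }
}

abstract class RuntimeTaskSketch {
  // Held once in the base class instead of being re-declared in each subclass.
  private final TaskStatisticsSketch taskStatistics = new TaskStatisticsSketch();

  TaskStatisticsSketch getTaskStatistics() {
    return taskStatistics;
  }

  abstract void run();
}

final class LioRuntimeTaskSketch extends RuntimeTaskSketch {
  @Override
  void run() {
    // The subclass records into the inherited statistics object.
    getTaskStatistics().addEntry();
  }
}

final class TaskReporterSketch {
  // Depends only on the base type, not on the concrete task implementation.
  static int report(RuntimeTaskSketch task) {
    return task.getTaskStatistics().size();
  }
}
```

With this layout, a subclass has no statistics field of its own to drift out of sync with the base class.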


> Move getTaskStatistics into the RuntimeTask class
> -
>
> Key: TEZ-2352
> URL: https://issues.apache.org/jira/browse/TEZ-2352
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2352.1.txt
>
>






[jira] [Updated] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class

2015-04-21 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2352:

Attachment: TEZ-2352.2.txt

Updated with stats moved to RuntimeTask.

> Move getTaskStatistics into the RuntimeTask class
> -
>
> Key: TEZ-2352
> URL: https://issues.apache.org/jira/browse/TEZ-2352
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2352.1.txt, TEZ-2352.2.txt
>
>






[jira] [Updated] (TEZ-2330) Create reconfigureVertex() API for input based initialization

2015-04-21 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2330:

Attachment: TEZ-2330.3.patch

Rebasing after recent commits.

> Create reconfigureVertex() API for input based initialization 
> --
>
> Key: TEZ-2330
> URL: https://issues.apache.org/jira/browse/TEZ-2330
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch, TEZ-2330.3.patch
>
>
> TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change 
> parallelism of a vertex. Adding a variant to do the same for input 
> initialization based parallelism change would allow us to deprecate the older 
> overloaded setParallelism() API.





[jira] [Commented] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class

2015-04-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506177#comment-14506177
 ] 

Bikas Saha commented on TEZ-2352:
-

LGTM. There may be an unused import for TaskStatistics in LIORuntimeTask.

> Move getTaskStatistics into the RuntimeTask class
> -
>
> Key: TEZ-2352
> URL: https://issues.apache.org/jira/browse/TEZ-2352
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2352.1.txt, TEZ-2352.2.txt
>
>






[jira] [Commented] (TEZ-2327) NPE in shuffle

2015-04-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506193#comment-14506193
 ] 

Sergey Shelukhin commented on TEZ-2327:
---

[~sseth] I hit this again. In q17, it happens for me every few runs (that's 
the one with 2x 1009 reducers).

> NPE in shuffle
> --
>
> Key: TEZ-2327
> URL: https://issues.apache.org/jira/browse/TEZ-2327
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
>
> {noformat}
> 2015-04-15 15:19:46,529 INFO [Dispatcher thread: Central] 
> history.HistoryEventHandler: 
> [HISTORY][DAG:dag_1428572510173_0219_1][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Reducer 2, taskAttemptId=attempt_1428572510173_0219_1_08_000872_0, 
> startTime=1429136298733, finishTime=1429136386528, timeTaken=87795, 
> status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while 
> running task:java.lang.NullPointerException
>at sun.net.www.http.KeepAliveStream.close(KeepAliveStream.java:93)
>at java.io.FilterInputStream.close(FilterInputStream.java:181)
>at 
> sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.close(HttpURLConnection.java:3395)
>at java.io.BufferedInputStream.close(BufferedInputStream.java:483)
>at java.io.FilterInputStream.close(FilterInputStream.java:181)
>at 
> org.apache.tez.runtime.library.common.shuffle.HttpConnection.cleanup(HttpConnection.java:278)
>at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:644)
>at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:634)
>at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdown(Fetcher.java:629)
>at 
> org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager.shutdown(ShuffleManager.java:759)
>at 
> org.apache.tez.runtime.library.input.UnorderedKVInput.close(UnorderedKVInput.java:209)
>at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:347)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:182)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
>at java.security.AccessController.doPrivileged(Native Method)
>at javax.security.auth.Subject.doAs(Subject.java:422)
>at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
>at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This caused the task in question to fail





[jira] [Commented] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506234#comment-14506234
 ] 

TezQA commented on TEZ-2352:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727060/TEZ-2352.2.txt
  against master revision c6e400e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/507//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/507//console

This message is automatically generated.

> Move getTaskStatistics into the RuntimeTask class
> -
>
> Key: TEZ-2352
> URL: https://issues.apache.org/jira/browse/TEZ-2352
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2352.1.txt, TEZ-2352.2.txt
>
>






Failed: TEZ-2352 PreCommit Build #507

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2352
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/507/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 1232 lines...]

  Running tests 
  /home/jenkins/tools/maven/latest/bin/mvn clean install -fn -DTezPatchProcess
cat: 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/testrun.txt:
 No such file or directory
awk: cannot open 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/testrun.txt
 (No such file or directory)




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727060/TEZ-2352.2.txt
  against master revision c6e400e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/507//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/507//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
04e6074a66f66be609643b903ad10ee55bfd4f24 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

Failed: TEZ-2330 PreCommit Build #508

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2330
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/508/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2774 lines...]




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727080/TEZ-2330.3.patch
  against master revision c6e400e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 160 javac 
compiler warnings (more than the master's current 159 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/508//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/508//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/508//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
251ae3b565fdb2ef353d9ad85cf61ac9d982b847 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #503
Archived 45 artifacts
Archive block size is 32768
Received 4 blocks and 2618264 bytes
Compression is 4.8%
Took 0.7 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2330) Create reconfigureVertex() API for input based initialization

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506252#comment-14506252
 ] 

TezQA commented on TEZ-2330:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727080/TEZ-2330.3.patch
  against master revision c6e400e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 160 javac 
compiler warnings (more than the master's current 159 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/508//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/508//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/508//console

This message is automatically generated.

> Create reconfigureVertex() API for input based initialization 
> --
>
> Key: TEZ-2330
> URL: https://issues.apache.org/jira/browse/TEZ-2330
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch, TEZ-2330.3.patch
>
>
> TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change 
> parallelism of a vertex. Adding a variant to do the same for input 
> initialization based parallelism change would allow us to deprecate the older 
> overloaded setParallelism() API.





[jira] [Commented] (TEZ-2330) Create reconfigureVertex() API for input based initialization

2015-04-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506261#comment-14506261
 ] 

Bikas Saha commented on TEZ-2330:
-

The javac warning seems to be unrelated to this patch, since the patch does not 
touch getTaskContainer:
77a78
> [WARNING] 
> /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexManager.java:[298,34]
>  [deprecation] getTaskContainer(String,Integer) in VertexManagerPluginContext 
> has been deprecated

Committing in a bit.

> Create reconfigureVertex() API for input based initialization 
> --
>
> Key: TEZ-2330
> URL: https://issues.apache.org/jira/browse/TEZ-2330
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch, TEZ-2330.3.patch
>
>
> TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change 
> parallelism of a vertex. Adding a variant to do the same for input 
> initialization based parallelism change would allow us to deprecate the older 
> overloaded setParallelism() API.





[jira] [Commented] (TEZ-2327) NPE in shuffle

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506326#comment-14506326
 ] 

Hitesh Shah commented on TEZ-2327:
--

What branch is this against? 

> NPE in shuffle
> --
>
> Key: TEZ-2327
> URL: https://issues.apache.org/jira/browse/TEZ-2327
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
>
> {noformat}
> 2015-04-15 15:19:46,529 INFO [Dispatcher thread: Central] history.HistoryEventHandler: [HISTORY][DAG:dag_1428572510173_0219_1][Event:TASK_ATTEMPT_FINISHED]: vertexName=Reducer 2, taskAttemptId=attempt_1428572510173_0219_1_08_000872_0, startTime=1429136298733, finishTime=1429136386528, timeTaken=87795, status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while running task:java.lang.NullPointerException
>   at sun.net.www.http.KeepAliveStream.close(KeepAliveStream.java:93)
>   at java.io.FilterInputStream.close(FilterInputStream.java:181)
>   at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.close(HttpURLConnection.java:3395)
>   at java.io.BufferedInputStream.close(BufferedInputStream.java:483)
>   at java.io.FilterInputStream.close(FilterInputStream.java:181)
>   at org.apache.tez.runtime.library.common.shuffle.HttpConnection.cleanup(HttpConnection.java:278)
>   at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:644)
>   at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:634)
>   at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdown(Fetcher.java:629)
>   at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager.shutdown(ShuffleManager.java:759)
>   at org.apache.tez.runtime.library.input.UnorderedKVInput.close(UnorderedKVInput.java:209)
>   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:347)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:182)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This caused the task in question to fail





[jira] [Resolved] (TEZ-2338) Tez job failed due to AM Container-Launch failure at windows

2015-04-21 Thread Kaveen Raajan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaveen Raajan resolved TEZ-2338.

Resolution: Done

Fix we tried:
We downloaded a DLL-file fixer and reinstalled the MSVCR100.dll file on the 
NM machine.
After that, we ran a MapReduce program as a Tez job; it was submitted and 
completed successfully, and no issue occurred.
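
For anyone reading the exit code in the report below: -1073741515, read as an unsigned 32-bit value, is 0xC0000135, which Windows defines as STATUS_DLL_NOT_FOUND. That is consistent with the missing MSVCR100.dll. A minimal sketch of the conversion follows; the class and method names are hypothetical and not part of Hadoop or Tez.

```java
// Hypothetical helper, not part of Hadoop or Tez: formats a Windows process
// exit code as the unsigned 32-bit NTSTATUS value it usually represents.
final class ExitCodeHex {
    static String toNtStatusHex(int exitCode) {
        // Integer.toHexString treats the int as unsigned, so negative exit
        // codes land in the 0x80000000..0xFFFFFFFF range.
        return "0x" + Integer.toHexString(exitCode).toUpperCase();
    }

    public static void main(String[] args) {
        // Exit code reported for the failed AM container in this issue.
        System.out.println(toNtStatusHex(-1073741515)); // prints 0xC0000135
    }
}
```

Looking the resulting hex value up in the NTSTATUS tables is usually quicker than searching for the signed decimal form.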

> Tez job failed due to AM Container-Launch failure at windows
> 
>
> Key: TEZ-2338
> URL: https://issues.apache.org/jira/browse/TEZ-2338
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Windows server 2012 and Windows-8
> Hadoop-2.5.2
> Java-1.7
>Reporter: Kaveen Raajan
>
> I successfully built Tez-0.6.0 against Hadoop-2.5.2.
> Then I configured Tez-0.6.0 as described in http://tez.apache.org/install.html
> I moved the Tez lib package to an HDFS location and updated my tez-site.xml:
> {code:xml}
> <property>
>   <name>tez.lib.uris</name>
>   <value>${fs.default.name}/apps/Tez/,${fs.default.name}/apps/Tez/lib/</value>
> </property>
> {code}
> After that I tried the sample test for Tez:
> _hadoop jar tez-examples-0.6.0.jar orderedwordcount  _
> But I get the following error when running this command.
> *Note:* I'm using a Hadoop High Availability setup.
> {code}
> Running OrderedWordCount
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/C:/Hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/C:/Tez/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.6.0, revision=${buildNumber}, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ]
> 15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: application_1429073725727_0005
> 15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> 15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: hdfs://HACluster/apps/Tez/,hdfs://HACluster/apps/Tez/lib/
> 15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/staging doesn't exist and is created
> 15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory hdfs://HACluster/tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't exist and is created
> 15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1429073725727_0005, dagName=OrderedWordCount
> 15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application application_1429073725727_0005
> 15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: http://MASTER_NN1:8088/proxy/application_1429073725727_0005/
> 15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running
> 15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED
> OrderedWordCount failed with diagnostics: [Application application_1429073725727_0005 failed 2 times due to AM Container for appattempt_1429073725727_0005_02 exited with  exitCode: -1073741515 due to: Exception from container-launch: ExitCodeException exitCode=-1073741515:
> ExitCodeException exitCode=-1073741515:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
> at org.apache.hadoop.util.Shell.run(Shell.java:455)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> 1 file(s) moved.
> Container exited with a non-zero exit code -1073741515
> .Failing this attempt.. Failing the application.]
> {code}
> The ResourceManager log shows:
> {code}
> 2015-04-19 21:49:57,533 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> completedContainer container=Container:

[jira] [Updated] (TEZ-2348) EOF exception during UnorderedKVReader.next()

2015-04-21 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2348:
--
Attachment: TEZ-2348.1.patch

[~jdere] - The exception from IFile is valid, as a higher-level API is not 
expected to call nextRawKey() once the end of file is reached.  Can you please 
check whether your Hive patch calls UnorderedKVReader.next() even after it 
returns false?  In such situations, this error is possible.

For instance, you can check for "readers.UnorderedKVReader: Num Records read:" 
in the task attempt log. This indicates that the UnorderedKVReader has finished 
processing and no more data is available. However, if higher-level APIs invoke 
UnorderedKVReader.next() again, it ends up throwing an EOF exception from 
IFile.  

Attaching the patch, which handles this situation on the Tez side.  [~sseth] - Can 
you please review?

> EOF exception during UnorderedKVReader.next()
> -
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
>Reporter: Jason Dere
> Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz
>
>
> {noformat}
> Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278)
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
>   ... 13 more
> Caused by: java.io.IOException: Reached EOF. Completed reading 516605
>   at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817)
>   at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698)
>   at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731)
>   at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727)
>   at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151)
>   at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112)
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439)
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232)
>   ... 15 more
> {noformat}





[jira] [Assigned] (TEZ-2348) EOF exception during UnorderedKVReader.next()

2015-04-21 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned TEZ-2348:
-

Assignee: Rajesh Balamohan

> EOF exception during UnorderedKVReader.next()
> -
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
>Reporter: Jason Dere
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz
>
>


[jira] [Commented] (TEZ-2348) EOF exception during UnorderedKVReader.next()

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506407#comment-14506407
 ] 

Hitesh Shah commented on TEZ-2348:
--

[~rajesh.balamohan] Does the same issue hold for the other readers?

> EOF exception during UnorderedKVReader.next()
> -
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
>Reporter: Jason Dere
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz
>
>


[jira] [Commented] (TEZ-2348) EOF exception during UnorderedKVReader.next()

2015-04-21 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506418#comment-14506418
 ] 

Gopal V commented on TEZ-2348:
--

[~rajesh.balamohan]: this needs a different exception with better messaging.

With this particular fix, someone writing incorrect code during development 
would end up with an infinite loop:

{code}
while (true) { 
  reader.next();
  ...
  if (key check) { break; }
}
{code}

Some sort of exception on bad usage is far easier to debug than an infinite 
loop condition.
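
For contrast, the loop that avoids the problem lets the return value of next() terminate it. The sketch below uses a hypothetical in-memory reader (ListReader is not a Tez class) purely to show that shape.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a key-value reader; not the Tez KeyValueReader.
final class ListReader {
    private final Iterator<String> it;
    private String current;

    ListReader(List<String> keys) {
        this.it = keys.iterator();
    }

    // Advances to the next key; returns false once the input is exhausted.
    boolean next() {
        if (!it.hasNext()) {
            return false;
        }
        current = it.next();
        return true;
    }

    String getCurrentKey() {
        return current;
    }
}

final class ReaderLoopDemo {
    // Counts keys by letting the return value of next() end the loop,
    // rather than looping on while (true) and breaking on a key check.
    static int countKeys(ListReader reader) {
        int count = 0;
        while (reader.next()) {
            if (reader.getCurrentKey() != null) {
                count++;
            }
        }
        return count;
    }
}
```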

> EOF exception during UnorderedKVReader.next()
> -
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
>Reporter: Jason Dere
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz
>
>


[jira] [Commented] (TEZ-2348) EOF exception during UnorderedKVReader.next()

2015-04-21 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506419#comment-14506419
 ] 

Rajesh Balamohan commented on TEZ-2348:
---

Sure. I will add the checks in the next() method (basically, it needs to throw 
an exception when the reader's next() is called after it has returned false).  
Currently the exception is thrown from IFile, which is somewhat misleading.
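
A rough sketch of the kind of check described above, assuming a simple "exhausted" flag. GuardedReader and its members are hypothetical names for illustration; this is not the TEZ-2348 patch.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

// Hypothetical reader illustrating the guard; not the actual Tez code.
final class GuardedReader {
    private final Iterator<String> it;
    private boolean exhausted = false;

    GuardedReader(List<String> records) {
        this.it = records.iterator();
    }

    boolean next() throws IOException {
        if (exhausted) {
            // Fail loudly with a message that points at the caller's misuse.
            throw new IOException("next() called after it already returned false");
        }
        if (!it.hasNext()) {
            exhausted = true;
            return false;
        }
        it.next();
        return true;
    }

    // Drains the reader, then reports whether one extra next() call throws.
    static boolean extraCallThrows(List<String> records) {
        GuardedReader reader = new GuardedReader(records);
        try {
            while (reader.next()) {
                // consume all records
            }
        } catch (IOException e) {
            return false; // not expected while draining
        }
        try {
            reader.next();
            return false;
        } catch (IOException e) {
            return true;
        }
    }
}
```

With a guard like this, the misuse surfaces at the reader's own API boundary instead of deeper in the file-format layer.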

> EOF exception during UnorderedKVReader.next()
> -
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
>Reporter: Jason Dere
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz
>
>


[jira] [Created] (TEZ-2356) TEZ-2292 breaks VertexManagerPluginContext.reconfigureVertex api

2015-04-21 Thread Thejas M Nair (JIRA)
Thejas M Nair created TEZ-2356:
--

 Summary: TEZ-2292 breaks 
VertexManagerPluginContext.reconfigureVertex api
 Key: TEZ-2356
 URL: https://issues.apache.org/jira/browse/TEZ-2356
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Thejas M Nair
Priority: Blocker


This breaks pig compilation and needs urgent attention.

{code}
src/org/apache/pig/backend/hadoop/executionengine/tez/runtime/PigGraceShuffleVertexManager.java:173: error: exception TezException is never thrown in body of corresponding try statement
[javac] } catch (TezException e) {
[javac]   ^
{code}
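
For readers unfamiliar with this javac error: Java rejects a catch clause for a checked exception that nothing in the try body is declared to throw. Presumably the method called inside that try block no longer declares TezException, though the report does not show the signature. The sketch below uses hypothetical types (DemoCheckedException, DemoContext, DemoCaller) only to show the mechanism.

```java
// Hypothetical types; they only mimic the shape of the problem.
class DemoCheckedException extends Exception {
    DemoCheckedException(String msg) {
        super(msg);
    }
}

final class DemoContext {
    // While this method declares the checked exception, callers may catch it.
    // If the "throws" clause were removed, the catch block in DemoCaller would
    // no longer compile: "exception DemoCheckedException is never thrown in
    // body of corresponding try statement".
    void reconfigure(int parallelism) throws DemoCheckedException {
        if (parallelism <= 0) {
            throw new DemoCheckedException("parallelism must be positive");
        }
    }
}

final class DemoCaller {
    // Returns true if reconfiguration succeeded, false if it was rejected.
    static boolean tryReconfigure(DemoContext ctx, int parallelism) {
        try {
            ctx.reconfigure(parallelism);
            return true;
        } catch (DemoCheckedException e) {
            return false;
        }
    }
}
```

This is why dropping a checked exception from a public method signature is a source-incompatible change for callers that catch it.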





Success: TEZ-2348 PreCommit Build #509

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2348
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/509/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2770 lines...]
[INFO] Final Memory: 73M/982M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727113/TEZ-2348.1.patch
  against master revision ec45c51.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/509//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/509//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
2a11b9a6392368667d5c747d76a531f63089e691 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #503
Archived 44 artifacts
Archive block size is 32768
Received 2 blocks and 2679005 bytes
Compression is 2.4%
Took 1.3 sec
Description set: TEZ-2348
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2348) EOF exception during UnorderedKVReader.next()

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506426#comment-14506426
 ] 

TezQA commented on TEZ-2348:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727113/TEZ-2348.1.patch
  against master revision ec45c51.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/509//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/509//console

This message is automatically generated.

> EOF exception during UnorderedKVReader.next()
> -
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
>Reporter: Jason Dere
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz
>
>


[jira] [Commented] (TEZ-2327) NPE in shuffle

2015-04-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506448#comment-14506448
 ] 

Sergey Shelukhin commented on TEZ-2327:
---

LLAP

> NPE in shuffle
> --
>
> Key: TEZ-2327
> URL: https://issues.apache.org/jira/browse/TEZ-2327
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
>


[jira] [Updated] (TEZ-2348) EOF exception during UnorderedKVReader.next()

2015-04-21 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2348:
--
Attachment: TEZ-2348.2.patch

Attaching the patch, which throws an IOException when reader.next() is called 
again after it has returned false. 

Question: other readers (e.g. MRReaderMapReduce) also return false when 
invoked repeatedly, so theoretically they can also get into a tight loop with 
the example code you posted.  Since the example usage is given in 
KeyValueReader/KeyValuesReader, is it safe to assume that people will check 
the return value and not write infinite-loop code?

[~hitesh] - Other readers return false when multiple invocations are made.  
However, in the case of UnorderedKVReader, it was getting into the IFile path 
due to a stale reference in currentReader.  The first patch removed the stale 
reference and returned false (like other readers).


> EOF exception during UnorderedKVReader.next()
> -
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
>Reporter: Jason Dere
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2348.1.patch, TEZ-2348.2.patch, _tez_session_dir.tgz
>
>


[jira] [Updated] (TEZ-2327) NPE in shuffle

2015-04-21 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2327:
-
Affects Version/s: TEZ-2003

> NPE in shuffle
> --
>
> Key: TEZ-2327
> URL: https://issues.apache.org/jira/browse/TEZ-2327
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: TEZ-2003
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
>
> {noformat}
> 2015-04-15 15:19:46,529 INFO [Dispatcher thread: Central] 
> history.HistoryEventHandler: 
> [HISTORY][DAG:dag_1428572510173_0219_1][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Reducer 2, taskAttemptId=attempt_1428572510173_0219_1_08_000872_0, 
> startTime=1429136298733, finishTime=1429136386528, timeTaken=87795, 
> status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while 
> running task:java.lang.NullPointerException
>at sun.net.www.http.KeepAliveStream.close(KeepAliveStream.java:93)
>at java.io.FilterInputStream.close(FilterInputStream.java:181)
>at 
> sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.close(HttpURLConnection.java:3395)
>at java.io.BufferedInputStream.close(BufferedInputStream.java:483)
>at java.io.FilterInputStream.close(FilterInputStream.java:181)
>at 
> org.apache.tez.runtime.library.common.shuffle.HttpConnection.cleanup(HttpConnection.java:278)
>at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:644)
>at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:634)
>at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdown(Fetcher.java:629)
>at 
> org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager.shutdown(ShuffleManager.java:759)
>at 
> org.apache.tez.runtime.library.input.UnorderedKVInput.close(UnorderedKVInput.java:209)
>at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:347)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:182)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
>at java.security.AccessController.doPrivileged(Native Method)
>at javax.security.auth.Subject.doAs(Subject.java:422)
>at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
>at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This caused the task in question to fail





[jira] [Commented] (TEZ-2356) TEZ-2292 breaks VertexManagerPluginContext.reconfigureVertex api

2015-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506455#comment-14506455
 ] 

Hitesh Shah commented on TEZ-2356:
--

[~thejas] Is Pig compiling against 0.7.0-SNAPSHOT? The reconfigureVertex API 
was introduced in master only recently. 

cc [~daijy] and [~bikassaha], who have been working on something related to 
this API to address auto-parallelism issues for Pig. 

> TEZ-2292 breaks VertexManagerPluginContext.reconfigureVertex api
> 
>
> Key: TEZ-2356
> URL: https://issues.apache.org/jira/browse/TEZ-2356
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Thejas M Nair
>Priority: Blocker
>
> This breaks pig compilation and needs urgent attention.
> {code}
> src/org/apache/pig/backend/hadoop/executionengine/tez/runtime/PigGraceShuffleVertexManager.java:173:
>  error: exception TezException is never thrown in body of corresponding try 
> statement
> [javac] } catch (TezException e) {
> [javac]   ^
> {code}
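
For context on the compiler message quoted above: javac rejects a catch clause 
for a checked exception when nothing in the try body can throw it, which is 
what happens to callers if a method stops declaring that exception. The sketch 
below uses hypothetical names (SketchException, reconfigureOld, reconfigureNew) 
and is not the Tez or Pig code.

```java
/** Hypothetical sketch of how dropping a "throws" clause breaks existing callers. */
final class CheckedCatchSketch {

    /** Stand-in for a checked exception. */
    static class SketchException extends Exception {
        SketchException(String message) {
            super(message);
        }
    }

    // Older signature: declares the checked exception, so callers must handle it.
    static int reconfigureOld(int parallelism) throws SketchException {
        if (parallelism <= 0) {
            throw new SketchException("parallelism must be positive");
        }
        return parallelism;
    }

    // Newer signature: no checked exception is declared any more.
    static int reconfigureNew(int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        return parallelism;
    }

    // Compiles against the older signature.
    static int callerAgainstOld(int parallelism) {
        try {
            return reconfigureOld(parallelism);
        } catch (SketchException e) {
            return -1;
        }
    }

    // Against the newer signature, keeping "catch (SketchException e)" around this
    // call would fail to compile with: exception SketchException is never thrown
    // in body of corresponding try statement. The catch has to be removed.
    static int callerAgainstNew(int parallelism) {
        return reconfigureNew(parallelism);
    }

    public static void main(String[] args) {
        System.out.println(callerAgainstOld(0));
        System.out.println(callerAgainstNew(4));
    }
}
```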





[jira] [Updated] (TEZ-2248) VertexImpl/DAGImpl.checkForCompletion have too many termination cause checks

2015-04-21 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2248:

Attachment: TEZ-2248-1.patch

Attached the patch. [~bikassaha], please help review it. 

> VertexImpl/DAGImpl.checkForCompletion have too many termination cause checks
> 
>
> Key: TEZ-2248
> URL: https://issues.apache.org/jira/browse/TEZ-2248
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
> Attachments: TEZ-2248-1.patch
>
>
> There is an if check for each termination cause, which makes the code long, and we 
> need to handle each new termination cause with more code. This could be 
> abstracted into a method that gets the termination cause string based on the enum, 
> making this method shorter and more stable. 
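
The refactoring suggested in the description could look roughly like the 
following sketch. The enum constants, messages, and method names here are 
illustrative assumptions, not the ones used in VertexImpl/DAGImpl or in the 
attached patch. Each cause carries its own diagnostic string, so the completion 
check needs a single lookup instead of one if branch per cause.

```java
/** Hypothetical sketch: one diagnostic string per termination cause. */
final class TerminationCauseSketch {

    /** Illustrative causes; adding a new one needs no change in the caller. */
    enum TerminationCause {
        DAG_KILL("killed on user request"),
        VERTEX_FAILURE("a vertex failed"),
        INTERNAL_ERROR("an internal error occurred");

        private final String description;

        TerminationCause(String description) {
            this.description = description;
        }

        String describe() {
            return description;
        }
    }

    /** Builds the diagnostic message without a separate branch per cause. */
    static String completionDiagnostics(String entity, TerminationCause cause) {
        if (cause == null) {
            return entity + " completed successfully";
        }
        return entity + " did not succeed: " + cause.describe();
    }

    public static void main(String[] args) {
        System.out.println(completionDiagnostics("DAG", TerminationCause.DAG_KILL));
        System.out.println(completionDiagnostics("Vertex", null));
    }
}
```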





[jira] [Assigned] (TEZ-2248) VertexImpl/DAGImpl.checkForCompletion have too many termination cause checks

2015-04-21 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang reassigned TEZ-2248:
---

Assignee: Jeff Zhang

> VertexImpl/DAGImpl.checkForCompletion have too many termination cause checks
> 
>
> Key: TEZ-2248
> URL: https://issues.apache.org/jira/browse/TEZ-2248
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Jeff Zhang
> Attachments: TEZ-2248-1.patch
>
>
> There is an if check for each termination cause, which makes the code long, and we 
> need to handle each new termination cause with more code. This could be 
> abstracted into a method that gets the termination cause string based on the enum, 
> making this method shorter and more stable. 





[jira] [Commented] (TEZ-2348) EOF exception during UnorderedKVReader.next()

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506500#comment-14506500
 ] 

TezQA commented on TEZ-2348:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727124/TEZ-2348.2.patch
  against master revision ec45c51.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/510//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/510//console

This message is automatically generated.

> EOF exception during UnorderedKVReader.next()
> -
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
>Reporter: Jason Dere
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2348.1.patch, TEZ-2348.2.patch, _tez_session_dir.tgz
>
>
> {noformat}
> Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. 
> Completed reading 516605
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
>   ... 13 more
> Caused by: java.io.IOException: Reached EOF. Completed reading 516605
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727)
>   at 
> org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151)
>   at 
> org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232)
>   ... 15 more
> {noformat}





Success: TEZ-2348 PreCommit Build #510

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2348
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/510/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2770 lines...]
[INFO] Final Memory: 70M/958M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727124/TEZ-2348.2.patch
  against master revision ec45c51.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/510//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/510//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
45dc387506e817145332bd14b3eba017f50657c4 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #509
Archived 44 artifacts
Archive block size is 32768
Received 8 blocks and 2483952 bytes
Compression is 9.5%
Took 0.63 sec
Description set: TEZ-2348
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (TEZ-2348) EOF exception during UnorderedKVReader.next()

2015-04-21 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2348:
--
Attachment: TEZ-2348.3.patch

Attaching a slightly modified version to reduce conditional checks.

> EOF exception during UnorderedKVReader.next()
> -
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
>Reporter: Jason Dere
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2348.1.patch, TEZ-2348.2.patch, TEZ-2348.3.patch, 
> _tez_session_dir.tgz
>
>
> {noformat}
> Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. 
> Completed reading 516605
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
>   ... 13 more
> Caused by: java.io.IOException: Reached EOF. Completed reading 516605
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727)
>   at 
> org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151)
>   at 
> org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232)
>   ... 15 more
> {noformat}





[jira] [Commented] (TEZ-2340) TestRecoveryParser fails

2015-04-21 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506511#comment-14506511
 ] 

Jeff Zhang commented on TEZ-2340:
-

[~hitesh] Please help review it. 

> TestRecoveryParser fails
> 
>
> Key: TEZ-2340
> URL: https://issues.apache.org/jira/browse/TEZ-2340
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: TEZ-2340-1.patch
>
>
> Stacktrace
> {code}
> java.io.IOException: Not supported
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285)
>   at 
> org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138)
> {code}
> Standard Output
> {code}
> 2015-04-17 07:23:55,672 WARN  [main] fs.FileUtil 
> (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
> [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]:
>  it still exists.
> 2015-04-17 07:23:55,674 WARN  [main] fs.FileUtil 
> (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir 
> [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]:
>  it still exists.
> 2015-04-17 07:23:55,703 INFO  [Thread-5] impl.TestDAGImpl 
> (TestDAGImpl.java:createTestDAGPlan(446)) - Setting up dag plan
> 2015-04-17 07:23:55,722 INFO  [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:serviceInit(109)) - Initializing RecoveryService
> 2015-04-17 07:23:55,723 INFO  [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:serviceStart(127)) - Starting RecoveryService
> 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:handle(314)) - Error handling summary event, 
> eventType=DAG_SUBMITTED
> java.io.IOException: Not supported
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365)
>   at 
> org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285)
>   at 
> org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:601)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:handle(318)) - Adding a flag to ensure next AM attempt 
> does not start up, 
> flagFile=target/org.apache.tez.dag.app.TestRecoveryParser-tmpDir/recovery/1/RecoveryFatalErrorOccurred
> 2015-04-17 07:23:55,725 ERROR [Thread-5] recovery.RecoveryService 
> (RecoveryService.java:handle(323)) - Recovery failure occurred. Skipping all 
> events
> 2015-04-17 07:23:55,756 ERROR [RecoveryEventHandlingThread] 
> recovery.RecoveryService (RecoveryService.java:run(146)) - Recovery failure 
> occurred. Stopping recovery thread. Current eventQueueSize=0
> {code}





[jira] [Commented] (TEZ-2342) TestFaultTolerance.testRandomFailingTasks fails due to timeout

2015-04-21 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506515#comment-14506515
 ] 

Jeff Zhang commented on TEZ-2342:
-

[~hitesh] [~bikassaha] Please help review it. 

> TestFaultTolerance.testRandomFailingTasks fails due to timeout
> --
>
> Key: TEZ-2342
> URL: https://issues.apache.org/jira/browse/TEZ-2342
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Minor
> Attachments: TEZ-2342-1.patch, syslog_dag_1429582868137_0001_1
>
>
> {code}
> Error Message
> test timed out after 12 milliseconds
> Stacktrace
> java.lang.Exception: test timed out after 12 milliseconds
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:126)
>   at 
> org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:114)
>   at 
> org.apache.tez.test.TestFaultTolerance.testRandomFailingTasks(TestFaultTolerance.java:723)
> Standard Output
> 2015-04-17 07:46:10,952 INFO  [main] test.TestFaultTolerance 
> (TestFaultTolerance.java:setup(65)) - Starting mini clusters
> 2015-04-17 07:46:11,508 INFO  [main] hdfs.MiniDFSCluster 
> (MiniDFSCluster.java:(446)) - starting cluster: numNameNodes=1, 
> numDataNodes=1
> Formatting using clusterid: testClusterID
> 2015-04-17 07:46:12,919 INFO  [main] namenode.FSNamesystem 
> (FSNamesystem.java:(716)) - No KeyProvider found.
> 2015-04-17 07:46:12,920 INFO  [main] namenode.FSNamesystem 
> (FSNamesystem.java:(726)) - fsLock is fair:true
> 2015-04-17 07:46:13,021 INFO  [main] Configuration.deprecation 
> (Configuration.java:warnOnceIfDeprecated(1173)) - 
> hadoop.configured.node.mapping is deprecated. Instead, use 
> net.topology.configured.node.mapping
> 2015-04-17 07:46:13,021 INFO  [main] blockmanagement.DatanodeManager 
> (DatanodeManager.java:(239)) - dfs.block.invalidate.limit=1000
> 2015-04-17 07:46:13,022 INFO  [main] blockmanagement.DatanodeManager 
> (DatanodeManager.java:(245)) - 
> dfs.namenode.datanode.registration.ip-hostname-check=true
> 2015-04-17 07:46:13,022 INFO  [main] blockmanagement.BlockManager 
> (InvalidateBlocks.java:printBlockDeletionTime(71)) - 
> dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
> 2015-04-17 07:46:13,025 INFO  [main] blockmanagement.BlockManager 
> (InvalidateBlocks.java:printBlockDeletionTime(76)) - The block deletion will 
> start around 2015 Apr 17 07:46:13
> 2015-04-17 07:46:13,029 INFO  [main] util.GSet 
> (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map 
> BlocksMap
> 2015-04-17 07:46:13,030 INFO  [main] util.GSet 
> (LightWeightGSet.java:computeCapacity(355)) - VM type   = 64-bit
> 2015-04-17 07:46:13,032 INFO  [main] util.GSet 
> (LightWeightGSet.java:computeCapacity(356)) - 2.0% max memory 910.3 MB = 18.2 
> MB
> 2015-04-17 07:46:13,033 INFO  [main] util.GSet 
> (LightWeightGSet.java:computeCapacity(361)) - capacity  = 2^21 = 2097152 
> entries
> 2015-04-17 07:46:13,079 INFO  [main] blockmanagement.BlockManager 
> (BlockManager.java:createBlockTokenSecretManager(365)) - 
> dfs.block.access.token.enable=false
> 2015-04-17 07:46:13,080 INFO  [main] blockmanagement.BlockManager 
> (BlockManager.java:(350)) - defaultReplication = 1
> 2015-04-17 07:46:13,080 INFO  [main] blockmanagement.BlockManager 
> (BlockManager.java:(351)) - maxReplication = 512
> 2015-04-17 07:46:13,083 INFO  [main] blockmanagement.BlockManager 
> (BlockManager.java:(352)) - minReplication = 1
> 2015-04-17 07:46:13,083 INFO  [main] blockmanagement.BlockManager 
> (BlockManager.java:(353)) - maxReplicationStreams  = 2
> 2015-04-17 07:46:13,083 INFO  [main] blockmanagement.BlockManager 
> (BlockManager.java:(354)) - shouldCheckForEnoughRacks  = false
> 2015-04-17 07:46:13,084 INFO  [main] blockmanagement.BlockManager 
> (BlockManager.java:(355)) - replicationRecheckInterval = 3000
> 2015-04-17 07:46:13,084 INFO  [main] blockmanagement.BlockManager 
> (BlockManager.java:(356)) - encryptDataTransfer= false
> 2015-04-17 07:46:13,084 INFO  [main] blockmanagement.BlockManager 
> (BlockManager.java:(357)) - maxNumBlocksToLog  = 1000
> 2015-04-17 07:46:13,115 INFO  [main] namenode.FSNamesystem 
> (FSNamesystem.java:(746)) - fsOwner = jenkins (auth:SIMPLE)
> 2015-04-17 07:46:13,116 INFO  [main] namenode.FSNamesystem 
> (FSNamesystem.java:(747)) - supergroup  = supergroup
> 2015-04-17 07:46:13,116 INFO  [main] namenode.FSNamesystem 
> (FSNamesystem.java:(748)) - isPermissionEnabled = true
> 2015-04-17 07:46:13,116 INFO  [main] namenode.FSNamesystem 
> (FSNamesystem.java:(759)) - HA Enabled: false
> 2015-04-17 07:46:13,120 INFO

[jira] [Updated] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows

2015-04-21 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2341:

Attachment: TEZ-2341-2.patch

Attached the patch (verified only on Linux).
[~bikassaha] [~hitesh] Please help review it. 

> TestMockDAGAppMaster.testBasicCounters fails on windows
> ---
>
> Key: TEZ-2341
> URL: https://issues.apache.org/jira/browse/TEZ-2341
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Minor
> Attachments: TEZ-2341-1.patch, TEZ-2341-2.patch
>
>
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323)
> {code}





Failed: TEZ-2248 PreCommit Build #511

2015-04-21 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2248
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/511/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2770 lines...]



{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727132/TEZ-2248-1.patch
  against master revision ec45c51.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/511//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/511//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
93ca1ca0569d82b4ae3aa988fe7c1aed46a19378 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #510
Archived 44 artifacts
Archive block size is 32768
Received 6 blocks and 2552855 bytes
Compression is 7.2%
Took 2 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2248) VertexImpl/DAGImpl.checkForCompletion have too many termination cause checks

2015-04-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506534#comment-14506534
 ] 

TezQA commented on TEZ-2248:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12727132/TEZ-2248-1.patch
  against master revision ec45c51.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/511//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/511//console

This message is automatically generated.

> VertexImpl/DAGImpl.checkForCompletion have too many termination cause checks
> 
>
> Key: TEZ-2248
> URL: https://issues.apache.org/jira/browse/TEZ-2248
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Jeff Zhang
> Attachments: TEZ-2248-1.patch
>
>
> There is an if check for each termination cause, which makes the code long, and we 
> need to handle each new termination cause with more code. This could be 
> abstracted into a method that gets the termination cause string based on the enum, 
> making this method shorter and more stable. 


