[jira] [Created] (TEZ-2948) Stop using dagName in the dagComplete notification to TaskCommunicators

2015-11-18 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-2948:
---

 Summary: Stop using dagName in the dagComplete notification to 
TaskCommunicators
 Key: TEZ-2948
 URL: https://issues.apache.org/jira/browse/TEZ-2948
 Project: Apache Tez
  Issue Type: Task
Affects Versions: 0.8.0-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (TEZ-2480) TEZ-2003: exception when closing output (ignored)

2015-11-18 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned TEZ-2480:
---

Assignee: Siddharth Seth

> TEZ-2003: exception when closing output (ignored)
> -
>
> Key: TEZ-2480
> URL: https://issues.apache.org/jira/browse/TEZ-2480
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: TEZ-2003
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
> Attachments: TEZ-2480.1.txt
>
>
> Happens a lot in some queries:
> {noformat}
> sershe_20150522112029_d0863b33-8d2f-4b4c-b013-9ef70a2bc586:1_Map 1_8_0)] WARN 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when 
> closing output Reducer 2(cleanup). Exception 
> class=java.lang.NullPointerException, message=null
> java.lang.NullPointerException
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:618)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:81)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:613)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:831)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:608)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1425)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:198)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:768)
> at 
> org.apache.tez.runtime.library.common.sort.impl.TezSpillRecord.<init>(TezSpillRecord.java:64)
> at 
> org.apache.tez.runtime.library.common.sort.impl.TezSpillRecord.<init>(TezSpillRecord.java:56)
> at 
> org.apache.tez.runtime.library.common.sort.impl.TezSpillRecord.<init>(TezSpillRecord.java:51)
> at 
> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.generateEvents(OrderedPartitionedKVOutput.java:209)
> at 
> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.close(OrderedPartitionedKVOutput.java:186)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.cleanup(LogicalIOProcessorRuntimeTask.java:849)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:104)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Can this be fixed or not logged?





[jira] [Commented] (TEZ-2480) TEZ-2003: exception when closing output (ignored)

2015-11-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15011768#comment-15011768
 ] 

Sergey Shelukhin commented on TEZ-2480:
---

+1



[jira] [Commented] (TEZ-2948) Stop using dagName in the dagComplete notification to TaskCommunicators

2015-11-18 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15011797#comment-15011797
 ] 

Hitesh Shah commented on TEZ-2948:
--

+1 pending pre-commit

> Stop using dagName in the dagComplete notification to TaskCommunicators
> ---
>
> Key: TEZ-2948
> URL: https://issues.apache.org/jira/browse/TEZ-2948
> Project: Apache Tez
>  Issue Type: Task
>Affects Versions: 0.8.0-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2948.1.txt
>
>






[jira] [Created] (TEZ-2949) Allow duplicate dag names within session for Tez

2015-11-18 Thread Hitesh Shah (JIRA)
Hitesh Shah created TEZ-2949:


 Summary: Allow duplicate dag names within session for Tez
 Key: TEZ-2949
 URL: https://issues.apache.org/jira/browse/TEZ-2949
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah


Hive would like to let users set hive.query.name (HIVE-12357), which will 
create DAG name clashes within a session. This JIRA is to relax the DAG name 
uniqueness requirement. 





Failed: TEZ-2480 PreCommit Build #1323

2015-11-18 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2480
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1323/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2693 lines...]
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :tez-runtime-internals




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12773051/TEZ-2480.1.txt
  against master revision e5e4fc7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.runtime.api.impl.TestProcessorContext

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1323//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1323//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
3d02b8709a5867f7165e36f23da23441d4e3df47 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  org.apache.tez.runtime.api.impl.TestProcessorContext.testDagNumber

Error Message:
null

Stack Trace:
java.lang.NullPointerException: null
at 
org.apache.tez.runtime.RuntimeTask.notifyProgressInvocation(RuntimeTask.java:109)
at 
org.apache.tez.runtime.api.impl.TezTaskContextImpl.notifyProgress(TezTaskContextImpl.java:178)
at 
org.apache.tez.runtime.api.impl.TezProcessorContextImpl.setProgress(TezProcessorContextImpl.java:97)
at 
org.apache.tez.runtime.api.impl.TestProcessorContext.testDagNumber(TestProcessorContext.java:101)




[jira] [Created] (TEZ-2946) Tez UI: At times the RM returns a huge error message, making the yellow error bar fill the whole screen.

2015-11-18 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2946:
---

 Summary: Tez UI: At times the RM returns a huge error message, making 
the yellow error bar fill the whole screen.
 Key: TEZ-2946
 URL: https://issues.apache.org/jira/browse/TEZ-2946
 Project: Apache Tez
  Issue Type: Bug
Reporter: Sreenath Somarajapuram








[jira] [Created] (TEZ-2947) Tez UI: Timeline, RM & AM requests get into a consecutive loop on the counters page without any delay

2015-11-18 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2947:
---

 Summary: Tez UI: Timeline, RM & AM requests get into a 
consecutive loop on the counters page without any delay
 Key: TEZ-2947
 URL: https://issues.apache.org/jira/browse/TEZ-2947
 Project: Apache Tez
  Issue Type: Bug
Reporter: Sreenath Somarajapuram








[jira] [Updated] (TEZ-2948) Stop using dagName in the dagComplete notification to TaskCommunicators

2015-11-18 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2948:

Attachment: TEZ-2948.1.txt

Moves to using the dag index - which will be unique within an application.

[~hitesh] - please review.
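
To illustrate why an index is a safer key than a name for a completion notification, here is a minimal Python sketch. It is not Tez code: `CommunicatorSketch`, `dag_started`, `dag_complete`, and `running_names` are hypothetical names chosen for illustration. The only assumption taken from the comment above is that the DAG index is unique within an application, while DAG names may repeat.

```python
class CommunicatorSketch:
    """Hypothetical stand-in for a task communicator; not the Tez API."""

    def __init__(self):
        # Per-DAG state keyed by DAG index, assumed unique within an application.
        self._running = {}

    def dag_started(self, dag_index, dag_name):
        self._running[dag_index] = dag_name

    def dag_complete(self, dag_index):
        # Lookup by index stays unambiguous even if two DAGs share a name.
        return self._running.pop(dag_index)

    def running_names(self):
        return sorted(self._running.values())


comm = CommunicatorSketch()
comm.dag_started(1, "query-a")
comm.dag_started(2, "query-a")  # same name, different index
finished = comm.dag_complete(1)
```

Had the state been keyed by name, the second `dag_started` call would have overwritten the first entry, and a completion notification for one DAG could not be told apart from the other.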




[jira] [Commented] (TEZ-2480) TEZ-2003: exception when closing output (ignored)

2015-11-18 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15011861#comment-15011861
 ] 

TezQA commented on TEZ-2480:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12773051/TEZ-2480.1.txt
  against master revision e5e4fc7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.runtime.api.impl.TestProcessorContext

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1323//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1323//console

This message is automatically generated.



[jira] [Updated] (TEZ-2949) Allow duplicate dag names within session for Tez

2015-11-18 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2949:
-
Hadoop Flags: Incompatible change

> Allow duplicate dag names within session for Tez
> 
>
> Key: TEZ-2949
> URL: https://issues.apache.org/jira/browse/TEZ-2949
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>
> Hive would like to support setting hive.query.name ( HIVE-12357 ) by users. 
> Hence this will create dag name clashes. This jira is to relax the dag name 
> uniqueness requirement. 





[jira] [Updated] (TEZ-2949) Allow duplicate dag names within session for Tez

2015-11-18 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2949:
-
Attachment: TEZ-2949.1.patch



Failed: TEZ-2948 PreCommit Build #1324

2015-11-18 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2948
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1324/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2511 lines...]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :tez-api




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12773050/TEZ-2948.1.txt
  against master revision e5e4fc7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.client.TestTezClient

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1324//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1324//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
fc9809f0f7925865f7aafa952cb9741e97c1043e logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Compressed 3.09 MB of artifacts by 21.3% relative to #1306
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  org.apache.tez.client.TestTezClient.testStopRetriesUntilTimeout

Error Message:
test timed out after 5000 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 5000 milliseconds
at java.lang.Thread.sleep(Native Method)
at org.apache.tez.client.TezClient.stop(TezClient.java:589)
at 
org.apache.tez.client.TestTezClient.testStopRetriesUntilTimeout(TestTezClient.java:557)




[jira] [Commented] (TEZ-2948) Stop using dagName in the dagComplete notification to TaskCommunicators

2015-11-18 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15011886#comment-15011886
 ] 

TezQA commented on TEZ-2948:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12773050/TEZ-2948.1.txt
  against master revision e5e4fc7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.client.TestTezClient

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1324//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1324//console

This message is automatically generated.



[jira] [Assigned] (TEZ-2877) Tez UI: Remove duplicate error handling code

2015-11-18 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram reassigned TEZ-2877:
---

Assignee: Sreenath Somarajapuram

> Tez UI: Remove duplicate error handling code
> 
>
> Key: TEZ-2877
> URL: https://issues.apache.org/jira/browse/TEZ-2877
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
>Priority: Minor
>
> 1. Code to display error log and error bar is duplicated everywhere. Create a 
> helper function that accepts an error object and does the same. Also replace 
> all the duplicate code with this function.
> 2. When ATS is down, I see the following message: 
> "error code: Unknown, message: Error while loading tez-app.index.
> Could not retrieve expected data from Timeline Server @ 
> http://localhost:8188/ws/v1/timeline/TEZ_APPLICATION/tez_application_1447798385040_0001"
> It seems wrong to print an unknown error code.
> 3. "Info! Could not fetch application info from RM (yarn system metrics 
> publishing might be disabled), some details might be missing" 
> This message should be changed to "Info! Could not fetch application info 
> from YARN RM/Timeline (yarn system metrics publishing might be disabled), 
> some details might be missing"





[jira] [Updated] (TEZ-2877) Tez UI: Remove duplicate error handling code

2015-11-18 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2877:

Description: 
1. Code to display error log and error bar is duplicated everywhere. Create a 
helper function that accepts an error object and does the same. Also replace 
all the duplicate code with this function.

2. When ATS is down, I see the following message: 
"error code: Unknown, message: Error while loading tez-app.index.
Could not retrieve expected data from Timeline Server @ 
http://localhost:8188/ws/v1/timeline/TEZ_APPLICATION/tez_application_1447798385040_0001"
It seems wrong to print an unknown error code.

3. "Info! Could not fetch application info from RM (yarn system metrics 
publishing might be disabled), some details might be missing" 
This message should be changed to "Info! Could not fetch application info from 
YARN RM/Timeline (yarn system metrics publishing might be disabled), some 
details might be missing"

  was:Code to display error log and error bar is duplicated everywhere. Create 
a helper function that accepts an error object and does the same. Also replace 
all the duplicate code with this function.
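
As a rough illustration of items 1 and 2, the following sketch shows a single helper that accepts an error object and builds the displayed message, leaving the error code out when none is known. The Tez UI is a JavaScript application; Python is used here only to keep the sketch short. The function name `format_error` and the `code` and `message` keys are hypothetical and do not come from the Tez UI code.

```python
def format_error(error):
    """Build one user-facing message from a dict-like error object.

    The 'code' and 'message' keys are a hypothetical shape for illustration.
    """
    message = error.get("message") or "Unknown error"
    code = error.get("code")
    # Omit the code entirely when it is missing, rather than printing "Unknown".
    if code is None:
        return "message: {}".format(message)
    return "error code: {}, message: {}".format(code, message)
```

Every call site would then pass its error object to this one function instead of repeating the formatting logic.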




[jira] [Created] (TEZ-2952) NPE in TestOnFileUnorderedKVOutput

2015-11-18 Thread Jeff Zhang (JIRA)
Jeff Zhang created TEZ-2952:
---

 Summary: NPE in TestOnFileUnorderedKVOutput
 Key: TEZ-2952
 URL: https://issues.apache.org/jira/browse/TEZ-2952
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang


https://builds.apache.org/job/Tez-Build/1316/console
{code}
testWithPipelinedShuffle(org.apache.tez.runtime.library.output.TestOnFileUnorderedKVOutput)
  Time elapsed: 0.815 sec  <<< ERROR!
java.lang.NullPointerException: null
at 
org.apache.tez.runtime.RuntimeTask.notifyProgressInvocation(RuntimeTask.java:109)
at 
org.apache.tez.runtime.api.impl.TezTaskContextImpl.notifyProgress(TezTaskContextImpl.java:178)
at 
org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter.write(UnorderedPartitionedKVWriter.java:323)
at 
org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter.write(UnorderedPartitionedKVWriter.java:256)
at 
org.apache.tez.runtime.library.output.TestOnFileUnorderedKVOutput.testWithPipelinedShuffle(TestOnFileUnorderedKVOutput.java:179)

testGeneratedDataMovementEvent(org.apache.tez.runtime.library.output.TestOnFileUnorderedKVOutput)
  Time elapsed: 0.082 sec  <<< ERROR!
java.lang.NullPointerException: null
at 
org.apache.tez.runtime.RuntimeTask.notifyProgressInvocation(RuntimeTask.java:109)
at 
org.apache.tez.runtime.api.impl.TezTaskContextImpl.notifyProgress(TezTaskContextImpl.java:178)
at 
org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter.write(UnorderedPartitionedKVWriter.java:253)
at 
org.apache.tez.runtime.library.output.TestOnFileUnorderedKVOutput.testGeneratedDataMovementEvent(TestOnFileUnorderedKVOutput.java:138)
{code}





[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012492#comment-15012492
 ] 

Rajesh Balamohan commented on TEZ-2950:
---

sizePerBuffer seems to be low according to the log. 
Can you please check with 
tez.task.scale.memory.ratios="PARTITIONED_UNSORTED_OUTPUT:12,UNSORTED_INPUT:1,UNSORTED_OUTPUT:1,SORTED_OUTPUT:12,SORTED_MERGED_INPUT:12,PROCESSOR:1,OTHER:4"
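
To make the weights in that setting easier to read, here is a small Python sketch that parses the ratio string and computes each component's fraction of the total weight. It shows only the arithmetic of a plain proportional split, which is an assumption: the thread does not say how Tez applies these weights, and the real distribution may depend on which components actually request memory in a given task. The helper names `parse_ratios` and `proportional_shares` are hypothetical.

```python
def parse_ratios(spec):
    """Parse 'NAME:weight,NAME:weight,...' into a dict of integer weights."""
    ratios = {}
    for entry in spec.split(","):
        name, weight = entry.split(":")
        ratios[name.strip()] = int(weight)
    return ratios


def proportional_shares(ratios):
    """Return each component's fraction of the total weight (simplified model)."""
    total = sum(ratios.values())
    return {name: weight / total for name, weight in ratios.items()}


spec = ("PARTITIONED_UNSORTED_OUTPUT:12,UNSORTED_INPUT:1,UNSORTED_OUTPUT:1,"
        "SORTED_OUTPUT:12,SORTED_MERGED_INPUT:12,PROCESSOR:1,OTHER:4")
ratios = parse_ratios(spec)
shares = proportional_shares(ratios)
```

Under this simplified model the weights sum to 43, so PARTITIONED_UNSORTED_OUTPUT would get 12/43 of the scaled memory, the same fraction as SORTED_OUTPUT.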

> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep writing directly to the final file, which would save a 
> lot of time?





[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012291#comment-15012291
 ] 

Gopal V commented on TEZ-2950:
--

[~rohini]: this is already implemented for UnorderedPartitioned, right?

set tez.runtime.enable.final-merge.in.output = false;



[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012390#comment-15012390
 ] 

Rohini Palaniswamy commented on TEZ-2950:
-

I meant that it is not possible to ask the user to set that and run; that 
would need a new release of Pig.



[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012435#comment-15012435
 ] 

Siddharth Seth commented on TEZ-2950:
-

bq. The downstream only starts receiving events if the source task completes 
successfully - this was done to allow for speculative execution.
Node failures. Destination receives all events, but processes only some of them 
before the source node dies. Smaller chance of hitting this compared to 
pipelined shuffle but it's still possible.

> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep directly writing to the final file which will save a lot 
> of time.





[jira] [Updated] (TEZ-2480) Exception when closing output (ignored)

2015-11-18 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2480:
-
Summary: Exception when closing output (ignored)  (was: TEZ-2003: exception 
when closing output (ignored))

> Exception when closing output (ignored)
> ---
>
> Key: TEZ-2480
> URL: https://issues.apache.org/jira/browse/TEZ-2480
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: TEZ-2003
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
> Attachments: TEZ-2480.1.txt
>
>
> Happens a lot in some queries:
> {noformat}
> sershe_20150522112029_d0863b33-8d2f-4b4c-b013-9ef70a2bc586:1_Map 1_8_0)] WARN 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when 
> closing output Reducer 2(cleanup). Exception 
> class=java.lang.NullPointerException, message=null
> java.lang.NullPointerException
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:618)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:81)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:613)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:831)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:608)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1425)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:198)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:768)
> at 
> org.apache.tez.runtime.library.common.sort.impl.TezSpillRecord.<init>(TezSpillRecord.java:64)
> at 
> org.apache.tez.runtime.library.common.sort.impl.TezSpillRecord.<init>(TezSpillRecord.java:56)
> at 
> org.apache.tez.runtime.library.common.sort.impl.TezSpillRecord.<init>(TezSpillRecord.java:51)
> at 
> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.generateEvents(OrderedPartitionedKVOutput.java:209)
> at 
> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.close(OrderedPartitionedKVOutput.java:186)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.cleanup(LogicalIOProcessorRuntimeTask.java:849)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:104)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Can this be fixed or not logged?





[jira] [Commented] (TEZ-2480) Exception when closing output (ignored)

2015-11-18 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012284#comment-15012284
 ] 

Hitesh Shah commented on TEZ-2480:
--

Committing shortly. 

> Exception when closing output (ignored)
> ---
>
> Key: TEZ-2480
> URL: https://issues.apache.org/jira/browse/TEZ-2480
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: TEZ-2003
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
> Attachments: TEZ-2480.1.txt
>
>





[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012429#comment-15012429
 ] 

Gopal V commented on TEZ-2950:
--

bq, enable final merge in output = false doesn't necessarily solve this. That 
has the same issues of partial failures which exists with pipelined shuffle. 
The fetcher can start serving out chunks of the data and then have the source 
fail, which will cause the task fetching the data to fail (chunks for the same 
input from different attempts of the source).

The downstream only starts receiving events if the source task completes 
successfully - this was done to allow for speculative execution.

> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep directly writing to the final file which will save a lot 
> of time.





[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012371#comment-15012371
 ] 

Gopal V commented on TEZ-2950:
--

That is a per-edge configuration.

> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep directly writing to the final file which will save a lot 
> of time.





[jira] [Commented] (TEZ-2480) TEZ-2003: exception when closing output (ignored)

2015-11-18 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012273#comment-15012273
 ] 

Rajesh Balamohan commented on TEZ-2480:
---

lgtm. +1


> TEZ-2003: exception when closing output (ignored)
> -
>
> Key: TEZ-2480
> URL: https://issues.apache.org/jira/browse/TEZ-2480
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: TEZ-2003
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
> Attachments: TEZ-2480.1.txt
>
>





[jira] [Created] (TEZ-2951) Progressive allocation of buffers for Unordered Partitioned Output

2015-11-18 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-2951:
---

 Summary: Progressive allocation of buffers for Unordered 
Partitioned Output
 Key: TEZ-2951
 URL: https://issues.apache.org/jira/browse/TEZ-2951
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth


Similar to TEZ-2244. In the case of UnorderedPartitionOutput - the default is 
to use 2 buffers. This can be changed to a higher value when using pipelined 
shuffle - and have memory allocated only when required.
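
A minimal sketch of what "allocated only when required" could look like, 
assuming a simple pool with an upper bound on the number of buffers. This is 
not the TEZ-2951 patch; the class ProgressiveBufferPool and its methods are 
hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical pool: buffers are created on demand, up to a fixed cap. */
class ProgressiveBufferPool {
    private final int maxBuffers;   // upper bound, e.g. higher than the default of 2
    private final int bufferSize;   // bytes per buffer
    private final Deque<ByteBuffer> free = new ArrayDeque<>();
    private int allocated = 0;      // buffers actually created so far

    ProgressiveBufferPool(int maxBuffers, int bufferSize) {
        this.maxBuffers = maxBuffers;
        this.bufferSize = bufferSize;
    }

    /**
     * Returns a free buffer if one exists, otherwise allocates a new one
     * while under the cap. Returns null when the cap is reached and all
     * buffers are in use (a real writer would block or spill here).
     */
    synchronized ByteBuffer acquire() {
        ByteBuffer b = free.poll();
        if (b == null && allocated < maxBuffers) {
            b = ByteBuffer.allocate(bufferSize);
            allocated++;
        }
        return b;
    }

    /** Hands a buffer back so a later acquire() can reuse it. */
    synchronized void release(ByteBuffer b) {
        b.clear();
        free.push(b);
    }

    synchronized int allocatedCount() {
        return allocated;
    }

    public static void main(String[] args) {
        ProgressiveBufferPool pool = new ProgressiveBufferPool(4, 1024);
        ByteBuffer first = pool.acquire();
        pool.release(first);
        pool.acquire(); // reuses the released buffer, no new allocation
        System.out.println("allocated so far: " + pool.allocatedCount());
    }
}
```

With a cap of 4, a task that never has more than one buffer in flight only 
ever pays for one buffer, which is the point of allocating progressively.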





[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012279#comment-15012279
 ] 

Rohini Palaniswamy commented on TEZ-2950:
-

Copying response from [~sseth] on d...@tez.apache.org below.

In case of UnorderedKVWriter (non-partitioned), a single file is used - to
which new entries are appended.

For the partitioned case - using a single file is not as straightforward,
since the number of elements and size of each partition is not known up
front. Generating a single file per partition can cause an explosion in the
number of files, as well as the number of streams open in parallel (OOM).
The current partitioned writer writes data into the in-memory buffer and
then spills this into files with individual partitions consolidated
together.
Without pipelined shuffle - a single file needs to be generated for a
single task, which is where the merge step comes in - in case the buffer is
large. With pipelined shuffle - there's almost no extra cost, since there's
no final merge - and each element is written out exactly once.

That said, optimizations are possible depending upon the use case, e.g. for
a small number of partitions it's reasonable to write out a file per
partition. However, the ShuffleHandler and shuffle code will need to change
to handle this.

Pipelined Shuffle/avoiding a final merge has some limitations in case of
failures and partial chunks being transferred over. It should be possible
to work around these by modifying the receiving side to process each input
only when all data for that source has been received.

Either way, non-trivial changes are required to make this more efficient.

In this particular case, how many partitions were generated, and what was
the size of the unordered output buffer? Increasing the buffer size for
this particular job can help mitigate the problem - maybe not with 8500
spills though.
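
The spill-then-merge flow described above can be sketched as a toy model. 
This is only an illustration under simplifying assumptions (records are 
strings, "files" are in-memory lists, the buffer limit is a record count); 
SpillMergeSketch and its members are hypothetical names and not the 
UnorderedPartitionedKVWriter code.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a partitioned, unordered writer with spills and a final merge. */
class SpillMergeSketch {
    final int numPartitions;
    final int bufferCapacity; // records per in-memory buffer (hypothetical unit)
    // spill -> partition -> records; each spill keeps a partition's records together
    final List<List<List<String>>> spills = new ArrayList<>();
    long recordsWritten = 0;  // counts every record written to a "file"
    private List<List<String>> buffer;
    private int buffered = 0;

    SpillMergeSketch(int numPartitions, int bufferCapacity) {
        this.numPartitions = numPartitions;
        this.bufferCapacity = bufferCapacity;
        this.buffer = newBuffer();
    }

    private List<List<String>> newBuffer() {
        List<List<String>> b = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            b.add(new ArrayList<>());
        }
        return b;
    }

    /** Buffers a record; spills the whole buffer once it is full. */
    void write(int partition, String record) {
        buffer.get(partition).add(record);
        if (++buffered == bufferCapacity) {
            spill();
        }
    }

    /** Writes the current buffer out as one spill containing all partitions. */
    void spill() {
        if (buffered == 0) {
            return;
        }
        spills.add(buffer);
        recordsWritten += buffered;
        buffer = newBuffer();
        buffered = 0;
    }

    /** Final merge: for each partition, re-read its segment from every spill. */
    List<List<String>> mergeAll() {
        spill();
        List<List<String>> merged = newBuffer();
        for (int p = 0; p < numPartitions; p++) {
            for (List<List<String>> s : spills) {
                merged.get(p).addAll(s.get(p));
                recordsWritten += s.get(p).size();
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        SpillMergeSketch w = new SpillMergeSketch(3, 4);
        for (int i = 0; i < 10; i++) {
            w.write(i % 3, "r" + i);
        }
        List<List<String>> out = w.mergeAll();
        System.out.println(w.spills.size() + " spills, " + w.recordsWritten + " record writes");
        System.out.println(out);
    }
}
```

In this model every record is written twice (once in a spill, once in the 
merged output), and the merge touches numPartitions x numSpills segments. 
Skipping mergeAll, as with pipelined shuffle, leaves each record written 
exactly once.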



> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep directly writing to the final file which will save a lot 
> of time.





[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012280#comment-15012280
 ] 

Rohini Palaniswamy commented on TEZ-2950:
-

The approach is proving very bad for performance, as decompressing and merging 
a large number of spills takes a long time with 999 partitions. We have a join 
followed by a UNION. The join vertex uses UnorderedKVWriter for output, as no 
sort is required for the input to the union. Merging 8436 spills takes 
30 minutes.

2015-11-18 21:01:25,904 [INFO] [main] |resources.MemoryDistributor|: InitialMemoryDistributor (isEnabled=true) invoked with: numInputs=2, numOutputs=1, JVM.maxFree=3102212096, allocatorClassName=org.apache.tez.runtime.library.resources.WeightedScalingMemoryDistributor
2015-11-18 21:01:26,295 [INFO] [TezChild] |resources.MemoryDistributor|: InitialRequests=[scope-6987:OUTPUT:104857600:org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput], [scope-6974:INPUT:1861327360:org.apache.tez.runtime.library.input.OrderedGroupedKVInput], [scope-6957:INPUT:1861327360:org.apache.tez.runtime.library.input.OrderedGroupedKVInput]
2015-11-18 21:01:26,303 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
2015-11-18 21:01:26,307 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.5, AdditionalReservationFractionForIOs=0.045, finalReserveFractionUsed=0.545
2015-11-18 21:01:26,308 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 3, numScaledRequests: 25, TotalRequested: 3827512320, TotalRequestedScaled: 1.7910685696E9, TotalJVMHeap: 3102212096, TotalAvailable: 1411506503, TotalRequested/TotalJVMHeap:1.23
2015-11-18 21:01:26,308 [INFO] [TezChild] |resources.MemoryDistributor|: Allocations=[scope-6987:org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput:OUTPUT:104857600:3305449], [scope-6974:org.apache.tez.runtime.library.input.OrderedGroupedKVInput:INPUT:1861327360:704100526], [scope-6957:org.apache.tez.runtime.library.input.OrderedGroupedKVInput:INPUT:1861327360:704100526]
2015-11-18 21:02:49,010 [INFO] [TezChild] |writers.UnorderedPartitionedKVWriter|: scope_6987: numBuffers=2, sizePerBuffer=1652724, skipBuffers=false, pipelinedShuffle=false, numPartitions=999
..
2015-11-18 21:21:03,353 [INFO] [UnorderedOutSpiller {scope_6987}] |writers.UnorderedPartitionedKVWriter|: scope_6987: Finished spill 8436
2015-11-18 21:21:04,236 [INFO] [TezChild] |task.TezTaskRunner|: Closing task, taskAttemptId=attempt_1444575566264_610936_1_28_000475_0
..
2015-11-18 21:21:04,238 [INFO] [TezChild] |writers.UnorderedPartitionedKVWriter|: scope_6987: Waiting for all spills to complete : Pending : 0
2015-11-18 21:21:04,238 [INFO] [TezChild] |writers.UnorderedPartitionedKVWriter|: scope_6987: All spills complete
2015-11-18 21:54:44,047 [INFO] [TezChild] |writers.UnorderedPartitionedKVWriter|: scope_6987: Finished final spill after merging : 8438 spills


> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep directly writing to the final file which will save a lot 
> of time.





[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012338#comment-15012338
 ] 

Rohini Palaniswamy commented on TEZ-2950:
-

bq. set tez.runtime.enable.final-merge.in.output = false;
   I doubt we can do that, as it will affect other parts of the DAG which have 
OrderedPartitioned outputs.

> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep directly writing to the final file which will save a lot 
> of time.





[jira] [Comment Edited] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012429#comment-15012429
 ] 

Gopal V edited comment on TEZ-2950 at 11/19/15 12:09 AM:
-

bq. enable final merge in output = false doesn't necessarily solve this. That 
has the same issues of partial failures which exists with pipelined shuffle. 
The fetcher can start serving out chunks of the data and then have the source 
fail, which will cause the task fetching the data to fail (chunks for the same 
input from different attempts of the source).

The downstream only starts receiving events if the source task completes 
successfully - this was done to allow for speculative execution.


was (Author: gopalv):
bq, enable final merge in output = false doesn't necessarily solve this. That 
has the same issues of partial failures which exists with pipelined shuffle. 
The fetcher can start serving out chunks of the data and then have the source 
fail, which will cause the task fetching the data to fail (chunks for the same 
input from different attempts of the source).

The downstream only starts receiving events if the source task completes 
successfully - this was done to allow for speculative execution.

> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep directly writing to the final file which will save a lot 
> of time.





[jira] [Updated] (TEZ-2480) Exception when closing output is ignored

2015-11-18 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2480:
-
Summary: Exception when closing output is ignored  (was: Exception when 
closing output (ignored))

> Exception when closing output is ignored
> 
>
> Key: TEZ-2480
> URL: https://issues.apache.org/jira/browse/TEZ-2480
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: TEZ-2003
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
> Attachments: TEZ-2480.1.txt
>
>





[jira] [Commented] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012424#comment-15012424
 ] 

Siddharth Seth commented on TEZ-2950:
-

[~gopalv] - enable final merge in output = false doesn't necessarily solve 
this. That has the same issues of partial failures which exists with pipelined 
shuffle. The fetcher can start serving out chunks of the data and then have the 
source fail, which will cause the task fetching the data to fail (chunks for 
the same input from different attempts of the source).

[~rohini] - the total size of the unordered buffer is getting scaled down to 
~3MB from the initially requested 100MB.
The job has an initial configuration of io.sort.mb =~ 1800MB and 
unordered.buffer.size=100MB. With two ordered inputs (OrderedGroupedKVInput in 
the log) also requesting memory, the unordered output gets scaled down.
For this particular job, there are two options:
1) Increase the size of the unordered buffer (1800 / 100 seems skewed anyway).
2) Change the scaling ratios. Currently they are PARTITIONED_UNSORTED_OUTPUT:1 
and SORTED_OUTPUT:12; the PARTITIONED_UNSORTED_OUTPUT ratio can be increased to 
prevent it from being scaled down so much.
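
The scale-down to ~3MB can be reproduced from the numbers in the 
MemoryDistributor log lines posted earlier in this thread. What follows is a 
rough reconstruction of the arithmetic, not the actual 
WeightedScalingMemoryDistributor code; the class ScalingSketch and its method 
are made-up names. It assumes each request gets a share of the available 
memory proportional to (requested size x scale ratio), and that the two 
OrderedGroupedKVInput requests carry the SORTED_MERGED_INPUT ratio of 12, 
which is consistent with numScaledRequests: 25 in the log.

```java
/** Reconstruction of the weighted scaling arithmetic from the logged values. */
class ScalingSketch {
    // Values copied from the log: TotalJVMHeap and finalReserveFractionUsed.
    static final long JVM_HEAP = 3102212096L;
    static final double RESERVE_FRACTION = 0.545;

    /** Splits the unreserved heap in proportion to requested[i] * weights[i]. */
    static long[] allocate(long[] requested, int[] weights) {
        // Memory left after the reserved fraction (TotalAvailable in the log).
        long available = (long) (JVM_HEAP * (1.0 - RESERVE_FRACTION));
        double totalWeighted = 0;
        for (int i = 0; i < requested.length; i++) {
            totalWeighted += (double) requested[i] * weights[i];
        }
        long[] result = new long[requested.length];
        for (int i = 0; i < requested.length; i++) {
            double share = ((double) requested[i] * weights[i]) / totalWeighted;
            result[i] = (long) (available * share);
        }
        return result;
    }

    public static void main(String[] args) {
        // Requests from the log: one unordered output, two ordered inputs.
        long[] requested = {104857600L, 1861327360L, 1861327360L};
        // Assumed ratios: PARTITIONED_UNSORTED_OUTPUT:1, SORTED_MERGED_INPUT:12.
        int[] weights = {1, 12, 12};
        long[] alloc = allocate(requested, weights);
        System.out.println("unordered output: " + alloc[0] + " bytes");
        System.out.println("per buffer (2 buffers): " + (alloc[0] / 2) + " bytes");
        System.out.println("each ordered input: " + alloc[1] + " bytes");
    }
}
```

Under these assumptions the output comes to 3305449 bytes, i.e. 1652724 bytes 
for each of the 2 buffers, and 704100526 bytes per input, matching the 
Allocations and sizePerBuffer values in the log. Raising the unordered buffer 
request or the PARTITIONED_UNSORTED_OUTPUT ratio increases the first term's 
share of the same available memory.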

> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep directly writing to the final file which will save a lot 
> of time.





[jira] [Assigned] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles reassigned TEZ-2950:


Assignee: Jonathan Eagles

> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Jonathan Eagles
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep directly writing to the final file which will save a lot 
> of time.





[jira] [Updated] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

2015-11-18 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-2950:
-
Assignee: (was: Jonathan Eagles)

> Poor performance of UnorderedPartitionedKVWriter
> 
>
> Key: TEZ-2950
> URL: https://issues.apache.org/jira/browse/TEZ-2950
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>
> Came across a job which was taking a long time in 
> UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data 
> from spill files (8500 spills) and then writing the final compressed merge 
> file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not 
> just buffer and keep directly writing to the final file which will save a lot 
> of time.





[jira] [Commented] (TEZ-2948) Stop using dagName in the dagComplete notification to TaskCommunicators

2015-11-18 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15011977#comment-15011977
 ] 

Siddharth Seth commented on TEZ-2948:
-

The test failure is unrelated. Committing. Thanks for the review.

> Stop using dagName in the dagComplete notification to TaskCommunicators
> ---
>
> Key: TEZ-2948
> URL: https://issues.apache.org/jira/browse/TEZ-2948
> Project: Apache Tez
>  Issue Type: Task
>Affects Versions: 0.8.0-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2948.1.txt
>
>



