Failed: TEZ-2289 PreCommit Build #439

2015-04-10 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2289
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/439/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2009 lines...]



{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12723923/TEZ-2289.1.patch
  against master revision f382828.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/439//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/439//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
bb60f871df5b55880383c0d54198d5c8ec952989 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #435
Archived 44 artifacts
Archive block size is 32768
Received 4 blocks and 2572688 bytes
Compression is 4.8%
Took 1.3 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2289) ATSHistoryLoggingService can generate ArrayOutOfBoundsException

2015-04-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490262#comment-14490262
 ] 

TezQA commented on TEZ-2289:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12723923/TEZ-2289.1.patch
  against master revision f382828.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/439//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/439//console

This message is automatically generated.

> ATSHistoryLoggingService can generate ArrayOutOfBoundsException
> ---
>
> Key: TEZ-2289
> URL: https://issues.apache.org/jira/browse/TEZ-2289
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Chang Li
> Attachments: TEZ-2289.1.patch, TEZ-2289.patch
>
>
> 2015-04-07 23:11:20,459 INFO [main] app.DAGAppMaster: Running DAG: MRRSleepJob
> 2015-04-07 23:11:20,546 INFO [IPC Server handler 0 on 50500] ipc.Server: IPC 
> Server handler 0 on 50500, call 
> org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPB.getDAGStatus 
> from 127.0.0.1:53151 Call#93 Retry#0
> org.apache.tez.dag.api.TezException: No running dag at present
> at 
> org.apache.tez.dag.api.client.DAGClientHandler.getDAG(DAGClientHandler.java:84)
> at 
> org.apache.tez.dag.api.client.DAGClientHandler.getACLManager(DAGClientHandler.java:151)
> at 
> org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.getDAGStatus(DAGClientAMProtocolBlockingPBServerImpl.java:94)
> at 
> org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7375)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> 2015-04-07 23:11:20,875 INFO [main] history.HistoryEventHandler: 
> [HISTORY][DAG:dag_1427297554817_0149_1][Event:DAG_SUBMITTED]: 
> dagID=dag_1427297554817_0149_1, submitTime=1428448280397
> 2015-04-07 23:11:20,905 WARN [HistoryEventHandlingThread] 
> ats.ATSHistoryLoggingService: Could not post history event to ATS, 
> atsPutError=6, entityId=dag_1427297554817_0149_1, eventType=DAG_SUBMITTED
> 2015-04-07 23:11:20,906 WARN [HistoryEventHandlingThread] 
> ats.ATSHistoryLoggingService: Could not handle history events
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.handleEvents(ATSHistoryLoggingService.java:312)
> at 
> org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.access$700(ATSHistoryLoggingService.java:50)
> at 
> org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService$1.run(ATSHistoryLoggingService.java:159)
> at java.lang.Thread.run(Thread.java:722)
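
For illustration only: the trace above ends in
java.lang.ArrayIndexOutOfBoundsException: 1 inside handleEvents, right after a
put error was logged for a single DAG_SUBMITTED event. The code at
ATSHistoryLoggingService.java:312 is not shown in this thread, so the following
is a minimal sketch under the assumption that the error report looks up events
by position. PutErrorReport, describe, and both parameters are hypothetical
names, not Tez or YARN APIs. The sketch shows a bounds-checked lookup that
falls back to a placeholder instead of throwing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names are Tez or YARN APIs.
final class PutErrorReport {
    private PutErrorReport() {}

    /**
     * Builds one warning line per reported error code. The error list may be
     * longer than the event batch, so the event lookup is bounds-checked.
     */
    static List<String> describe(String[] eventTypes, int[] errorCodes) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < errorCodes.length; i++) {
            // Assumption: errors and events are matched by position.
            String eventType = i < eventTypes.length ? eventTypes[i] : "UNKNOWN";
            lines.add("atsPutError=" + errorCodes[i] + ", eventType=" + eventType);
        }
        return lines;
    }
}
```

With one event and two reported errors, the second line falls back to UNKNOWN
rather than failing at index 1.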



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490282#comment-14490282
 ] 

Bikas Saha commented on TEZ-2234:
-

Will add annotations.
getDataSize() is the logical data size as written by the user. The closest 
thing to that is OUTPUT_BYTES. The difference between them for many jobs is 
large enough that perhaps we should look at reducing the overhead.
Yes, plugins are not getting task-level info for now; it is not needed for 
PIG-4434. The docs specify that the values are point-in-time and may change 
with progress/failures/refreshes.
This cannot get rid of VM events, because there is no way to correlate tasks 
with output size, so extrapolating the current output size to the final output 
size by the ratio of total tasks to completed tasks does not work. The VM 
events are therefore still needed until (if ever) we start exposing task-level 
sizes.

Thanks for the reviews!
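
For readers following the extrapolation argument above, here is a minimal
sketch of the linear scaling being discussed. OutputSizeEstimate and
extrapolate are hypothetical names and are not part of the Tez API. The
estimate is only meaningful when currentBytes is exactly the output of the
completed tasks, which is the correlation the comment says the vertex-level
statistic cannot provide.

```java
// Hypothetical helper; not part of the Tez API.
final class OutputSizeEstimate {
    private OutputSizeEstimate() {}

    /**
     * Scales the bytes written so far by totalTasks / completedTasks.
     * Assumes currentBytes comes only from the completed tasks.
     */
    static long extrapolate(long currentBytes, int completedTasks, int totalTasks) {
        if (completedTasks <= 0 || totalTasks < completedTasks) {
            throw new IllegalArgumentException("need 0 < completedTasks <= totalTasks");
        }
        return Math.round((double) currentBytes * totalTasks / completedTasks);
    }
}
```

If the current size also includes partial output from running or failed
attempts, the same formula overestimates the final size, which is why
task-level sizes would be needed to make it reliable.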

> Allow vertex managers to get output size per source vertex
> --
>
> Key: TEZ-2234
> URL: https://issues.apache.org/jira/browse/TEZ-2234
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2234.1.patch, TEZ-2234.2.patch, TEZ-2234.3.patch
>
>
> Vertex managers may need per source vertex output stats to make 
> reconfiguration decisions.





[jira] [Updated] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-10 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2234:

Attachment: TEZ-2234.4.patch

Patch with annotations.

> Allow vertex managers to get output size per source vertex
> --
>
> Key: TEZ-2234
> URL: https://issues.apache.org/jira/browse/TEZ-2234
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2234.1.patch, TEZ-2234.2.patch, TEZ-2234.3.patch, 
> TEZ-2234.4.patch
>
>
> Vertex managers may need per source vertex output stats to make 
> reconfiguration decisions.





[jira] [Updated] (TEZ-2275) TEZ UI: Make data loading faster and caching better

2015-04-10 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2275:

Description: 
Loading:
# Remove counter serialization for all entities to make loading faster.
# Ensure that counter tables works with unserialized data.
# Ensure display of counter data in entity tables.

Caching:
# Separate cache for each table: Refreshing vertex->tasks shouldn't partially 
update parts values displayed in All Tasks table.
# Refreshing an entity, shouldn't change the respective record in any tables.
# Ensure all data displayed in a table are from a single time frame. (Now part 
of the table should be partially updated by actions elsewhere)
# Values cached at a time for a specific entity type should belong to a single 
DAG. (To limit the values stores in the browser side)

  was:
# Remove counter serialization for all entities to make loading faster.
# Ensure that the counter tables would with unserialized data.

# Refreshing vertex->tasks shouldn't update cached values displayed in the All 
Tasks table.
# As records are shared, refreshing/reloading a record will reflect the changes 
everywhere. Hence refreshing task details will update the respective task row 
in dag/vertex -> tasks tables. 
# Values cached at a time for a specific entity type should belong to a single 
DAG. (To limit the values stores in the browser side)


> TEZ UI: Make data loading faster and caching better
> ---
>
> Key: TEZ-2275
> URL: https://issues.apache.org/jira/browse/TEZ-2275
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2275.1.patch
>
>
> Loading:
> # Remove counter serialization for all entities to make loading faster.
> # Ensure that counter tables works with unserialized data.
> # Ensure display of counter data in entity tables.
> Caching:
> # Separate cache for each table: Refreshing vertex->tasks shouldn't partially 
> update parts values displayed in All Tasks table.
> # Refreshing an entity, shouldn't change the respective record in any tables.
> # Ensure all data displayed in a table are from a single time frame. (Now 
> part of the table should be partially updated by actions elsewhere)
> # Values cached at a time for a specific entity type should belong to a 
> single DAG. (To limit the values stores in the browser side)





Failed: TEZ-2275 PreCommit Build #440

2015-04-10 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2275
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/440/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 1243 lines...]

  Running tests 
  /home/jenkins/tools/maven/latest/bin/mvn clean install -fn -DTezPatchProcess
cat: 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build@2/../patchprocess/testrun.txt:
 No such file or directory
awk: cannot open 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build@2/../patchprocess/testrun.txt
 (No such file or directory)




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12724634/TEZ-2275.1.patch
  against master revision f382828.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/440//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/440//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
bd7a1d266253759a341b0d18effefaa8efad6111 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (TEZ-2275) TEZ UI: Make data loading faster and caching better

2015-04-10 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2275:

Description: 
Loading:
# Remove counter serialization for all entities to make loading faster.
# Ensure that counter tables works with unserialized data.
# Ensure display of counter data in entity tables.

Caching:
# Separate cache for each table: Refreshing vertex->tasks shouldn't partially 
update values displayed in All Tasks table.
# Refreshing an entity, shouldn't change the respective record in any tables.
# Ensure all data displayed in a table are from a single time frame. (Now part 
of the table should be partially updated by actions elsewhere)
# Values cached at a time for a specific entity type should belong to a single 
DAG. (To limit the values stores in the browser side)

  was:
Loading:
# Remove counter serialization for all entities to make loading faster.
# Ensure that counter tables works with unserialized data.
# Ensure display of counter data in entity tables.

Caching:
# Separate cache for each table: Refreshing vertex->tasks shouldn't partially 
update parts values displayed in All Tasks table.
# Refreshing an entity, shouldn't change the respective record in any tables.
# Ensure all data displayed in a table are from a single time frame. (Now part 
of the table should be partially updated by actions elsewhere)
# Values cached at a time for a specific entity type should belong to a single 
DAG. (To limit the values stores in the browser side)


> TEZ UI: Make data loading faster and caching better
> ---
>
> Key: TEZ-2275
> URL: https://issues.apache.org/jira/browse/TEZ-2275
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2275.1.patch
>
>
> Loading:
> # Remove counter serialization for all entities to make loading faster.
> # Ensure that counter tables works with unserialized data.
> # Ensure display of counter data in entity tables.
> Caching:
> # Separate cache for each table: Refreshing vertex->tasks shouldn't partially 
> update values displayed in All Tasks table.
> # Refreshing an entity, shouldn't change the respective record in any tables.
> # Ensure all data displayed in a table are from a single time frame. (Now 
> part of the table should be partially updated by actions elsewhere)
> # Values cached at a time for a specific entity type should belong to a 
> single DAG. (To limit the values stores in the browser side)





[jira] [Commented] (TEZ-2275) TEZ UI: Make data loading faster and caching better

2015-04-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490296#comment-14490296
 ] 

TezQA commented on TEZ-2275:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12724634/TEZ-2275.1.patch
  against master revision f382828.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/440//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/440//console

This message is automatically generated.

> TEZ UI: Make data loading faster and caching better
> ---
>
> Key: TEZ-2275
> URL: https://issues.apache.org/jira/browse/TEZ-2275
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2275.1.patch
>
>
> Loading:
> # Remove counter serialization for all entities to make loading faster.
> # Ensure that counter tables works with unserialized data.
> # Ensure display of counter data in entity tables.
> Caching:
> # Separate cache for each table: Refreshing vertex->tasks shouldn't partially 
> update values displayed in All Tasks table.
> # Refreshing an entity, shouldn't change the respective record in any tables.
> # Ensure all data displayed in a table are from a single time frame. (Now 
> part of the table should be partially updated by actions elsewhere)
> # Values cached at a time for a specific entity type should belong to a 
> single DAG. (To limit the values stores in the browser side)





[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

2015-04-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490297#comment-14490297
 ] 

Bikas Saha commented on TEZ-714:


It would be good if v1 and v2 also had some outputs of their own. Then we would 
cover the case of vertex success committing its own output, followed by DAG 
success committing the shared outputs for the same vertex.

> OutputCommitters should not run in the main AM dispatcher thread
> 
>
> Key: TEZ-714
> URL: https://issues.apache.org/jira/browse/TEZ-714
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-10.patch, 
> TEZ-714-11.patch, TEZ-714-12.patch, TEZ-714-2.patch, TEZ-714-3.patch, 
> TEZ-714-4.patch, TEZ-714-5.patch, TEZ-714-6.patch, TEZ-714-7.patch, 
> TEZ-714-8.patch, TEZ-714-9.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there's multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.
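
As an illustration of points 1) and 2) above, the following is a minimal
sketch of running several commit actions in parallel on a worker pool instead
of inline on the calling thread. It is not the TEZ-714 patch. ParallelCommit,
commitAll, and the use of plain Runnable in place of an OutputCommitter are
assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical sketch; Runnable stands in for a committer's commit call.
final class ParallelCommit {
    private ParallelCommit() {}

    /**
     * Submits every commit action to the pool and returns immediately.
     * The future completes with the names of the actions that failed.
     */
    static CompletableFuture<List<String>> commitAll(
            Map<String, Runnable> committers, Executor pool) {
        List<CompletableFuture<String>> results = new ArrayList<>();
        for (Map.Entry<String, Runnable> entry : committers.entrySet()) {
            String name = entry.getKey();
            // handle() maps success to null and failure to the committer name.
            results.add(CompletableFuture.runAsync(entry.getValue(), pool)
                    .<String>handle((ignored, error) -> error == null ? null : name));
        }
        return CompletableFuture.allOf(results.toArray(new CompletableFuture[0]))
                .thenApply(done -> {
                    List<String> failed = new ArrayList<>();
                    for (CompletableFuture<String> result : results) {
                        String failedName = result.join(); // already complete
                        if (failedName != null) {
                            failed.add(failedName);
                        }
                    }
                    return failed;
                });
    }
}
```

The caller is not blocked while commits run. In a dispatcher-style design the
returned future would typically be turned into an event posted back to the
dispatcher; that wiring is left out here.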





[jira] [Updated] (TEZ-2275) TEZ UI: Make data loading faster and caching better

2015-04-10 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2275:

Description: 
Loading:
# Remove counter serialization for all entities to make loading faster.
# Ensure that counter tables work with unserialized data.
# Ensure display of counter data in entity tables.

Caching:
# Separate cache for each table: Refreshing vertex->tasks shouldn't partially 
update values displayed in All Tasks table.
# Refreshing an entity shouldn't change the respective record in any tables.
# Ensure all data displayed in a table are from a single time frame. (No part 
of the table should be partially updated by actions elsewhere)
# Data cached would be limited to one DAG at a time. (To limit the values 
stored on the browser side)

  was:
Loading:
# Remove counter serialization for all entities to make loading faster.
# Ensure that counter tables works with unserialized data.
# Ensure display of counter data in entity tables.

Caching:
# Separate cache for each table: Refreshing vertex->tasks shouldn't partially 
update values displayed in All Tasks table.
# Refreshing an entity, shouldn't change the respective record in any tables.
# Ensure all data displayed in a table are from a single time frame. (Now part 
of the table should be partially updated by actions elsewhere)
# Values cached at a time for a specific entity type should belong to a single 
DAG. (To limit the values stores in the browser side)


> TEZ UI: Make data loading faster and caching better
> ---
>
> Key: TEZ-2275
> URL: https://issues.apache.org/jira/browse/TEZ-2275
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2275.1.patch
>
>
> Loading:
> # Remove counter serialization for all entities to make loading faster.
> # Ensure that counter tables work with unserialized data.
> # Ensure display of counter data in entity tables.
> Caching:
> # Separate cache for each table: Refreshing vertex->tasks shouldn't partially 
> update values displayed in All Tasks table.
> # Refreshing an entity shouldn't change the respective record in any tables.
> # Ensure all data displayed in a table are from a single time frame. (No part 
> of the table should be partially updated by actions elsewhere)
> # Data cached would be limited to one DAG at a time. (To limit the values 
> stored on the browser side)





[jira] [Updated] (TEZ-2259) Push additional data to Timeline for Recovery for better consumption in UI

2015-04-10 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2259:
-
Attachment: TEZ-2259.2.patch

> Push additional data to Timeline for Recovery for better consumption in UI
> --
>
> Key: TEZ-2259
> URL: https://issues.apache.org/jira/browse/TEZ-2259
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: TEZ-2259.1.patch, TEZ-2259.2.patch
>
>
> Some things I can think of: 
>  
>- applicationAttemptId in which the dag was submitted
>- appAttemptId in which the dag was completed 
> Above provides implicit information on how many app attempts the dag spanned 
> ( and therefore recovered how many times ).
>   
>- Maybe an implicit event mentioning that the DAG was recovered and in 
> which attempt it was recovered. Possibly add information on what state was 
> recovered?





[jira] [Updated] (TEZ-2259) Push additional data to Timeline for Recovery for better consumption in UI

2015-04-10 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2259:
-
Attachment: TEZ-2259.3.patch

Comments addressed.

> Push additional data to Timeline for Recovery for better consumption in UI
> --
>
> Key: TEZ-2259
> URL: https://issues.apache.org/jira/browse/TEZ-2259
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: TEZ-2259.1.patch, TEZ-2259.2.patch, TEZ-2259.3.patch
>
>
> Some things I can think of: 
>  
>- applicationAttemptId in which the dag was submitted
>- appAttemptId in which the dag was completed 
> Above provides implicit information on how many app attempts the dag spanned 
> ( and therefore recovered how many times ).
>   
>- Maybe an implicit event mentioning that the DAG was recovered and in 
> which attempt it was recovered. Possibly add information on what state was 
> recovered?





[jira] [Commented] (TEZ-2289) ATSHistoryLoggingService can generate ArrayOutOfBoundsException

2015-04-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490315#comment-14490315
 ] 

Hitesh Shah commented on TEZ-2289:
--

Committing shortly. 

> ATSHistoryLoggingService can generate ArrayOutOfBoundsException
> ---
>
> Key: TEZ-2289
> URL: https://issues.apache.org/jira/browse/TEZ-2289
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Chang Li
> Attachments: TEZ-2289.1.patch, TEZ-2289.patch
>
>
> 2015-04-07 23:11:20,459 INFO [main] app.DAGAppMaster: Running DAG: MRRSleepJob
> 2015-04-07 23:11:20,546 INFO [IPC Server handler 0 on 50500] ipc.Server: IPC 
> Server handler 0 on 50500, call 
> org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPB.getDAGStatus 
> from 127.0.0.1:53151 Call#93 Retry#0
> org.apache.tez.dag.api.TezException: No running dag at present
> at 
> org.apache.tez.dag.api.client.DAGClientHandler.getDAG(DAGClientHandler.java:84)
> at 
> org.apache.tez.dag.api.client.DAGClientHandler.getACLManager(DAGClientHandler.java:151)
> at 
> org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.getDAGStatus(DAGClientAMProtocolBlockingPBServerImpl.java:94)
> at 
> org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7375)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> 2015-04-07 23:11:20,875 INFO [main] history.HistoryEventHandler: 
> [HISTORY][DAG:dag_1427297554817_0149_1][Event:DAG_SUBMITTED]: 
> dagID=dag_1427297554817_0149_1, submitTime=1428448280397
> 2015-04-07 23:11:20,905 WARN [HistoryEventHandlingThread] 
> ats.ATSHistoryLoggingService: Could not post history event to ATS, 
> atsPutError=6, entityId=dag_1427297554817_0149_1, eventType=DAG_SUBMITTED
> 2015-04-07 23:11:20,906 WARN [HistoryEventHandlingThread] 
> ats.ATSHistoryLoggingService: Could not handle history events
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.handleEvents(ATSHistoryLoggingService.java:312)
> at 
> org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.access$700(ATSHistoryLoggingService.java:50)
> at 
> org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService$1.run(ATSHistoryLoggingService.java:159)
> at java.lang.Thread.run(Thread.java:722)





[jira] [Updated] (TEZ-2226) Disable writing history to timeline if domain creation fails.

2015-04-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated TEZ-2226:
--
Attachment: TEZ-2226.5.patch

> Disable writing history to timeline if domain creation fails.
> -
>
> Key: TEZ-2226
> URL: https://issues.apache.org/jira/browse/TEZ-2226
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Chang Li
>Priority: Blocker
> Attachments: TEZ-2226.2.patch, TEZ-2226.3.patch, TEZ-2226.4.patch, 
> TEZ-2226.5.patch, TEZ-2226.patch, TEZ-2226.wip.2.patch, TEZ-2226.wip.patch
>
>






[jira] [Commented] (TEZ-2226) Disable writing history to timeline if domain creation fails.

2015-04-10 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490330#comment-14490330
 ] 

Chang Li commented on TEZ-2226:
---

[~hitesh] Thanks a lot for suggesting ways of writing tests and for the 
insightful review! I just posted a work-in-progress patch to show the progress 
I have made. I have updated my patch to handle disabling logging on a per-DAG 
basis. I am just beginning to write unit tests and will try to deliver them 
soon.

> Disable writing history to timeline if domain creation fails.
> -
>
> Key: TEZ-2226
> URL: https://issues.apache.org/jira/browse/TEZ-2226
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Chang Li
>Priority: Blocker
> Attachments: TEZ-2226.2.patch, TEZ-2226.3.patch, TEZ-2226.4.patch, 
> TEZ-2226.5.patch, TEZ-2226.patch, TEZ-2226.wip.2.patch, TEZ-2226.wip.patch
>
>






Success: TEZ-2234 PreCommit Build #441

2015-04-10 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2234
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/441/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2794 lines...]
[INFO] Final Memory: 72M/1009M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12724648/TEZ-2234.4.patch
  against master revision f382828.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/441//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/441//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
78615f154a92d99ed25dd72061dfe902db3b57de logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #435
Archived 44 artifacts
Archive block size is 32768
Received 19 blocks and 2118201 bytes
Compression is 22.7%
Took 0.83 sec
Description set: TEZ-2234
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490362#comment-14490362
 ] 

TezQA commented on TEZ-2234:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12724648/TEZ-2234.4.patch
  against master revision f382828.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/441//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/441//console

This message is automatically generated.

> Allow vertex managers to get output size per source vertex
> --
>
> Key: TEZ-2234
> URL: https://issues.apache.org/jira/browse/TEZ-2234
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2234.1.patch, TEZ-2234.2.patch, TEZ-2234.3.patch, 
> TEZ-2234.4.patch
>
>
> Vertex managers may need per source vertex output stats to make 
> reconfiguration decisions.





[jira] [Commented] (TEZ-2305) MR compatibility sleep job fails with IOException: Undefined job output-path

2015-04-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490382#comment-14490382
 ] 

Hitesh Shah commented on TEZ-2305:
--

Comments: 

Invalid import: import javax.xml.crypto.Data;
MR jobs running in yarn-tez mode do not use MROutputConfigBuilder. They also 
use MROutputLegacy.
The change in the exception trace in TestMROutputConfigBuilder makes it seem 
like we are changing current behavior that users might assume to be the default 
behavior if something is not specified.

I don't think the new tests should worry about whether the job is mapper-only 
or not. It is mainly a question of how a user should configure MROutput when 
using it against the mapred or mapreduce APIs, and also of whether YARNRunner 
is doing the right thing when building the DAG and configuring MROutputLegacy 
correctly.


> MR compatibility sleep job fails with IOException: Undefined job output-path
> 
>
> Key: TEZ-2305
> URL: https://issues.apache.org/jira/browse/TEZ-2305
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Tassapol Athiapinya
>Priority: Critical
> Attachments: TEZ-2305-3.patch, TEZ-2305-4.patch, TEZ-2305.1.patch, 
> TEZ-2305.2.patch
>
>
> Running MR sleep job has an IOException.
> {code}
> 15/04/09 20:52:25 INFO mapreduce.Job: Job job_1428612196442_0002 failed with 
> state FAILED due to: Vertex failed, vertexName=initialmap, 
> vertexId=vertex_1428612196442_0002_1_00, diagnostics=[Task failed, 
> taskId=task_1428612196442_0002_1_00_01, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Failure while running task:java.io.IOException: 
> Undefined job output-path
>   at 
> org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:248)
>   at 
> org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:121)
>   at 
> org.apache.tez.mapreduce.output.MROutput.initialize(MROutput.java:401)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:436)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:415)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> ], TaskAttempt 1 failed, info=[Error: Failure while running 
> task:java.io.IOException: Undefined job output-path
>   at 
> org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:248)
>   at 
> org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:121)
>   at 
> org.apache.tez.mapreduce.output.MROutput.initialize(MROutput.java:401)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:436)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:415)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> ], TaskAttempt 2 failed, info=[Error: Failure while running 
> task:java.io.IOException: Undefined job output-path
>   at 
> org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:248)
>   at 
> org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:121)
>   at 
> org.apache.tez.mapreduce.output.MROutput.initialize(MROutput.java:401)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:436)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:415)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 

[jira] [Updated] (TEZ-2287) Deprecate VertexManagerPluginContext.getTaskContainer()

2015-04-10 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2287:
-
Attachment: TEZ-2287.1.patch

> Deprecate VertexManagerPluginContext.getTaskContainer()
> ---
>
> Key: TEZ-2287
> URL: https://issues.apache.org/jira/browse/TEZ-2287
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Blocker
> Attachments: TEZ-2287.1.patch
>
>
> This allows TEZ-2048





[jira] [Updated] (TEZ-2287) Deprecate VertexManagerPluginContext.getTaskContainer()

2015-04-10 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2287:
-
Target Version/s: 0.6.1

> Deprecate VertexManagerPluginContext.getTaskContainer()
> ---
>
> Key: TEZ-2287
> URL: https://issues.apache.org/jira/browse/TEZ-2287
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Blocker
> Attachments: TEZ-2287.1.patch
>
>
> This allows TEZ-2048





[jira] [Updated] (TEZ-2287) Deprecate VertexManagerPluginContext.getTaskContainer()

2015-04-10 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2287:
-
Fix Version/s: (was: 0.6.1)

> Deprecate VertexManagerPluginContext.getTaskContainer()
> ---
>
> Key: TEZ-2287
> URL: https://issues.apache.org/jira/browse/TEZ-2287
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Blocker
> Attachments: TEZ-2287.1.patch
>
>
> This allows TEZ-2048





[jira] [Commented] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490392#comment-14490392
 ] 

Siddharth Seth commented on TEZ-2237:
-

The patch generates the correct number of empty events when an Output is not 
started. That essentially allows for an Output to be configured, but if it 
isn't used - the Output will just indicate to the system that nothing was 
generated.
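
As a rough illustration of the behavior described above (not the actual patch 
and not the Tez API), the following Java sketch uses a hypothetical 
SketchOutput class. When close() is called on an Output that was never 
started, it still emits one "empty" event per partition, so downstream 
consumers are told that nothing was generated. The class name, the event 
strings and the partition handling are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an Output; this is NOT the real Tez API.
final class SketchOutput {
    private final int numPartitions;
    private boolean started = false;
    private final List<List<String>> buffers = new ArrayList<>();

    SketchOutput(int numPartitions) {
        this.numPartitions = numPartitions;
        for (int i = 0; i < numPartitions; i++) {
            buffers.add(new ArrayList<>());
        }
    }

    void start() {
        started = true;
    }

    void write(int partition, String record) {
        if (!started) {
            throw new IllegalStateException("Output not started");
        }
        buffers.get(partition).add(record);
    }

    // Always returns exactly one event per partition. A partition with no
    // records (including every partition of an Output that was never
    // started) yields "EMPTY:<partition>"; otherwise the event is
    // "DATA:<partition>:<recordCount>".
    List<String> close() {
        List<String> events = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            int n = buffers.get(p).size();
            events.add(n == 0 ? "EMPTY:" + p : "DATA:" + p + ":" + n);
        }
        return events;
    }
}
```

The point of the sketch is only that the number of events does not depend on 
whether start() was ever called.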

The specific DAG that was hanging will not run through without the patch.

If you look at 
https://issues.apache.org/jira/secure/attachment/12724128/oneOutOfTwoOutputsStarted.txt
 - it shows two configured Outputs, out of which only one is being started.

Either Cascading does not need the second output, or it may be losing data by 
not sending anything on this Output. That's something Cascading may want to 
look into.

I'm going to leave this discussion at this point, and finalize this patch over 
the next week or so.

> Complex DAG freezes and fails (was BufferTooSmallException raised in 
> UnorderedPartitionedKVWriter then DAG lingers)
> ---
>
> Key: TEZ-2237
> URL: https://issues.apache.org/jira/browse/TEZ-2237
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Debian Linux "jessie"
> OpenJDK Runtime Environment (build 1.8.0_40-internal-b27)
> OpenJDK 64-Bit Server VM (build 25.40-b25, mixed mode)
> 7 * Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 16/24 GB RAM per node, 1*system 
> disk + 4*1 or 2 TiB HDD for HDFS & local  (on-prem, dedicated hardware)
> Scalding 0.13.1 modified with https://github.com/twitter/scalding/pull/1220 
> to run Cascading 3.0.0-wip-90 with TEZ 0.6.0
>Reporter: Cyrille Chépélov
> Attachments: TEZ-2237-hack.branch6.txt, TEZ-2237-hack.master.txt, 
> TEZ-2237.test.2_branch0.6.txt, all_stacks.lst, alloc_mem.png, 
> alloc_vcores.png, application_142732418_1444.yarn-logs.red.txt.gz, 
> application_142732418_1908.red.txt.bz2, 
> application_1427964335235_2070.txt.red.txt.bz2, 
> appmastersyslog_dag_1427282048097_0215_1.red.txt.gz, 
> appmastersyslog_dag_1427282048097_0237_1.red.txt.gz, 
> gc_count_MRAppMaster.png, mem_free.png, noopexample_2237.txt, 
> oneOutOfTwoOutputsStarted.txt, ordered-grouped-kv-input-traces.diff, 
> output-starts.txt, start_containers.png, stop_containers.png, 
> syslog_attempt_1427282048097_0215_1_21_14_0.red.txt.gz, 
> syslog_attempt_1427282048097_0237_1_70_28_0.red.txt.gz, yarn_rm_flips.png
>
>
> On a specific DAG with many vertices (actually part of a larger meta-DAG), 
> after about an hour of processing, several BufferTooSmallExceptions are raised 
> in UnorderedPartitionedKVWriter (about one every two or three spills).
> Once these exceptions are raised, the DAG remains indefinitely "active", 
> tying up memory and CPU resources as far as YARN is concerned, while little 
> if any actual processing takes place. 
> It seems two separate issues are at hand:
>   1. BufferTooSmallExceptions are raised even though, small as the actually 
> allocated buffers seem to be (around a couple of megabytes were allotted 
> whereas 100MiB were requested), the actual keys and values are never bigger 
> than 24 and 1024 bytes respectively.
>   2. In the event BufferTooSmallExceptions are raised, the DAG fails to stop 
> (stop requests appear to be sent 7 hours after the BTSEs are 
> raised, but 9 hours after these stop requests, the DAG was still lingering on 
> with all containers present, tying up memory and CPU allocations).
> The emergence of the BTSEs prevents the Cascade from completing, which in 
> turn prevents validating the results against traditional MR1-based results. 
> The lack of conclusion renders the cluster queue unavailable.





[jira] [Commented] (TEZ-2287) Deprecate VertexManagerPluginContext.getTaskContainer()

2015-04-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490398#comment-14490398
 ] 

Bikas Saha commented on TEZ-2287:
-

Thanks! I was about to do this :P +1

> Deprecate VertexManagerPluginContext.getTaskContainer()
> ---
>
> Key: TEZ-2287
> URL: https://issues.apache.org/jira/browse/TEZ-2287
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Blocker
> Attachments: TEZ-2287.1.patch
>
>
> This allows TEZ-2048





[jira] [Created] (TEZ-2308) Add set/get of record counts in task/vertex statistics

2015-04-10 Thread Bikas Saha (JIRA)
Bikas Saha created TEZ-2308:
---

 Summary: Add set/get of record counts in task/vertex statistics
 Key: TEZ-2308
 URL: https://issues.apache.org/jira/browse/TEZ-2308
 Project: Apache Tez
  Issue Type: Task
Reporter: Bikas Saha


In addition to data size, getting record count would be useful. /cc [~rohini]





[jira] [Commented] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490415#comment-14490415
 ] 

Bikas Saha commented on TEZ-2234:
-

Created TEZ-2308 to add record counts per Rohini's comments. Committing latest 
patch in a bit.

> Allow vertex managers to get output size per source vertex
> --
>
> Key: TEZ-2234
> URL: https://issues.apache.org/jira/browse/TEZ-2234
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2234.1.patch, TEZ-2234.2.patch, TEZ-2234.3.patch, 
> TEZ-2234.4.patch
>
>
> Vertex managers may need per source vertex output stats to make 
> reconfiguration decisions.





[jira] [Commented] (TEZ-2259) Push additional data to Timeline for Recovery for better consumption in UI

2015-04-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490416#comment-14490416
 ] 

TezQA commented on TEZ-2259:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12724654/TEZ-2259.3.patch
  against master revision c8ef244.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in :
 org.apache.tez.test.TestSecureShuffle
org.apache.tez.test.TestFaultTolerance

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/442//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/442//console

This message is automatically generated.

> Push additional data to Timeline for Recovery for better consumption in UI
> --
>
> Key: TEZ-2259
> URL: https://issues.apache.org/jira/browse/TEZ-2259
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: TEZ-2259.1.patch, TEZ-2259.2.patch, TEZ-2259.3.patch
>
>
> Some things I can think of: 
>  
>- applicationAttemptId in which the dag was submitted
>- appAttemptId in which the dag was completed 
> Above provides implicit information on how many app attempts the dag spanned 
> ( and therefore recovered how many times ).
>   
>- Maybe an implicit event mentioning that the DAG was recovered and in 
> which attempt it was recovered. Possibly add information on what state was 
> recovered?





Failed: TEZ-2259 PreCommit Build #442

2015-04-10 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2259
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/442/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2513 lines...]



{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12724654/TEZ-2259.3.patch
  against master revision c8ef244.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in :
 org.apache.tez.test.TestSecureShuffle
org.apache.tez.test.TestFaultTolerance

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/442//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/442//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
3c5a26e8e7a68a226c98ac9a04eb7e531cc08427 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #441
Archived 44 artifacts
Archive block size is 32768
Received 21 blocks and 2030969 bytes
Compression is 25.3%
Took 1.4 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

Failed: TEZ-2226 PreCommit Build #443

2015-04-10 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2226
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/443/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2773 lines...]


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12724655/TEZ-2226.5.patch
  against master revision c8ef244.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/443//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/443//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/443//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
5bf87466376154b052d01ebc5ceef82c23abf64c logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #441
Archived 45 artifacts
Archive block size is 32768
Received 2 blocks and 2519406 bytes
Compression is 2.5%
Took 1.1 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2226) Disable writing history to timeline if domain creation fails.

2015-04-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490431#comment-14490431
 ] 

TezQA commented on TEZ-2226:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12724655/TEZ-2226.5.patch
  against master revision c8ef244.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/443//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/443//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/443//console

This message is automatically generated.

> Disable writing history to timeline if domain creation fails.
> -
>
> Key: TEZ-2226
> URL: https://issues.apache.org/jira/browse/TEZ-2226
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Chang Li
>Priority: Blocker
> Attachments: TEZ-2226.2.patch, TEZ-2226.3.patch, TEZ-2226.4.patch, 
> TEZ-2226.5.patch, TEZ-2226.patch, TEZ-2226.wip.2.patch, TEZ-2226.wip.patch
>
>






[jira] [Comment Edited] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-10 Thread JIRA

[ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490455#comment-14490455
 ] 

Cyrille Chépélov edited comment on TEZ-2237 at 4/10/15 10:17 PM:
-

Just to make sure I'm following everything, what happens if one of the inputs 
ends up being empty (for any reason, such as one of the intermediate inputs 
being the result of a {code}val p2 = p1.filter(_ == false){code} operation)? 
Would that fall within the situation the patch handles? 

Sounds like good news [~sseth]. Hopefully a tez-0.6.1 release can be planned 
soon after next week.




was (Author: cchepelov):
Just to make sure I'm following everything, what happens if one of the inputs 
end up being empty (for any reason, such as one of the intermediate inputs is 
the result of a {code}.filter(_ == false){code} operation. Would that fall 
within the situation the patch handles? 

Sound like good news [~sseth]. Hopefully a tez-0.6.1 release can be planned 
soon after next week.



> Complex DAG freezes and fails (was BufferTooSmallException raised in 
> UnorderedPartitionedKVWriter then DAG lingers)
> ---
>
> Key: TEZ-2237
> URL: https://issues.apache.org/jira/browse/TEZ-2237
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Debian Linux "jessie"
> OpenJDK Runtime Environment (build 1.8.0_40-internal-b27)
> OpenJDK 64-Bit Server VM (build 25.40-b25, mixed mode)
> 7 * Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 16/24 GB RAM per node, 1*system 
> disk + 4*1 or 2 TiB HDD for HDFS & local  (on-prem, dedicated hardware)
> Scalding 0.13.1 modified with https://github.com/twitter/scalding/pull/1220 
> to run Cascading 3.0.0-wip-90 with TEZ 0.6.0
>Reporter: Cyrille Chépélov
> Attachments: TEZ-2237-hack.branch6.txt, TEZ-2237-hack.master.txt, 
> TEZ-2237.test.2_branch0.6.txt, all_stacks.lst, alloc_mem.png, 
> alloc_vcores.png, application_142732418_1444.yarn-logs.red.txt.gz, 
> application_142732418_1908.red.txt.bz2, 
> application_1427964335235_2070.txt.red.txt.bz2, 
> appmastersyslog_dag_1427282048097_0215_1.red.txt.gz, 
> appmastersyslog_dag_1427282048097_0237_1.red.txt.gz, 
> gc_count_MRAppMaster.png, mem_free.png, noopexample_2237.txt, 
> oneOutOfTwoOutputsStarted.txt, ordered-grouped-kv-input-traces.diff, 
> output-starts.txt, start_containers.png, stop_containers.png, 
> syslog_attempt_1427282048097_0215_1_21_14_0.red.txt.gz, 
> syslog_attempt_1427282048097_0237_1_70_28_0.red.txt.gz, yarn_rm_flips.png
>
>
> On a specific DAG with many vertices (actually part of a larger meta-DAG), 
> after about an hour of processing, several BufferTooSmallExceptions are raised 
> in UnorderedPartitionedKVWriter (about one every two or three spills).
> Once these exceptions are raised, the DAG remains indefinitely "active", 
> tying up memory and CPU resources as far as YARN is concerned, while little 
> if any actual processing takes place. 
> It seems two separate issues are at hand:
>   1. BufferTooSmallExceptions are raised even though, small as the actually 
> allocated buffers seem to be (around a couple of megabytes were allotted 
> whereas 100MiB were requested), the actual keys and values are never bigger 
> than 24 and 1024 bytes respectively.
>   2. In the event BufferTooSmallExceptions are raised, the DAG fails to stop 
> (stop requests appear to be sent 7 hours after the BTSEs are 
> raised, but 9 hours after these stop requests, the DAG was still lingering on 
> with all containers present, tying up memory and CPU allocations).
> The emergence of the BTSEs prevents the Cascade from completing, which in 
> turn prevents validating the results against traditional MR1-based results. 
> The lack of conclusion renders the cluster queue unavailable.





[jira] [Commented] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-10 Thread JIRA

[ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490455#comment-14490455
 ] 

Cyrille Chépélov commented on TEZ-2237:
---

Just to make sure I'm following everything, what happens if one of the inputs 
ends up being empty (for any reason, such as one of the intermediate inputs 
being the result of a {code}.filter(_ == false){code} operation)? Would that 
fall within the situation the patch handles? 

Sounds like good news [~sseth]. Hopefully a tez-0.6.1 release can be planned 
soon after next week.



> Complex DAG freezes and fails (was BufferTooSmallException raised in 
> UnorderedPartitionedKVWriter then DAG lingers)
> ---
>
> Key: TEZ-2237
> URL: https://issues.apache.org/jira/browse/TEZ-2237
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Debian Linux "jessie"
> OpenJDK Runtime Environment (build 1.8.0_40-internal-b27)
> OpenJDK 64-Bit Server VM (build 25.40-b25, mixed mode)
> 7 * Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 16/24 GB RAM per node, 1*system 
> disk + 4*1 or 2 TiB HDD for HDFS & local  (on-prem, dedicated hardware)
> Scalding 0.13.1 modified with https://github.com/twitter/scalding/pull/1220 
> to run Cascading 3.0.0-wip-90 with TEZ 0.6.0
>Reporter: Cyrille Chépélov
> Attachments: TEZ-2237-hack.branch6.txt, TEZ-2237-hack.master.txt, 
> TEZ-2237.test.2_branch0.6.txt, all_stacks.lst, alloc_mem.png, 
> alloc_vcores.png, application_142732418_1444.yarn-logs.red.txt.gz, 
> application_142732418_1908.red.txt.bz2, 
> application_1427964335235_2070.txt.red.txt.bz2, 
> appmastersyslog_dag_1427282048097_0215_1.red.txt.gz, 
> appmastersyslog_dag_1427282048097_0237_1.red.txt.gz, 
> gc_count_MRAppMaster.png, mem_free.png, noopexample_2237.txt, 
> oneOutOfTwoOutputsStarted.txt, ordered-grouped-kv-input-traces.diff, 
> output-starts.txt, start_containers.png, stop_containers.png, 
> syslog_attempt_1427282048097_0215_1_21_14_0.red.txt.gz, 
> syslog_attempt_1427282048097_0237_1_70_28_0.red.txt.gz, yarn_rm_flips.png
>
>
> On a specific DAG with many vertices (actually part of a larger meta-DAG), 
> after about an hour of processing, several BufferTooSmallExceptions are raised 
> in UnorderedPartitionedKVWriter (about one every two or three spills).
> Once these exceptions are raised, the DAG remains indefinitely "active", 
> tying up memory and CPU resources as far as YARN is concerned, while little 
> if any actual processing takes place. 
> It seems two separate issues are at hand:
>   1. BufferTooSmallExceptions are raised even though, small as the actually 
> allocated buffers seem to be (around a couple of megabytes were allotted 
> whereas 100MiB were requested), the actual keys and values are never bigger 
> than 24 and 1024 bytes respectively.
>   2. In the event BufferTooSmallExceptions are raised, the DAG fails to stop 
> (stop requests appear to be sent 7 hours after the BTSEs are 
> raised, but 9 hours after these stop requests, the DAG was still lingering on 
> with all containers present, tying up memory and CPU allocations).
> The emergence of the BTSEs prevents the Cascade from completing, which in 
> turn prevents validating the results against traditional MR1-based results. 
> The lack of conclusion renders the cluster queue unavailable.





[jira] [Commented] (TEZ-145) Support a combiner processor that can run non-local to map/reduce nodes

2015-04-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490468#comment-14490468
 ] 

Bikas Saha commented on TEZ-145:


Taking a step back, let's figure out the scenarios for this. 
Small jobs (small data) - do we agree that this is not going to be helpful, 
because we will be adding an extra stage of latency for small combiner benefits?
Large job (large data) with no data reduction in the map side combiner - this 
is not going to be helpful because the extra combiner will not reduce the data 
further.
Large job (large data) with high data reduction in the map side combiner - this 
is going to be useful because the extra combiner will reduce the data further 
and also decrease the number of data shards by aggregating small outputs from 
the map tasks into a smaller number of combiner tasks.
Large job (large data) with a lot of filtering (no combiner) - this may be 
useful, not because there is a combine operation, but to reduce the large 
number of small outputs produced by the map tasks into a smaller number of 
shards due to the combiner tasks.

> Support a combiner processor that can run non-local to map/reduce nodes
> ---
>
> Key: TEZ-145
> URL: https://issues.apache.org/jira/browse/TEZ-145
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Hitesh Shah
> Assignee: Tsuyoshi Ozawa
> Attachments: TEZ-145.2.patch, WIP-TEZ-145-001.patch
>
>
> For aggregate operators that can benefit by running in multi-level trees, 
> support of being able to run a combiner in a non-local mode would allow 
> performance efficiencies to be gained by running a combiner at a rack-level. 





[jira] [Commented] (TEZ-2287) Deprecate VertexManagerPluginContext.getTaskContainer()

2015-04-10 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490471#comment-14490471
 ] 

TezQA commented on TEZ-2287:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12724667/TEZ-2287.1.patch
  against master revision c8ef244.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

  {color:red}-1 javac{color}.  The applied patch generated 176 javac 
compiler warnings (more than the master's current 175 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/444//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/444//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/444//console

This message is automatically generated.

> Deprecate VertexManagerPluginContext.getTaskContainer()
> ---
>
> Key: TEZ-2287
> URL: https://issues.apache.org/jira/browse/TEZ-2287
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Priority: Blocker
> Attachments: TEZ-2287.1.patch
>
>
> This allows TEZ-2048





Failed: TEZ-2287 PreCommit Build #444

2015-04-10 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2287
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/444/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2766 lines...]



==
==
Adding comment to Jira.
==
==


Comment added.
2e126cc09446d1d892b546faba64e4c310a11bd1 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #441
Archived 45 artifacts
Archive block size is 32768
Received 24 blocks and 1953688 bytes
Compression is 28.7%
Took 1.3 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Comment Edited] (TEZ-145) Support a combiner processor that can run non-local to map/reduce nodes

2015-04-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490468#comment-14490468
 ] 

Bikas Saha edited comment on TEZ-145 at 4/10/15 10:30 PM:
--

Taking a step back, let's figure out the scenarios for this. 
Do we agree on the following?
1) Small jobs (small data) - this is not going to be helpful because we will be 
adding the latency of an extra stage for small combiner benefits.
2) Large job (large data) with no data reduction in the map side combiner - 
this is not going to be helpful because the extra combiner will not reduce the 
data further.
3) Large job (large data) with high data reduction in the map side combiner - 
this is going to be useful because the extra combiner will reduce the data 
further and also decrease the number of data shards by aggregating small 
outputs from the map tasks into a smaller number of combiner tasks.
4) Large job (large data) with a lot of filtering (no combiner) - this may be 
useful, not because there is a combine operation but because the combiner 
tasks consolidate the large number of small outputs produced by the map tasks 
into a smaller number of shards.

For 3/4 this may be useful if we can run aggregation combiner tasks at the rack 
level to coalesce the data within a rack (cheap) compared to having to pull 
that data across racks in the final reducer. Even in these cases, given better 
networks, we need to understand the trade-off between pulling the data across 
to the final reducer and the cost of running the extra combiner stage. 
Essentially, what is the killer scenario for this?
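
To make the scenarios concrete, here is a rough toy sketch in Python (not Tez 
code). It assumes map outputs are lists of (key, value) records tagged with a 
hypothetical rack name, and that the combiner is a plain sum per key. The 
function name extra_stage_effect and the sample data are made up for 
illustration only.

```python
from collections import defaultdict


def combine(records):
    """Sum values per key; a stand-in for any associative, commutative combiner."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return sorted(totals.items())


def extra_stage_effect(map_outputs):
    """Apply one rack-level combine to per-task outputs given as (rack, records).

    Returns (shards_before, shards_after, records_before, records_after).
    """
    per_rack = defaultdict(list)
    for rack, records in map_outputs:
        per_rack[rack].extend(records)
    combined = {rack: combine(records) for rack, records in per_rack.items()}
    return (
        len(map_outputs),
        len(combined),
        sum(len(records) for _, records in map_outputs),
        sum(len(records) for records in combined.values()),
    )


# Scenario 3: tasks on the same rack share keys, so the extra stage shrinks the data.
high_reduction = [
    ("rack1", [("a", 1), ("b", 2)]),
    ("rack1", [("a", 3), ("b", 1)]),
    ("rack2", [("a", 5), ("b", 1)]),
    ("rack2", [("a", 4), ("b", 1)]),
]

# Scenarios 2 and 4: keys never repeat, so only the number of shards goes down.
no_reduction = [
    ("rack1", [("k1", 1), ("k2", 1)]),
    ("rack1", [("k3", 1), ("k4", 1)]),
    ("rack2", [("k5", 1), ("k6", 1)]),
    ("rack2", [("k7", 1), ("k8", 1)]),
]

print(extra_stage_effect(high_reduction))
print(extra_stage_effect(no_reduction))
```

In both inputs the four map outputs collapse into two rack-level shards, but 
only the first input also halves the record count. The sketch says nothing 
about the latency cost of the extra stage, which is the other half of the 
trade-off.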




[jira] [Commented] (TEZ-145) Support a combiner processor that can run non-local to map/reduce nodes

2015-04-10 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490535#comment-14490535
 ] 

Gopal V commented on TEZ-145:
-

Combiners have only made sense in cases 3/4. 

#3 is the true use case for this, because combiners are written only for those 
scenarios.

The old MRv2 model did a re-merge + combine() only if there were > 3 spills per 
task.

So tuning it to have no extra spills produced bad shuffle performance, which is 
what the Tez approach is not vulnerable to, since it is meant to combine 
host-local data (plus skip merges via pipelining).

The original scenario where I discovered a need for this was when I was trying 
to find the first/last transaction of sessions across a time window, to look 
for overlapped session-ids for the same user to detect multiple device usage or 
stolen tokens.
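
A minimal Python sketch of that kind of job, under stated assumptions: the 
transactions are (user, session_id, timestamp) tuples with integer timestamps, 
and all function names and sample data are hypothetical. The combiner keeps 
only the first and last timestamp per session, which is associative, so it can 
be applied per task and then again on the merged partial results.

```python
from collections import defaultdict
from itertools import combinations


def to_spans(transactions):
    """Turn (user, session, ts) records into ((user, session), first_ts, last_ts)."""
    return [((user, session), ts, ts) for user, session, ts in transactions]


def combine_spans(spans):
    """Keep the earliest first_ts and the latest last_ts per (user, session) key."""
    merged = {}
    for key, first_ts, last_ts in spans:
        lo, hi = merged.get(key, (first_ts, last_ts))
        merged[key] = (min(lo, first_ts), max(hi, last_ts))
    return [(key, lo, hi) for key, (lo, hi) in sorted(merged.items())]


def overlapping_sessions(spans):
    """Report (user, session1, session2) where two sessions of one user overlap in time."""
    by_user = defaultdict(list)
    for (user, session), lo, hi in spans:
        by_user[user].append((session, lo, hi))
    hits = []
    for user, sessions in sorted(by_user.items()):
        for (s1, lo1, hi1), (s2, lo2, hi2) in combinations(sorted(sessions), 2):
            if lo1 <= hi2 and lo2 <= hi1:
                hits.append((user, s1, s2))
    return hits


# Made-up outputs of two map tasks.
task_a = [("u1", "s1", 10), ("u1", "s1", 40), ("u2", "s3", 5)]
task_b = [("u1", "s2", 30), ("u1", "s2", 60), ("u1", "s1", 20), ("u2", "s3", 7)]

# Combine per task first, then combine the partial results again.
partial_a = combine_spans(to_spans(task_a))
partial_b = combine_spans(to_spans(task_b))
spans = combine_spans(partial_a + partial_b)

print(spans)
print(overlapping_sessions(spans))
```

Because min and max are associative, combining in two stages gives the same 
spans as combining all transactions at once, which is the property a 
non-local combiner stage would rely on.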



[jira] [Created] (TEZ-2309) Fix slf4j dependencies for tez modules

2015-04-10 Thread Hitesh Shah (JIRA)
Hitesh Shah created TEZ-2309:


 Summary: Fix slf4j dependencies for tez modules
 Key: TEZ-2309
 URL: https://issues.apache.org/jira/browse/TEZ-2309
 Project: Apache Tez
 Issue Type: Bug
 Reporter: Hitesh Shah
 Priority: Critical


Most modules should depend only on slf4j-api and not slf4j-log4j12. 

I believe only tez-dag and tez-runtime-internals might need the log4j 
dependency due to log rotation related code.





[jira] [Commented] (TEZ-145) Support a combiner processor that can run non-local to map/reduce nodes

2015-04-10 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490647#comment-14490647
 ] 

Tsuyoshi Ozawa commented on TEZ-145:


[~bikassaha] [~gopalv] As Gopal mentioned, this feature can target 3 and 4. 
This is a benchmark result of a prototype of MAPREDUCE-4502: 
http://www.slideshare.net/ozax86/prestrata-hadoop-word-meetup/11
On MAPREDUCE-4502, I tried to run the combiner after spilling tasks: it causes 
a performance trade-off between aggregation ratio and disk IO. So Gopal's 
comment, quoted below, makes sense to me.

{quote}
So tuning it to have no extra spills produced bad shuffle performance, which is 
what the Tez approach is not vulnerable to, since it is meant to combine 
host-local data (plus skip merges via pipelining).
{quote}

If we can implement an in-memory combiner, or that kind of DAG support, in the 
Tez layer, we can improve performance further. However, we need to change the 
semantics of fault tolerance. 



[jira] [Comment Edited] (TEZ-145) Support a combiner processor that can run non-local to map/reduce nodes

2015-04-10 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490647#comment-14490647
 ] 

Tsuyoshi Ozawa edited comment on TEZ-145 at 4/11/15 1:08 AM:
-

[~bikassaha] [~gopalv] As Gopal mentioned, this feature can target 3 and 4. 
This is a benchmark result of a prototype of MAPREDUCE-4502: 
http://www.slideshare.net/ozax86/prestrata-hadoop-word-meetup/11
On MAPREDUCE-4502, I tried to run the combiner after spilling tasks: it causes 
a performance trade-off between aggregation ratio and disk IO. So Gopal's 
comment, quoted below, makes sense to me.

{quote}
So tuning it to have no extra spills produced bad shuffle performance, which is 
what the Tez approach is not vulnerable to, since it is meant to combine 
host-local data (plus skip merges via pipelining).
{quote}

If we can implement an in-memory combiner, or that kind of DAG support, in the 
Tez layer, we can improve performance further. However, we need to change the 
semantics of fault tolerance to support the feature, since fault tolerance 
won't be task-level in this case. 




[jira] [Updated] (TEZ-2234) Add API for statistics information - allow vertex managers to get output size per source vertex

2015-04-10 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2234:

Summary: Add API for statistics information - allow vertex managers to get 
output size per source vertex  (was: Allow vertex managers to get output size 
per source vertex)

> Add API for statistics information - allow vertex managers to get output size 
> per source vertex
> ---
>
> Key: TEZ-2234
> URL: https://issues.apache.org/jira/browse/TEZ-2234
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-2234.1.patch, TEZ-2234.2.patch, TEZ-2234.3.patch, 
> TEZ-2234.4.patch
>
>
> Vertex managers may need per source vertex output stats to make 
> reconfiguration decisions.





[jira] [Commented] (TEZ-145) Support a combiner processor that can run non-local to map/reduce nodes

2015-04-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490817#comment-14490817
 ] 

Bikas Saha commented on TEZ-145:


What I am describing is the concept of partial aggregation (see figure 7 in 
http://research.microsoft.com/pubs/63785/eurosys07.pdf), in which applying a 
combiner becomes a special case that may result in further data reduction 
depending on the combine function. In the degenerate case, the combine function 
is the concatenation function, which simply creates a smaller number of large 
chunks from a large number of small chunks within cheaper network domains.
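
As a rough illustration of the idea (a Python toy, not the Tez API), partial 
aggregation can be sketched as grouping chunks by a network domain and folding 
each group with a pluggable combine function. The names partial_aggregate, 
concatenate and sum_by_key, the "rack" labels and the sample chunks are all 
assumptions made up for this sketch.

```python
from collections import defaultdict


def partial_aggregate(chunks, combine):
    """Group (domain, payload) chunks by domain and fold each group with combine."""
    by_domain = defaultdict(list)
    for domain, payload in chunks:
        by_domain[domain].append(payload)
    return {domain: combine(payloads) for domain, payloads in by_domain.items()}


def concatenate(payloads):
    """Degenerate combine: glue small chunks into one large chunk, keeping every record."""
    out = []
    for payload in payloads:
        out.extend(payload)
    return out


def sum_by_key(payloads):
    """A real combiner: sum values per key, which may shrink the data further."""
    totals = defaultdict(int)
    for payload in payloads:
        for key, value in payload:
            totals[key] += value
    return sorted(totals.items())


# Five small chunks produced inside two racks (made-up data).
chunks = [
    ("rack1", [("a", 1)]),
    ("rack1", [("a", 2)]),
    ("rack1", [("b", 1)]),
    ("rack2", [("a", 4)]),
    ("rack2", [("c", 1)]),
]

coalesced = partial_aggregate(chunks, concatenate)
reduced = partial_aggregate(chunks, sum_by_key)

print(coalesced)
print(reduced)
```

With concatenate the five chunks become two larger ones and no records are 
dropped; with sum_by_key the same grouping also merges the two "a" records in 
rack1. Either way the final consumer pulls fewer, larger chunks.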
