[jira] [Commented] (TEZ-2196) Consider reusing UnorderedPartitionedKVWriter with single output in UnorderedKVOutput

2015-03-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381458#comment-14381458
 ] 

Hadoop QA commented on TEZ-2196:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12707434/TEZ-2196.4.patch
  against master revision 2fe2d63.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/349//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/349//console

This message is automatically generated.

> Consider reusing UnorderedPartitionedKVWriter with single output in 
> UnorderedKVOutput
> -
>
> Key: TEZ-2196
> URL: https://issues.apache.org/jira/browse/TEZ-2196
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2196.1.patch, TEZ-2196.2.patch, TEZ-2196.3.patch, 
> TEZ-2196.4.patch
>
>
> Can possibly get rid of FileBasedKVWriter and reuse 
> UnorderedPartitionedKVWriter with single partition in UnorderedKVOutput.  
> This can also benefit from pipelined shuffle changes done in 
> UnorderedPartitionedKVWriter.
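
The idea in the description can be illustrated with a toy model: a writer that routes records to partitions by key hash behaves like an unpartitioned writer when it is created with exactly one partition. This is only a sketch under that assumption; PartitionedSketchWriter and its methods are hypothetical names and do not reflect the actual UnorderedPartitionedKVWriter API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a partitioned writer; not the Tez class itself.
final class PartitionedSketchWriter {
  private final List<List<String>> partitions = new ArrayList<>();

  PartitionedSketchWriter(int numPartitions) {
    if (numPartitions < 1) {
      throw new IllegalArgumentException("numPartitions must be >= 1");
    }
    for (int i = 0; i < numPartitions; i++) {
      partitions.add(new ArrayList<>());
    }
  }

  // Route a record to a partition by key hash.
  // With a single partition, every record lands in partition 0.
  void write(String key, String value) {
    int p = Math.floorMod(key.hashCode(), partitions.size());
    partitions.get(p).add(key + "=" + value);
  }

  List<String> partition(int index) {
    return partitions.get(index);
  }

  public static void main(String[] args) {
    // The "unordered, unpartitioned" case is just numPartitions == 1.
    PartitionedSketchWriter unordered = new PartitionedSketchWriter(1);
    unordered.write("a", "1");
    unordered.write("b", "2");
    System.out.println(unordered.partition(0));
  }
}
```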



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Success: TEZ-2196 PreCommit Build #349

2015-03-25 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2196
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/349/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2762 lines...]
[INFO] Final Memory: 73M/970M
[INFO] 






==
==
Adding comment to Jira.
==
==


Comment added.
3a56c319caec4fc6845e2093c40600ce5df14ef3 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #343
Archived 44 artifacts
Archive block size is 32768
Received 2 blocks and 2656212 bytes
Compression is 2.4%
Took 2 sec
Description set: TEZ-2196
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (TEZ-2213) For the ordered case, enabling pipelined shuffle should automatically disable final merge

2015-03-25 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2213:
--
Attachment: TEZ-2213.1.patch

[~sseth] - Please review when you find some time.

> For the ordered case, enabling pipelined shuffle should automatically disable 
> final merge
> -
>
> Key: TEZ-2213
> URL: https://issues.apache.org/jira/browse/TEZ-2213
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2213.1.patch
>
>
> Currently, it ends up throwing an exception. Given the defaults - enabling 
> pipelined shuffle requires two parameters to be set.
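
The requested behaviour can be sketched as a small configuration-normalisation step. This is an illustration only, not the code in TEZ-2213.1.patch: PipelinedShuffleConf and normalize() are hypothetical names, and the two property names are assumed to be the relevant Tez runtime keys.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Tez: models the requested behaviour.
final class PipelinedShuffleConf {
  // Assumed property names; verify against TezRuntimeConfiguration.
  static final String PIPELINED_SHUFFLE = "tez.runtime.pipelined-shuffle.enabled";
  static final String FINAL_MERGE = "tez.runtime.enable.final-merge.in.output";

  // Returns a copy of conf in which enabling pipelined shuffle
  // also disables the final merge, instead of failing later.
  static Map<String, String> normalize(Map<String, String> conf) {
    Map<String, String> out = new HashMap<>(conf);
    boolean pipelined =
        Boolean.parseBoolean(out.getOrDefault(PIPELINED_SHUFFLE, "false"));
    if (pipelined) {
      out.put(FINAL_MERGE, "false");
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(PIPELINED_SHUFFLE, "true"); // the user sets only one parameter
    System.out.println(normalize(conf));
  }
}
```

With a step like this the user sets a single parameter, and a configuration that leaves pipelined shuffle disabled passes through unchanged.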





[jira] [Assigned] (TEZ-2213) For the ordered case, enabling pipelined shuffle should automatically disable final merge

2015-03-25 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned TEZ-2213:
-

Assignee: Rajesh Balamohan

> For the ordered case, enabling pipelined shuffle should automatically disable 
> final merge
> -
>
> Key: TEZ-2213
> URL: https://issues.apache.org/jira/browse/TEZ-2213
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Rajesh Balamohan
>
> Currently, it ends up throwing an exception. Given the defaults - enabling 
> pipelined shuffle requires two parameters to be set.





[jira] [Updated] (TEZ-2196) Consider reusing UnorderedPartitionedKVWriter with single output in UnorderedKVOutput

2015-03-25 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2196:
--
Attachment: TEZ-2196.4.patch

Thanks [~sseth].  Addressed it in the latest patch.  Will commit it shortly 
after the pre-commit build passes.



[jira] [Commented] (TEZ-2235) Tasks throwing OOM before reaching memory limits

2015-03-25 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381382#comment-14381382
 ] 

Rajesh Balamohan commented on TEZ-2235:
---

Looks more like a Hive issue.  With Hive "commit 
4e185e7f8a760444aac8117d0088bbd8baa65a6a", it works fine.  Will try to find out 
the issue and move this to the Hive JIRA. 

> Tasks throwing OOM before reaching memory limits
> 
>
> Key: TEZ-2235
> URL: https://issues.apache.org/jira/browse/TEZ-2235
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
> Attachments: Screen Shot 2015-03-26 at 5.04.46 AM.png, Screen Shot 
> 2015-03-26 at 5.05.06 AM.png
>
>
> - Ran query13 in tpcds with hive (1.2.0-SNAPSHOT) at 10 TB scale with Tez 
> (0.7 master)
> - tez.runtime.io.sort.mb=1800 on 4 GB container.
> - OOM was thrown in lots of tasks when allocating memory to sorter.  
> - Heapdump reveals memory allocated to sorter.  And other objects do not take 
> up that much space.
> Need more investigation.





[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime

2015-03-25 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381332#comment-14381332
 ] 

Gopal V commented on TEZ-2217:
--

[~bikassaha]: The patch keeps containers alive, which works better with this 
patch.

There's a lot of log-spew with {{LOG.info("Holding onto idle container with no 
work. CId: "}} in the _post log files.

I might take a couple of days to review this, so if [~rajesh.balamohan] can 
spare some time to review it, we can get this in quickly.

> The min-held-containers constraint is not enforced during query runtime 
> 
>
> Key: TEZ-2217
> URL: https://issues.apache.org/jira/browse/TEZ-2217
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Gopal V
>Assignee: Bikas Saha
> Attachments: TEZ-2217-debug.txt.bz2, TEZ-2217.1.patch, 
> TEZ-2217.2.patch, TEZ-2217.3.patch, TEZ-2217.txt.bz2
>
>
> The min-held containers constraint is respected during query idle times, but 
> is not respected when a query is actually in motion.
> The AM releases unused containers during dag execution without checking for 
> min-held containers.
> {code}
> 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
> rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing 
> container, containerId=container_1424502260528_1348_01_13, 
> containerExpiryTime=1426891313264, idleTimeoutMin=5000
> 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
> rm.YarnTaskSchedulerService: Releasing unused container: 
> container_1424502260528_1348_01_13
> {code}
> This is actually useful only after the AM has received a soft pre-emption 
> message, doing it on an idle cluster slows down one of the most common query 
> patterns in BI systems.
> {code}
> create temporary table smalltable as ...; 
> select ... bigtable JOIN smalltable ON ...;
> {code}
> The smaller query in the beginning throws away the pre-warmed capacity.
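
The missing check described above can be sketched as a release decision that consults the min-held count before giving up an idle container. This is a simplified model, not YarnTaskSchedulerService code: IdleContainerPolicy, shouldRelease() and their parameters are hypothetical names, and the sketch ignores the pre-emption case mentioned in the description.

```java
// Hypothetical model of the release decision; not the Tez scheduler itself.
final class IdleContainerPolicy {
  private final int minHeldContainers;

  IdleContainerPolicy(int minHeldContainers) {
    this.minHeldContainers = minHeldContainers;
  }

  // An idle container is released only if its idle timeout has expired
  // AND releasing it would still leave at least minHeldContainers held.
  boolean shouldRelease(int heldContainers, long nowMs, long expiryMs) {
    boolean idleTimeoutExpired = nowMs >= expiryMs;
    boolean aboveMinimum = heldContainers > minHeldContainers;
    return idleTimeoutExpired && aboveMinimum;
  }

  public static void main(String[] args) {
    IdleContainerPolicy policy = new IdleContainerPolicy(10);
    // Timeout expired, but only the minimum is held: keep the container.
    System.out.println(policy.shouldRelease(10, 6000L, 5000L));
    // Timeout expired and one container above the minimum: release it.
    System.out.println(policy.shouldRelease(11, 6000L, 5000L));
  }
}
```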





[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

2015-03-25 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381238#comment-14381238
 ] 

Bikas Saha commented on TEZ-714:


typo
{code}
+  private AtomicBoolean commitCancled = new AtomicBoolean(false);
   boolean commitAllOutputsOnSuccess = true;
{code}

Most of these should not be ignored because there is a bug if any of these 
events actually comes in during commit. The possible exception is the vertex 
manager user code error, which can be ignored.
{code}
.addTransition(VertexState.COMMITTING, VertexState.COMMITTING,
    EnumSet.of(
        VertexEventType.V_MANAGER_USER_CODE_ERROR,
        VertexEventType.V_ROOT_INPUT_FAILED,
        VertexEventType.V_SOURCE_VERTEX_STARTED,
        VertexEventType.V_ROOT_INPUT_INITIALIZED,
        VertexEventType.V_NULL_EDGE_INITIALIZED,
        VertexEventType.V_SOURCE_TASK_ATTEMPT_COMPLETED,
        VertexEventType.V_TASK_ATTEMPT_COMPLETED))
{code}

Why is this now public?
{code}
public void abortVertex(final VertexStatus.State finalState) {
{code}

Where is abort being called on all outputs when the vertex/dag fails (failure 
could be in the commit operation or due to an external cause)? Should we wait 
for all outstanding commit operations to get cancelled or complete and then 
call abort on all outputs?

Why is this calling Vertex.abortVertex() instead of directly calling 
committer.abort() for the outputs?
{code}
if (commitAllOutputsOnSuccess) {
  for (Vertex vertex : vertices.values()) {
    ((VertexImpl)vertex).abortVertex(VertexStatus.State.FAILED);
  }
{code}

Why has calling commit operations moved from DAG.finished() to 
DAG.checkForCompletion()? finished() is expected to be called once but 
checkForCompletion can be called any number of times. finished() may need to be 
broken into 2 methods though to separate the parts which should happen after 
commits are done.
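
The concern above, that checkForCompletion() can run any number of times while the commit must be started only once, is commonly handled with a compare-and-set guard. The sketch below shows that general pattern only; CommitOnce and its members are hypothetical names and are not taken from the TEZ-714 patch.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a run-once guard; not DAGImpl code.
final class CommitOnce {
  private final AtomicBoolean commitStarted = new AtomicBoolean(false);
  private final AtomicInteger commitsLaunched = new AtomicInteger();

  // Safe to call repeatedly: only the first caller launches the commit.
  // Returns true if this call started the commit.
  boolean checkForCompletion() {
    if (!commitStarted.compareAndSet(false, true)) {
      return false; // commit already triggered by an earlier call
    }
    commitsLaunched.incrementAndGet(); // stand-in for launching commit work
    return true;
  }

  int commitsLaunched() {
    return commitsLaunched.get();
  }

  public static void main(String[] args) {
    CommitOnce dag = new CommitOnce();
    dag.checkForCompletion();
    dag.checkForCompletion();
    System.out.println("commits launched: " + dag.commitsLaunched());
  }
}
```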

In OutputKey, vertexName and vertexGroupName can be merged to make the 
code paths similar. Where needed, indicating a group can be done via a boolean.
{code}
for (Map.Entry> entry : commitFutures.entrySet()) {
  OutputKey outputKey = entry.getKey();
  if (outputKey.vertexGroupName != null) {
    LOG.info("Canceling commit of output:" + outputKey.getOutputName()
        + " of vertex group:" + outputKey.vertexGroupName);
  } else {
    LOG.info("Canceling commit of output:" + outputKey.getOutputName()
        + " of vertex:" + outputKey.vertexName);
  }
{code}

Should this be private if it's accessed by derived classes? Is 
CommitCompletedTransition used in the state machine? If not, then it does not 
need to be a transition class.
{code}
// either commitFail or recoveryFail
private boolean isFail = false;
{code}

Why is this directly sending events instead of using a common method?
{code}
if (super.isFail) {
  for (Vertex vertex : dag.vertices.values()) {
    ((VertexImpl)vertex).handle(new VertexEventTermination(vertex.getVertexId(),
        VertexTerminationCause.OTHER_VERTEX_FAILURE));
  }
  return DAGState.TERMINATING;
{code}

Why is there no check for whether there are non-zero committers?
{code}
private synchronized DAGState commitOrFinish() {
  if (this.committed) {
    LOG.info("Ignoring multiple output commit/abort");
    if (commitFutures.isEmpty() && terminationCause == null) {
      return finished(DAGState.SUCCEEDED);
    } else {
      return getState();
    }
  }
  LOG.info("Calling DAG commit for dag: " + getID());
  this.committed = true;

  // commit all shared outputs
  try {
    appContext.getHistoryHandler().handleCriticalEvent(new DAGHistoryEvent(getID(),
        new DAGCommitStartedEvent(getID(), clock.getTime())));
  } catch (IOException e) {
    LOG.error("Failed to send commit event to history/recovery handler", e);
    trySetTerminationCause(DAGTerminationCause.RECOVERY_FAILURE);
    return DAGState.FAILED;
{code}

Thanks for incorporating the suggestions about the flow. The new code is much 
simpler, though there may be some issues that may need ironing out if the above 
comments are valid.

Haven't seen the tests yet. 


> OutputCommitters should not run in the main AM dispatcher thread
> 
>
> Key: TEZ-714
> URL: https://issues.apache.org/jira/browse/TEZ-714
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
> TEZ-714-3.patch, TEZ-714-4.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there's multiple OutputCommitters on a Vertex, they can be run i

[jira] [Commented] (TEZ-2103) Implement a Partial completion VertexManagerPlugin

2015-03-25 Thread Alok Asok (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381196#comment-14381196
 ] 

Alok Asok commented on TEZ-2103:


Hi 

I had a question regarding this short-circuit mechanism. Does the vertex 
manager keep checking the state of the application through heartbeats until the 
limit condition is met?
If so, does it send some specially structured message to the scheduler to close 
the rest of the sibling tasks and set their flags to success? How is this 
ordering done exactly? I was going through the Tez native umbilical 
communication protocol and didn't know where to look for specifics.

Thanks
Alok Asok

> Implement a Partial completion VertexManagerPlugin
> --
>
> Key: TEZ-2103
> URL: https://issues.apache.org/jira/browse/TEZ-2103
> Project: Apache Tez
>  Issue Type: New Feature
>Reporter: Gopal V
>  Labels: gsoc, gsoc2015, hadoop, java, tez
>
> Currently, there is no sibling communication between tasks - this implies 
> that a task can be completed by the first vertex in a wave of tasks, but the 
> entire wave of tasks has to complete before success can be reported.
> This occurs in limit + filter query patterns common between the data access 
> engines.
> {code}
> select * from data where x > 1 limit 10;
> {code}
> will run through a full-table scan worth of tasks to generate 10 rows per 
> task, to aggregate it to produce the final 10 row result.
> The VertexManager receives counters/events early enough to short-circuit the 
> rest of the vertex tasks, to prevent the remainder of tasks from getting 
> scheduled when the limit condition has been satisfied by an initial sub-set 
> of the tasks.
> This is a specialization of the VertexManagerPlugin for this common case 
> scheduling pattern.
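
The short-circuit described above can be modelled as a running row count that is checked before each further task is scheduled. This is a toy sketch, assuming completed tasks report their output row counts to the manager; LimitShortCircuit and its methods are hypothetical names and not the VertexManagerPlugin API.

```java
// Hypothetical model of limit-based short-circuiting; not a Tez plugin.
final class LimitShortCircuit {
  private final long limit;
  private long rowsSeen;

  LimitShortCircuit(long limit) {
    this.limit = limit;
  }

  // Called when a completed task reports how many rows it produced.
  void onTaskCompleted(long outputRows) {
    rowsSeen += outputRows;
  }

  // Further tasks are scheduled only while the limit is unsatisfied.
  boolean shouldScheduleMoreTasks() {
    return rowsSeen < limit;
  }

  public static void main(String[] args) {
    // Models: select * from data where x > 1 limit 10;
    LimitShortCircuit manager = new LimitShortCircuit(10);
    manager.onTaskCompleted(4);
    System.out.println(manager.shouldScheduleMoreTasks()); // limit not met yet
    manager.onTaskCompleted(7);
    System.out.println(manager.shouldScheduleMoreTasks()); // limit satisfied
  }
}
```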





[jira] [Commented] (TEZ-2229) bower ESUDO Cannot be run with sudo -- during build

2015-03-25 Thread Fengdong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381159#comment-14381159
 ] 

Fengdong Yu commented on TEZ-2229:
--

Hi [~pramachandran], can we add some text in tez-ui/README, or add some code in 
the Maven plugin to recognize the current user?
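
One way to read the second suggestion is to build the bower command line based on the current user, appending the --allow-root option that bower's own error message mentions. The sketch below is an assumption about how that could look, not an existing Tez or Maven plugin feature; BowerCommand and build() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the tez-ui build does not contain this class.
final class BowerCommand {
  // Builds the bower arguments seen in the build log, adding
  // --allow-root only when the build runs as root.
  static List<String> build(String userName) {
    List<String> cmd = new ArrayList<>();
    cmd.add("node_modules/bower/bin/bower");
    cmd.add("install");
    cmd.add("--remove-unnecessary-resolutions=false");
    if ("root".equals(userName)) {
      cmd.add("--allow-root");
    }
    return cmd;
  }

  public static void main(String[] args) {
    // user.name is a standard JVM system property.
    System.out.println(build(System.getProperty("user.name")));
  }
}
```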


> bower ESUDO Cannot be run with sudo -- during build
> ---
>
> Key: TEZ-2229
> URL: https://issues.apache.org/jira/browse/TEZ-2229
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Linux x86_64 
>Reporter: Fengdong Yu
>
> I built Tez as root, and had never installed node/npm locally before the build.
> There were then exception messages while building the tez-ui module. Maven 
> debug logs:
> {code}
> [DEBUG] env: SSH_TTY=/dev/pts/0
> [DEBUG] env: TERM=xterm
> [DEBUG] env: USER=root
> [DEBUG] env: XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt
> [DEBUG] Toolchains are ignored, 'executable' parameter is set to 
> /root/temp/apache-tez-0.6.0-src/tez-ui/src/main/webapp/node/node
> [DEBUG] Executing command line: 
> [/root/temp/apache-tez-0.6.0-src/tez-ui/src/main/webapp/node/node, 
> node_modules/bower/bin/bower, install, --remove-unnecessary-resolutions=false]
> bower ESUDO Cannot be run with sudo
> Additional error details:
> Since bower is a user command, there is no need to execute it with superuser 
> permissions.
> If you're having permission errors when using bower without sudo, please 
> spend a few minutes learning more about how your system should work and make 
> any necessary repairs.
> http://www.joyent.com/blog/installing-node-and-npm
> https://gist.github.com/isaacs/579814
> You can however run a command with sudo using --allow-root option
> {code}
> {code}
> [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.2:exec 
> (Bower install) on project tez-ui: Command execution failed. Process exited 
> with an error: 1 (Exit value: 1) -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to 
> execute goal org.codehaus.mojo:exec-maven-plugin:1.3.2:exec (Bower install) 
> on project tez-ui: Command execution failed.
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:216)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
>   at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)
>   at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155)
>   at org.apache.maven.cli.MavenCli.execute(MavenCli.java:584)
>   at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216)
>   at org.apache.maven.cli.MavenCli.main(MavenCli.java:160)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoExecutionException: Command execution 
> failed.
>   at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:303)
>   at 
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
>   ... 19 more
> Caused by: org.apache.commons.exec.ExecuteException: Process exited with an 
> error: 1 (Exit value: 1)
>   at 
> org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:402)
>   at 
> org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:164)
>   at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:746)
>   at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:292)
>   ... 21 more
> [ERROR] 
> [ERROR] 
> [ERROR] For more 

[jira] [Commented] (TEZ-2235) Tasks throwing OOM before reaching memory limits

2015-03-25 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381110#comment-14381110
 ] 

Rajesh Balamohan commented on TEZ-2235:
---

Haven't checked on the other branches. Will check it soon.



[jira] [Commented] (TEZ-2235) Tasks throwing OOM before reaching memory limits

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381107#comment-14381107
 ] 

Hitesh Shah commented on TEZ-2235:
--

Is this only with master or also in 0.5/0.6 branches?



[jira] [Updated] (TEZ-2235) Tasks throwing OOM before reaching memory limits

2015-03-25 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2235:
--
Attachment: Screen Shot 2015-03-26 at 5.05.06 AM.png
Screen Shot 2015-03-26 at 5.04.46 AM.png



[jira] [Created] (TEZ-2235) Tasks throwing OOM before reaching memory limits

2015-03-25 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created TEZ-2235:
-

 Summary: Tasks throwing OOM before reaching memory limits
 Key: TEZ-2235
 URL: https://issues.apache.org/jira/browse/TEZ-2235
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan


- Ran query13 in tpcds with hive (1.2.0-SNAPSHOT) at 10 TB scale with Tez (0.7 
master)
- tez.runtime.io.sort.mb=1800 on 4 GB container.
- OOM was thrown in lots of tasks when allocating memory to sorter.  
- Heapdump reveals memory allocated to sorter.  And other objects do not take 
up that much space.

Need more investigation.





[jira] [Commented] (TEZ-2214) FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381005#comment-14381005
 ] 

Hitesh Shah commented on TEZ-2214:
--

Updated fix versions.

> FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses 
> memToDiskMerging
> --
>
> Key: TEZ-2214
> URL: https://issues.apache.org/jira/browse/TEZ-2214
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Fix For: 0.5.4, 0.6.1
>
> Attachments: TEZ-2214.1.patch, TEZ-2214.2.patch, TEZ-2214.3.patch
>
>
> Scenario:
> - commitMemory & usedMemory are beyond their allowed threshold.
> - InMemoryMerge kicks off and is in the process of flushing memory contents 
> to disk
> - As it progresses, it releases memory segments as well (but not yet over).
> - Fetchers who need memory < maxSingleShuffleLimit, get scheduled.
> - If fetchers are fast, this quickly adds up to commitMemory & usedMemory. 
> Since InMemoryMerge is already in progress, this wouldn't trigger another 
> merge().
> - Pretty soon all fetchers would be stalled and get into the following state.
> {noformat}
> Thread 9351: (state = BLOCKED)
>  - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be 
> imprecise)
>  - java.lang.Object.wait() @bci=2, line=502 (Compiled frame)
>  - 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.waitForShuffleToMergeMemory()
>  @bci=17, line=337 (Interpreted frame)
>  - 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run()
>  @bci=34, line=157 (Interpreted frame)
> {noformat}
> - Even if InMemoryMerger completes, "commitedMem & usedMem" are beyond their 
> threshold and no other fetcher threads (all are in stalled state) are there 
> to release memory. This causes fetchers to wait indefinitely.
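
The stall in the scenario above comes from memory being committed while a merge is running, with nothing left to trigger the next merge once all fetchers are blocked. The toy model below shows one way to avoid that, by re-checking the threshold when a merge completes. It is not the fix in the TEZ-2214 patches: MergeAccounting and its members are hypothetical names, and real fetcher blocking and wake-up are left out.

```java
// Hypothetical single-lock model of merge accounting; not MergeManager code.
final class MergeAccounting {
  private final long mergeThreshold;
  private long commitMemory;
  private boolean mergeInProgress;
  private int mergesStarted;

  MergeAccounting(long mergeThreshold) {
    this.mergeThreshold = mergeThreshold;
  }

  // A fetcher commits bytes it has shuffled into memory.
  synchronized void commit(long bytes) {
    commitMemory += bytes;
    maybeStartMerge();
  }

  // The in-memory merger finished flushing bytesFlushed to disk.
  synchronized void mergeCompleted(long bytesFlushed) {
    commitMemory -= bytesFlushed;
    mergeInProgress = false;
    // Re-check here: memory committed during the merge may already be over
    // the threshold, and stalled fetchers will never call commit() again.
    maybeStartMerge();
  }

  private void maybeStartMerge() {
    if (!mergeInProgress && commitMemory >= mergeThreshold) {
      mergeInProgress = true;
      mergesStarted++;
    }
  }

  synchronized int mergesStarted() {
    return mergesStarted;
  }

  public static void main(String[] args) {
    MergeAccounting accounting = new MergeAccounting(100);
    accounting.commit(100);         // reaches the threshold, merge 1 starts
    accounting.commit(150);         // arrives mid-merge, no new merge yet
    accounting.mergeCompleted(100); // 150 bytes remain, merge 2 starts
    System.out.println("merges started: " + accounting.mergesStarted());
  }
}
```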





[jira] [Updated] (TEZ-2214) FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging

2015-03-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2214:
-
Fix Version/s: 0.6.1
   0.5.4



[jira] [Comment Edited] (TEZ-2214) FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging

2015-03-25 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380980#comment-14380980
 ] 

Rajesh Balamohan edited comment on TEZ-2214 at 3/25/15 11:01 PM:
-

Thanks [~sseth], [~hitesh]. Committed .3 version of patch to master, branch-0.6 
and branch-0.5

>>
commit 2fe2d63529b3fb420c15d4be6bbf50d501edb626
>>



was (Author: rajesh.balamohan):
Thanks [~sseth], [~hitesh]. Committed to master, branch-0.6 and branch-0.5

>>
commit 2fe2d63529b3fb420c15d4be6bbf50d501edb626
>>


> FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses 
> memToDiskMerging
> --
>
> Key: TEZ-2214
> URL: https://issues.apache.org/jira/browse/TEZ-2214
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2214.1.patch, TEZ-2214.2.patch, TEZ-2214.3.patch
>
>
> Scenario:
> - commitMemory & usedMemory are beyond their allowed threshold.
> - InMemoryMerge kicks off and is in the process of flushing memory contents 
> to disk
> - As it progresses, it releases memory segments as well (but not yet over).
> - Fetchers who need memory < maxSingleShuffleLimit, get scheduled.
> - If fetchers are fast, this quickly adds up to commitMemory & usedMemory. 
> Since InMemoryMerge is already in progress, this wouldn't trigger another 
> merge().
> - Pretty soon all fetchers would be stalled and get into the following state.
> {noformat}
> Thread 9351: (state = BLOCKED)
>  - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be 
> imprecise)
>  - java.lang.Object.wait() @bci=2, line=502 (Compiled frame)
>  - 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.waitForShuffleToMergeMemory()
>  @bci=17, line=337 (Interpreted frame)
>  - 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run()
>  @bci=34, line=157 (Interpreted frame)
> {noformat}
> - Even if InMemoryMerger completes, "commitedMem & usedMem" are beyond their 
> threshold and no other fetcher threads (all are in stalled state) are there 
> to release memory. This causes fetchers to wait indefinitely.
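
The scenario above can be replayed with a small single-threaded model. This is
only a sketch, not the Tez MergeManager code: the class MergeStallModel, its
fields and methods, and all sizes are assumptions made up for illustration. It
mimics the bookkeeping only: a merge starts when committed memory crosses a
threshold, and commits made while a merge is running never trigger a second one.

```java
// Hypothetical model of the memory accounting in the scenario; not Tez code.
class MergeStallModel {
    private final long memoryLimit;     // fetchers stall while usedMemory exceeds this
    private final long mergeThreshold;  // a merge starts once commitMemory reaches this
    private long usedMemory = 0;
    private long commitMemory = 0;
    private long memoryBeingMerged = 0;
    private boolean mergeInProgress = false;

    MergeStallModel(long memoryLimit, long mergeThreshold) {
        this.memoryLimit = memoryLimit;
        this.mergeThreshold = mergeThreshold;
    }

    // A fetcher tries to commit a segment; returns false if it has to stall.
    boolean commit(long size) {
        if (usedMemory > memoryLimit) {
            return false;
        }
        usedMemory += size;
        commitMemory += size;
        // A new merge is only triggered when none is running.
        if (!mergeInProgress && commitMemory >= mergeThreshold) {
            mergeInProgress = true;
            memoryBeingMerged = commitMemory;
        }
        return true;
    }

    // The running merge flushes part of its input to disk and frees that memory.
    void releaseMerged(long size) {
        long freed = Math.min(size, memoryBeingMerged);
        usedMemory -= freed;
        commitMemory -= freed;
        memoryBeingMerged -= freed;
    }

    // The merge completes; nothing re-checks the thresholds afterwards.
    void finishMerge() {
        releaseMerged(memoryBeingMerged);
        mergeInProgress = false;
    }

    // Stuck: fetchers cannot commit, and no merge is running to free memory.
    boolean isStuck() {
        return usedMemory > memoryLimit && !mergeInProgress;
    }

    // Replays the scenario with made-up sizes (limit 90, merge threshold 60).
    static boolean replayScenario() {
        MergeStallModel m = new MergeStallModel(90, 60);
        m.commit(70);        // crosses the threshold, a merge starts
        m.releaseMerged(50); // the merge frees some segments while still running
        m.commit(50);        // fast fetchers get scheduled again
        m.commit(50);        // usedMemory is now 120, no second merge is triggered
        m.commit(50);        // rejected: this fetcher stalls
        m.finishMerge();     // frees the remaining 20, usedMemory is still 100
        return m.isStuck();
    }
}
```

In this model the stalled state persists because the only code path that starts
a merge is commit(), and no fetcher can reach it any more.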



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2214) FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging

2015-03-25 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380911#comment-14380911
 ] 

Siddharth Seth commented on TEZ-2214:
-

+1 on either the .2 or .3 patch btw.

> FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses 
> memToDiskMerging
> --
>
> Key: TEZ-2214
> URL: https://issues.apache.org/jira/browse/TEZ-2214
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2214.1.patch, TEZ-2214.2.patch, TEZ-2214.3.patch
>
>
> Scenario:
> - commitMemory & usedMemory are beyond their allowed threshold.
> - InMemoryMerge kicks off and is in the process of flushing memory contents 
> to disk
> - As it progresses, it releases memory segments as well (but not yet over).
> - Fetchers who need memory < maxSingleShuffleLimit, get scheduled.
> - If fetchers are fast, this quickly adds up to commitMemory & usedMemory. 
> Since InMemoryMerge is already in progress, this wouldn't trigger another 
> merge().
> - Pretty soon all fetchers would be stalled and get into the following state.
> {noformat}
> Thread 9351: (state = BLOCKED)
>  - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be 
> imprecise)
>  - java.lang.Object.wait() @bci=2, line=502 (Compiled frame)
>  - 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.waitForShuffleToMergeMemory()
>  @bci=17, line=337 (Interpreted frame)
>  - 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run()
>  @bci=34, line=157 (Interpreted frame)
> {noformat}
> - Even if InMemoryMerger completes, "commitedMem & usedMem" are beyond their 
> threshold and no other fetcher threads (all are in stalled state) are there 
> to release memory. This causes fetchers to wait indefinitely.





[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime

2015-03-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380829#comment-14380829
 ] 

Hadoop QA commented on TEZ-2217:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12707314/TEZ-2217.3.patch
  against master revision d1b4bd4.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.dag.app.rm.TestContainerReuse

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/348//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/348//console

This message is automatically generated.

> The min-held-containers constraint is not enforced during query runtime 
> 
>
> Key: TEZ-2217
> URL: https://issues.apache.org/jira/browse/TEZ-2217
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Gopal V
>Assignee: Bikas Saha
> Attachments: TEZ-2217-debug.txt.bz2, TEZ-2217.1.patch, 
> TEZ-2217.2.patch, TEZ-2217.3.patch, TEZ-2217.txt.bz2
>
>
> The min-held containers constraint is respected during query idle times, but 
> is not respected when a query is actually in motion.
> The AM releases unused containers during dag execution without checking for 
> min-held containers.
> {code}
> 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
> rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing 
> container, containerId=container_1424502260528_1348_01_13, 
> containerExpiryTime=1426891313264, idleTimeoutMin=5000
> 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
> rm.YarnTaskSchedulerService: Releasing unused container: 
> container_1424502260528_1348_01_13
> {code}
> This is actually useful only after the AM has received a soft pre-emption 
> message, doing it on an idle cluster slows down one of the most common query 
> patterns in BI systems.
> {code}
> create temporary table smalltable as ...; 
> select ... bigtable JOIN smalltable ON ...;
> {code}
> The smaller query in the beginning throws away the pre-warmed capacity.
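
A minimal sketch of the missing check, assuming the scheduler knows how many
containers it currently holds and what the configured minimum is. The class
IdleReleasePolicy and its method are hypothetical and are not taken from
YarnTaskSchedulerService or from any of the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical policy sketch; not the YarnTaskSchedulerService implementation.
class IdleReleasePolicy {
    // Picks which idle-expired containers may be released without dropping
    // below the min-held count. Containers are released in list order.
    static List<String> selectForRelease(List<String> idleExpired,
                                         int heldContainers,
                                         int minHeldContainers) {
        // Number of containers that can go away while still honouring min-held.
        int releasable = Math.max(0, heldContainers - minHeldContainers);
        int count = Math.min(releasable, idleExpired.size());
        return new ArrayList<>(idleExpired.subList(0, count));
    }
}
```

With 4 containers held and a min-held value of 3, only one of the expired
containers would be released in this sketch, whether or not a DAG is running.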





Failed: TEZ-2217 PreCommit Build #348

2015-03-25 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2217
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/348/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2717 lines...]






==
==
Adding comment to Jira.
==
==


Comment added.
512fae301497059df6a11d72ea59315461b071e1 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #343
Archived 44 artifacts
Archive block size is 32768
Received 2 blocks and 2653407 bytes
Compression is 2.4%
Took 1 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
2 tests failed.
REGRESSION:  org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse

Error Message:

Wanted but not invoked:
aMRMClientAsyncForTest.releaseAssignedContainer(
container_1_0001_01_02
);
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:488)

However, there were other interactions with this mock:
aMRMClientAsyncForTest.init(
Configuration: core-default.xml, core-site.xml, yarn-default.xml, 
yarn-site.xml
);
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:396)

aMRMClientAsyncForTest.setConfig(
Configuration: core-default.xml, core-site.xml, yarn-default.xml, 
yarn-site.xml
);
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:396)

aMRMClientAsyncForTest.serviceInit(
Configuration: core-default.xml, core-site.xml, yarn-default.xml, 
yarn-site.xml
);
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:396)

aMRMClientAsyncForTest.setHeartbeatInterval(
1000
);
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:396)

aMRMClientAsyncForTest.start();
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:396)

aMRMClientAsyncForTest.serviceStart();
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:396)

aMRMClientAsyncForTest.registerApplicationMaster(
"host",
0,
""
);
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:396)

aMRMClientAsyncForTest.addContainerRequest(
Capability[]Priority[1]
);
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:433)

aMRMClientAsyncForTest.addContainerRequest(
Capability[]Priority[1]
);
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:434)

aMRMClientAsyncForTest.addContainerRequest(
Capability[]Priority[1]
);
-> at 
org.apache.tez.dag.app.rm.TestContainerReuse.testSimpleReuse(TestContainerReuse.java:435)

aMRMClientAsyncForTest.addContainerRequest(
Capability[]Priority[1]
);
-> at 
org.apache.tez.dag.app.rm.TestCon

[jira] [Created] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-03-25 Thread Bikas Saha (JIRA)
Bikas Saha created TEZ-2234:
---

 Summary: Allow vertex managers to get output size per source vertex
 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha


Vertex managers may need per source vertex output stats to make reconfiguration 
decisions.





[jira] [Created] (TEZ-2233) setParallelism should allow setting built-in edge managers

2015-03-25 Thread Bikas Saha (JIRA)
Bikas Saha created TEZ-2233:
---

 Summary: setParallelism should allow setting built-in edge managers
 Key: TEZ-2233
 URL: https://issues.apache.org/jira/browse/TEZ-2233
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha


Currently, all edge managers set during setParallelism end up becoming custom 
edges. However, just like during dag creation, it should be possible to specify 
standard edge types like scatter_gather if that is what the final user decision 
is.





[jira] [Commented] (TEZ-2196) Consider reusing UnorderedPartitionedKVWriter with single output in UnorderedKVOutput

2015-03-25 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380752#comment-14380752
 ] 

Siddharth Seth commented on TEZ-2196:
-

Looks good.
Minor: getInitialMemoryRequirement - this should return 0 when pipelining is 
disabled, and numPartitions=1 since buffers aren't being used. Otherwise we 
unnecessarily penalize other Inputs / Outputs which may exist on the vertex.
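
The suggestion above could look roughly like the following. This is a sketch
under assumptions: the real getInitialMemoryRequirement signature is not shown
in this thread, so the class WriterMemorySketch and the parameters
pipeliningEnabled, numPartitions and bufferBytes are illustrative names only.

```java
// Hypothetical sketch of the review suggestion; not the Tez writer code.
class WriterMemorySketch {
    // Returns the memory to request up front. With pipelining disabled and a
    // single partition no buffers are used, so nothing is requested.
    static long initialMemoryRequirement(boolean pipeliningEnabled,
                                         int numPartitions,
                                         long bufferBytes) {
        if (!pipeliningEnabled && numPartitions == 1) {
            return 0L;
        }
        return bufferBytes;
    }
}
```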

> Consider reusing UnorderedPartitionedKVWriter with single output in 
> UnorderedKVOutput
> -
>
> Key: TEZ-2196
> URL: https://issues.apache.org/jira/browse/TEZ-2196
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2196.1.patch, TEZ-2196.2.patch, TEZ-2196.3.patch
>
>
> Can possibly get rid of FileBasedKVWriter and reuse 
> UnorderedPartitionedKVWriter with single partition in UnorderedKVOutput.  
> This can also benefit from pipelined shuffle changes done in 
> UnorderedPartitionedKVWriter.





[jira] [Created] (TEZ-2232) Allow setParallelism to be called multiple times before tasks get scheduled

2015-03-25 Thread Bikas Saha (JIRA)
Bikas Saha created TEZ-2232:
---

 Summary: Allow setParallelism to be called multiple times before 
tasks get scheduled
 Key: TEZ-2232
 URL: https://issues.apache.org/jira/browse/TEZ-2232
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha


Currently, this is allowed only once. It is harder to support this 
after the vertex tasks have already started running. But allowing it before 
tasks start running is actually trivial. This just allows VertexManagers to 
change their minds multiple times before they start the vertex processing.
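
The proposed rule can be sketched as a simple guard. The class
VertexParallelismModel below is hypothetical and far simpler than the real
vertex state machine; it only illustrates "any number of changes before tasks
are scheduled, none afterwards".

```java
// Hypothetical model of the proposed behaviour; not the Tez VertexImpl code.
class VertexParallelismModel {
    private int parallelism;
    private boolean tasksScheduled = false;

    VertexParallelismModel(int initialParallelism) {
        this.parallelism = initialParallelism;
    }

    // May be called repeatedly until tasks have been scheduled.
    void setParallelism(int newParallelism) {
        if (tasksScheduled) {
            throw new IllegalStateException(
                "parallelism cannot change after tasks have been scheduled");
        }
        if (newParallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        this.parallelism = newParallelism;
    }

    // Marks the point after which the parallelism is frozen.
    void scheduleTasks() {
        tasksScheduled = true;
    }

    int getParallelism() {
        return parallelism;
    }
}
```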





[jira] [Updated] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime

2015-03-25 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2217:

Attachment: TEZ-2217.3.patch

> The min-held-containers constraint is not enforced during query runtime 
> 
>
> Key: TEZ-2217
> URL: https://issues.apache.org/jira/browse/TEZ-2217
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Gopal V
>Assignee: Bikas Saha
> Attachments: TEZ-2217-debug.txt.bz2, TEZ-2217.1.patch, 
> TEZ-2217.2.patch, TEZ-2217.3.patch, TEZ-2217.txt.bz2
>
>
> The min-held containers constraint is respected during query idle times, but 
> is not respected when a query is actually in motion.
> The AM releases unused containers during dag execution without checking for 
> min-held containers.
> {code}
> 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
> rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing 
> container, containerId=container_1424502260528_1348_01_13, 
> containerExpiryTime=1426891313264, idleTimeoutMin=5000
> 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
> rm.YarnTaskSchedulerService: Releasing unused container: 
> container_1424502260528_1348_01_13
> {code}
> This is actually useful only after the AM has received a soft pre-emption 
> message, doing it on an idle cluster slows down one of the most common query 
> patterns in BI systems.
> {code}
> create temporary table smalltable as ...; 
> select ... bigtable JOIN smalltable ON ...;
> {code}
> The smaller query in the beginning throws away the pre-warmed capacity.





[jira] [Updated] (TEZ-2231) Create project by-laws

2015-03-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2231:
-
Attachment: (was: by-laws.patch.2)

> Create project by-laws
> --
>
> Key: TEZ-2231
> URL: https://issues.apache.org/jira/browse/TEZ-2231
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: by-laws.2.patch, by-laws.patch
>
>
> Define the Project by-laws.





[jira] [Updated] (TEZ-2231) Create project by-laws

2015-03-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2231:
-
Attachment: by-laws.2.patch

> Create project by-laws
> --
>
> Key: TEZ-2231
> URL: https://issues.apache.org/jira/browse/TEZ-2231
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: by-laws.2.patch, by-laws.patch
>
>
> Define the Project by-laws.





[jira] [Updated] (TEZ-2231) Create project by-laws

2015-03-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2231:
-
Attachment: by-laws.patch.2

Minor modification to lazy approval to clarify that a minimum of 1 +1 vote is 
needed.

> Create project by-laws
> --
>
> Key: TEZ-2231
> URL: https://issues.apache.org/jira/browse/TEZ-2231
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: by-laws.patch, by-laws.patch.2
>
>
> Define the Project by-laws.





[jira] [Updated] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime

2015-03-25 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2217:

Attachment: (was: TEZ-2217.3.patch)

> The min-held-containers constraint is not enforced during query runtime 
> 
>
> Key: TEZ-2217
> URL: https://issues.apache.org/jira/browse/TEZ-2217
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Gopal V
>Assignee: Bikas Saha
> Attachments: TEZ-2217-debug.txt.bz2, TEZ-2217.1.patch, 
> TEZ-2217.2.patch, TEZ-2217.txt.bz2
>
>
> The min-held containers constraint is respected during query idle times, but 
> is not respected when a query is actually in motion.
> The AM releases unused containers during dag execution without checking for 
> min-held containers.
> {code}
> 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
> rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing 
> container, containerId=container_1424502260528_1348_01_13, 
> containerExpiryTime=1426891313264, idleTimeoutMin=5000
> 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
> rm.YarnTaskSchedulerService: Releasing unused container: 
> container_1424502260528_1348_01_13
> {code}
> This is actually useful only after the AM has received a soft pre-emption 
> message, doing it on an idle cluster slows down one of the most common query 
> patterns in BI systems.
> {code}
> create temporary table smalltable as ...; 
> select ... bigtable JOIN smalltable ON ...;
> {code}
> The smaller query in the beginning throws away the pre-warmed capacity.





[jira] [Updated] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime

2015-03-25 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2217:

Attachment: TEZ-2217.3.patch

> The min-held-containers constraint is not enforced during query runtime 
> 
>
> Key: TEZ-2217
> URL: https://issues.apache.org/jira/browse/TEZ-2217
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Gopal V
>Assignee: Bikas Saha
> Attachments: TEZ-2217-debug.txt.bz2, TEZ-2217.1.patch, 
> TEZ-2217.2.patch, TEZ-2217.3.patch, TEZ-2217.txt.bz2
>
>
> The min-held containers constraint is respected during query idle times, but 
> is not respected when a query is actually in motion.
> The AM releases unused containers during dag execution without checking for 
> min-held containers.
> {code}
> 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
> rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing 
> container, containerId=container_1424502260528_1348_01_13, 
> containerExpiryTime=1426891313264, idleTimeoutMin=5000
> 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
> rm.YarnTaskSchedulerService: Releasing unused container: 
> container_1424502260528_1348_01_13
> {code}
> This is actually useful only after the AM has received a soft pre-emption 
> message, doing it on an idle cluster slows down one of the most common query 
> patterns in BI systems.
> {code}
> create temporary table smalltable as ...; 
> select ... bigtable JOIN smalltable ON ...;
> {code}
> The smaller query in the beginning throws away the pre-warmed capacity.





[jira] [Created] (TEZ-2231) Create project by-laws

2015-03-25 Thread Hitesh Shah (JIRA)
Hitesh Shah created TEZ-2231:


 Summary: Create project by-laws
 Key: TEZ-2231
 URL: https://issues.apache.org/jira/browse/TEZ-2231
 Project: Apache Tez
  Issue Type: Task
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: by-laws.patch

Define the Project by-laws.





[jira] [Updated] (TEZ-2231) Create project by-laws

2015-03-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2231:
-
Attachment: by-laws.patch

> Create project by-laws
> --
>
> Key: TEZ-2231
> URL: https://issues.apache.org/jira/browse/TEZ-2231
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: by-laws.patch
>
>
> Define the Project by-laws.





[jira] [Commented] (TEZ-2214) FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging

2015-03-25 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380398#comment-14380398
 ] 

Siddharth Seth commented on TEZ-2214:
-

I think both the .2 and .3 patches are good, as long as there's no other 
entity reserving memory. That is, things may become a little more complicated 
with the MemToMemMerger, or if we ever support data via events.
A fetcher will always trigger the MemToDiskMerger - and then go and wait on 
waitForInMemoryMerge, followed by waitForShuffleToMergeMemory. If the data 
fetched by this Fetcher triggered a merge - it'll always wait and re-check to 
see if another merge is required. If the data fetched did not trigger a merge 
(and a merge wasn't in progress) - memory limits haven't been hit, and a future 
fetch would trigger this.
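
The "wait and re-check" idea in this comment can be sketched as follows. This
is a single-threaded model under assumptions, not the .2 or .3 patch: the class
RecheckingMergeModel and its members are hypothetical, and the real code uses
waiting fetcher threads rather than a direct call.

```java
// Hypothetical model of re-checking after a merge; not the Tez MergeManager.
class RecheckingMergeModel {
    private final long mergeThreshold;
    private long commitMemory = 0;
    private long memoryBeingMerged = 0;
    private boolean mergeInProgress = false;
    private int mergesStarted = 0;

    RecheckingMergeModel(long mergeThreshold) {
        this.mergeThreshold = mergeThreshold;
    }

    // A fetcher commits fetched data and checks whether a merge is needed.
    void commit(long size) {
        commitMemory += size;
        maybeStartMerge();
    }

    // The merge finishes, its input is freed, and the threshold is re-checked.
    void finishMerge() {
        commitMemory -= memoryBeingMerged;
        memoryBeingMerged = 0;
        mergeInProgress = false;
        maybeStartMerge(); // picks up data committed while the merge was running
    }

    private void maybeStartMerge() {
        if (!mergeInProgress && commitMemory >= mergeThreshold) {
            mergeInProgress = true;
            memoryBeingMerged = commitMemory;
            mergesStarted++;
        }
    }

    int getMergesStarted() {
        return mergesStarted;
    }

    boolean isMergeInProgress() {
        return mergeInProgress;
    }
}
```

The design point is that the threshold check runs on merge completion as well
as on commit, so data committed during a merge cannot be left waiting for a
trigger that never comes.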

> FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses 
> memToDiskMerging
> --
>
> Key: TEZ-2214
> URL: https://issues.apache.org/jira/browse/TEZ-2214
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2214.1.patch, TEZ-2214.2.patch, TEZ-2214.3.patch
>
>
> Scenario:
> - commitMemory & usedMemory are beyond their allowed threshold.
> - InMemoryMerge kicks off and is in the process of flushing memory contents 
> to disk
> - As it progresses, it releases memory segments as well (but not yet over).
> - Fetchers who need memory < maxSingleShuffleLimit, get scheduled.
> - If fetchers are fast, this quickly adds up to commitMemory & usedMemory. 
> Since InMemoryMerge is already in progress, this wouldn't trigger another 
> merge().
> - Pretty soon all fetchers would be stalled and get into the following state.
> {noformat}
> Thread 9351: (state = BLOCKED)
>  - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be 
> imprecise)
>  - java.lang.Object.wait() @bci=2, line=502 (Compiled frame)
>  - 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.waitForShuffleToMergeMemory()
>  @bci=17, line=337 (Interpreted frame)
>  - 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run()
>  @bci=34, line=157 (Interpreted frame)
> {noformat}
> - Even if InMemoryMerger completes, "commitedMem & usedMem" are beyond their 
> threshold and no other fetcher threads (all are in stalled state) are there 
> to release memory. This causes fetchers to wait indefinitely.





[jira] [Comment Edited] (TEZ-2103) Implement a Partial completion VertexManagerPlugin

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380376#comment-14380376
 ] 

Hitesh Shah edited comment on TEZ-2103 at 3/25/15 6:06 PM:
---

Hi [~alokasok], sorry for the late reply. 

The use-case is somewhat along these lines.

Assume that there is a SQL query being run to do something like "select * from 
table where <filter clauses> limit 10;". This can be broken down into a single 
stage DAG where each task runs on a partition of data and completes as soon as 
it has say 10 records that match the filter clauses. 

Now, if the overall data is huge i.e. multiple TBs and needs 100,000 tasks to 
scan all the data, running 100,000 tasks should not be necessary if the first 
task to complete returned 10 results. 

VertexManagers are sort of the vertex controllers. They control scheduling of 
tasks ( when to start running tasks, whether to enable slow start i.e. wait for 
upstream data to be ready before starting tasks ), locality management, vertex 
parallelism, etc. The VertexManagerPlugin is a user-provided plugin to modify 
the above behavior.

If you have read up on Tez and take a more detailed look into the VertexManager 
code, the VertexManager could be made a bit more powerful to be able to trigger 
completion of a vertex sooner without needing to run all the tasks if certain 
conditions get matched earlier. The changes will likely also encompass the 
Vertex(Impl) state machine in terms of how you treat short-circuited tasks so 
that all the correct bookkeeping is done from a state management point of view.





was (Author: hitesh):
Hi [~alokasok], sorry for the late reply. 

The use-case is somewhat along these lines.

Assume that there is a SQL query being run to do something like "select * from 
table where <filter clauses> limit 10;". This can be broken down into a single 
stage DAG where each task runs on a partition of data and completes as soon as 
it has say 10 records that match the filter clauses. 

Now, if the overall data is huge i.e. multiple TBs and needs 100,000 tasks to 
scan all the data, running 100,000 tasks should not be necessary if the first 
task to complete returned 10 results. 

VertexManagers are sort of the vertex controllers. They control scheduling of 
tasks ( when to start running tasks, whether to enable slow start i.e. wait for 
upstream data to be ready before starting tasks ), locality management, vertex 
parallelism. The VertexManagerPlugin is a user-provided plugin to modify the 
above behavior.

If you have read up on Tez and take a more detailed look into the VertexManager 
code, the VertexManager could be made a bit more powerful to be able to trigger 
completion of a vertex sooner without needing to run all the tasks if certain 
conditions get matched earlier. The changes will likely also encompass the 
Vertex(Impl) state machine in terms of how you treat short-circuited tasks so 
that all the correct bookkeeping is done from a state management point of view.




> Implement a Partial completion VertexManagerPlugin
> --
>
> Key: TEZ-2103
> URL: https://issues.apache.org/jira/browse/TEZ-2103
> Project: Apache Tez
>  Issue Type: New Feature
>Reporter: Gopal V
>  Labels: gsoc, gsoc2015, hadoop, java, tez
>
> Currently, there is no sibling communication between tasks - this implies 
> that a task can be completed by the first vertex in a wave of tasks, but the 
> entire wave of tasks has to complete before success can be reported.
> This occurs in limit + filter query patterns common between the data access 
> engines.
> {code}
> select * from data where x > 1 limit 10;
> {code}
> will run through a full-table scan worth of tasks to generate 10 rows per 
> task, to aggregate it to produce the final 10 row result.
> The VertexManager receives counters/events early enough to short-circuit the 
> rest of the vertex tasks, to prevent the remainder of tasks from getting 
> scheduled when the limit condition has been satisfied by an initial sub-set 
> of the tasks.
> This is a specialization of the VertexManagerPlugin for this common case 
> scheduling pattern.
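
The short-circuit described above can be illustrated with a toy model. This is
not a VertexManagerPlugin: the class LimitShortCircuitModel is hypothetical, it
runs tasks one at a time, and it assumes each task reports how many matching
rows it produced.

```java
import java.util.List;

// Hypothetical model of limit short-circuiting; not a Tez VertexManagerPlugin.
class LimitShortCircuitModel {
    // Returns how many tasks get scheduled before the limit is satisfied.
    // rowsPerTask holds the matching rows each task would find in its partition.
    static int tasksScheduled(List<Integer> rowsPerTask, int limit) {
        int rowsSoFar = 0;
        int scheduled = 0;
        for (int rows : rowsPerTask) {
            if (rowsSoFar >= limit) {
                break; // limit already satisfied, skip the remaining tasks
            }
            // Each task itself stops once it has 'limit' rows.
            rowsSoFar += Math.min(rows, limit);
            scheduled++;
        }
        return scheduled;
    }
}
```

In the 100,000-task example from the comment, a first task that returns 10 rows
for "limit 10" means only one task is scheduled in this model.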





[jira] [Commented] (TEZ-2103) Implement a Partial completion VertexManagerPlugin

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380376#comment-14380376
 ] 

Hitesh Shah commented on TEZ-2103:
--

Hi [~alokasok], sorry for the late reply. 

The use-case is somewhat along these lines.

Assume that there is a SQL query being run to do something like "select * from 
table where <filter clauses> limit 10;". This can be broken down into a single 
stage DAG where each task runs on a partition of data and completes as soon as 
it has say 10 records that match the filter clauses. 

Now, if the overall data is huge i.e. multiple TBs and needs 100,000 tasks to 
scan all the data, running 100,000 tasks should not be necessary if the first 
task to complete returned 10 results. 

VertexManagers are sort of the vertex controllers. They control scheduling of 
tasks ( when to start running tasks, whether to enable slow start i.e. wait for 
upstream data to be ready before starting tasks ), locality management, vertex 
parallelism. The VertexManagerPlugin is a user-provided plugin to modify the 
above behavior.

If you have read up on Tez and take a more detailed look into the VertexManager 
code, the VertexManager could be made a bit more powerful to be able to trigger 
completion of a vertex sooner without needing to run all the tasks if certain 
conditions get matched earlier. The changes will likely also encompass the 
Vertex(Impl) state machine in terms of how you treat short-circuited tasks so 
that all the correct bookkeeping is done from a state management point of view.




> Implement a Partial completion VertexManagerPlugin
> --
>
> Key: TEZ-2103
> URL: https://issues.apache.org/jira/browse/TEZ-2103
> Project: Apache Tez
>  Issue Type: New Feature
>Reporter: Gopal V
>  Labels: gsoc, gsoc2015, hadoop, java, tez
>
> Currently, there is no sibling communication between tasks - this implies 
> that a task can be completed by the first vertex in a wave of tasks, but the 
> entire wave of tasks has to complete before success can be reported.
> This occurs in limit + filter query patterns common between the data access 
> engines.
> {code}
> select * from data where x > 1 limit 10;
> {code}
> will run through a full-table scan worth of tasks to generate 10 rows per 
> task, to aggregate it to produce the final 10 row result.
> The VertexManager receives counters/events early enough to short-circuit the 
> rest of the vertex tasks, to prevent the remainder of tasks from getting 
> scheduled when the limit condition has been satisfied by an initial sub-set 
> of the tasks.
> This is a specialization of the VertexManagerPlugin for this common case 
> scheduling pattern.





[jira] [Comment Edited] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380270#comment-14380270
 ] 

Hitesh Shah edited comment on TEZ-2205 at 3/25/15 5:57 PM:
---

Adding a test for this would be useful for both the acl policy manager and the 
logging service. 


was (Author: hitesh):
Adding a test for this would be useful. 

> Tez still tries to post to ATS when yarn.timeline-service.enabled=false
> ---
>
> Key: TEZ-2205
> URL: https://issues.apache.org/jira/browse/TEZ-2205
> Project: Apache Tez
>  Issue Type: Sub-task
>Affects Versions: 0.6.1
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: TEZ-2205.1.patch, TEZ-2205.wip.patch
>
>
> when set yarn.timeline-service.enabled=false, Tez still tries posting to ATS, 
> but hits error as token is not found. Does not fail the job because of the 
> fix to not fail job when there is error posting to ATS. But it should not be 
> trying to post to ATS in the first place.
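
A minimal sketch of the expected gating, assuming the configuration is visible
as plain key/value strings. Only the key yarn.timeline-service.enabled comes
from this issue; the class HistoryLoggingSketch, its methods, the message text,
and the choice to treat a missing key as "false" are assumptions, and a plain
Map stands in for the Hadoop Configuration object.

```java
import java.util.Map;

// Hypothetical gating sketch; not the Tez ATS logging service or ACL manager.
class HistoryLoggingSketch {
    static final String TIMELINE_ENABLED = "yarn.timeline-service.enabled";

    // True only if the timeline service is explicitly enabled.
    // Assumption: a missing key is treated as disabled.
    static boolean shouldPostToAts(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(TIMELINE_ENABLED, "false"));
    }

    // Builds a message that names both the cause and the effect.
    static String describe(Map<String, String> conf) {
        if (shouldPostToAts(conf)) {
            return "ATS history logging enabled";
        }
        return "ATS history logging disabled because "
            + TIMELINE_ENABLED + " is not set to true";
    }
}
```

Checking the flag once, before any ATS client is created, avoids the failed
post attempts described in the issue.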





[jira] [Created] (TEZ-2230) Speculative attempt should not have the original attempts machine in its preferred locations

2015-03-25 Thread Bikas Saha (JIRA)
Bikas Saha created TEZ-2230:
---

 Summary: Speculative attempt should not have the original attempts 
machine in its preferred locations
 Key: TEZ-2230
 URL: https://issues.apache.org/jira/browse/TEZ-2230
 Project: Apache Tez
  Issue Type: Sub-task
Affects Versions: 0.6.0
Reporter: Bikas Saha
Assignee: Bikas Saha








[jira] [Commented] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380270#comment-14380270
 ] 

Hitesh Shah commented on TEZ-2205:
--

Adding a test for this would be useful. 

> Tez still tries to post to ATS when yarn.timeline-service.enabled=false
> ---
>
> Key: TEZ-2205
> URL: https://issues.apache.org/jira/browse/TEZ-2205
> Project: Apache Tez
>  Issue Type: Sub-task
>Affects Versions: 0.6.1
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: TEZ-2205.1.patch, TEZ-2205.wip.patch
>
>
> When yarn.timeline-service.enabled=false is set, Tez still tries to post to ATS 
> but hits an error because the token is not found. This does not fail the job, 
> thanks to the fix that keeps ATS posting errors from failing the job, but Tez 
> should not be trying to post to ATS in the first place.





[jira] [Commented] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380267#comment-14380267
 ] 

Hitesh Shah commented on TEZ-2205:
--

{code}
LOG.warn("Timeline service is not enabled");
{code}
  - the log should state more clearly that the ATSLogging service/acl manager is 
being disabled because the timeline service is disabled

The changes in ATSHistoryLoggingService could be more optimal. If it is 
disabled, why even bother queueing up events? 
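
As a rough illustration of the point about not queueing events when the service is disabled, a guard at the entry point could look like the sketch below. `GuardedHistoryLogger` and its methods are hypothetical names for illustration and do not reflect the actual ATSHistoryLoggingService code or the patch under review.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical logging service; not the real ATSHistoryLoggingService. */
final class GuardedHistoryLogger {
    private final boolean timelineEnabled;
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();

    GuardedHistoryLogger(boolean timelineEnabled) {
        this.timelineEnabled = timelineEnabled;
    }

    /** Returns true if the event was queued, false if it was dropped. */
    boolean handle(String event) {
        if (!timelineEnabled) {
            // Service is disabled: drop the event instead of queueing it.
            return false;
        }
        return pending.offer(event);
    }

    /** Number of events currently waiting to be posted. */
    int pendingCount() {
        return pending.size();
    }
}
```

The design choice here is to check the flag once per event before any work is done, so a disabled service never accumulates a backlog.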

> Tez still tries to post to ATS when yarn.timeline-service.enabled=false
> ---
>
> Key: TEZ-2205
> URL: https://issues.apache.org/jira/browse/TEZ-2205
> Project: Apache Tez
>  Issue Type: Sub-task
>Affects Versions: 0.6.1
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: TEZ-2205.1.patch, TEZ-2205.wip.patch
>
>
> When yarn.timeline-service.enabled=false is set, Tez still tries to post to ATS 
> but hits an error because the token is not found. This does not fail the job, 
> thanks to the fix that keeps ATS posting errors from failing the job, but Tez 
> should not be trying to post to ATS in the first place.





[jira] [Commented] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false

2015-03-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380268#comment-14380268
 ] 

Hadoop QA commented on TEZ-2205:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12707237/TEZ-2205.1.patch
  against master revision d1b4bd4.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/347//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/347//console

This message is automatically generated.

> Tez still tries to post to ATS when yarn.timeline-service.enabled=false
> ---
>
> Key: TEZ-2205
> URL: https://issues.apache.org/jira/browse/TEZ-2205
> Project: Apache Tez
>  Issue Type: Sub-task
>Affects Versions: 0.6.1
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: TEZ-2205.1.patch, TEZ-2205.wip.patch
>
>
> When yarn.timeline-service.enabled=false is set, Tez still tries to post to ATS 
> but hits an error because the token is not found. This does not fail the job, 
> thanks to the fix that keeps ATS posting errors from failing the job, but Tez 
> should not be trying to post to ATS in the first place.





Failed: TEZ-2205 PreCommit Build #347

2015-03-25 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2205
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/347/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2752 lines...]





==
==
Adding comment to Jira.
==
==


Comment added.
cb01236764c0e9affe953e1793bb1fec9ce79c15 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #343
Archived 44 artifacts
Archive block size is 32768
Received 4 blocks and 2597651 bytes
Compression is 4.8%
Took 1.3 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2214) FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380255#comment-14380255
 ] 

Hitesh Shah commented on TEZ-2214:
--

Question for patch 3 with respect to waitForInMemoryMerge(). Does this need to 
have only 2 runs of the merger? Or should this be a loop? Will there ever be a 
case where the same situation comes about when the second merge is in progress?
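
To make the "two runs versus a loop" question concrete, here is a small sketch that keeps merging until usage drops to the threshold rather than running a fixed number of passes. `MemoryBudget` and its methods are hypothetical and are not the MergeManager code in the patch; the byte counts are made up for illustration.

```java
/** Hypothetical memory accounting; not the Tez MergeManager. */
final class MemoryBudget {
    private final long threshold;
    private long used;
    private int mergesRun = 0;

    MemoryBudget(long threshold, long used) {
        this.threshold = threshold;
        this.used = used;
    }

    /** Simulates one merge pass that frees up to {@code freed} bytes. */
    synchronized void mergeOnce(long freed) {
        used = Math.max(0L, used - freed);
        mergesRun++;
        notifyAll(); // wake any thread waiting for memory
    }

    /**
     * Re-checks the condition after every pass instead of assuming that
     * a fixed number of passes is enough. Returns the number of passes run.
     */
    synchronized int mergeUntilBelowThreshold(long freedPerPass) {
        if (freedPerPass <= 0) {
            throw new IllegalArgumentException("freedPerPass must be positive");
        }
        while (used > threshold) {
            mergeOnce(freedPerPass);
        }
        return mergesRun;
    }

    synchronized long usedBytes() {
        return used;
    }
}
```

With a threshold of 100, 350 bytes in use and 100 bytes freed per pass, two passes would still leave 150 bytes in use; the loop runs a third pass and stops at 50.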

> FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses 
> memToDiskMerging
> --
>
> Key: TEZ-2214
> URL: https://issues.apache.org/jira/browse/TEZ-2214
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2214.1.patch, TEZ-2214.2.patch, TEZ-2214.3.patch
>
>
> Scenario:
> - commitMemory & usedMemory are beyond their allowed threshold.
> - InMemoryMerge kicks off and is in the process of flushing memory contents 
> to disk
> - As it progresses, it releases memory segments as well (but not yet over).
> - Fetchers who need memory < maxSingleShuffleLimit, get scheduled.
> - If fetchers are fast, this quickly adds up to commitMemory & usedMemory. 
> Since InMemoryMerge is already in progress, this wouldn't trigger another 
> merge().
> - Pretty soon all fetchers would be stalled and get into the following state.
> {noformat}
> Thread 9351: (state = BLOCKED)
>  - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
>  - java.lang.Object.wait() @bci=2, line=502 (Compiled frame)
>  - org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.waitForShuffleToMergeMemory() @bci=17, line=337 (Interpreted frame)
>  - org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run() @bci=34, line=157 (Interpreted frame)
> {noformat}
> - Even if InMemoryMerger completes, "commitedMem & usedMem" are beyond their 
> threshold and no other fetcher threads (all are in stalled state) are there 
> to release memory. This causes fetchers to wait indefinitely.





[jira] [Updated] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false

2015-03-25 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated TEZ-2205:
--
Attachment: TEZ-2205.1.patch

> Tez still tries to post to ATS when yarn.timeline-service.enabled=false
> ---
>
> Key: TEZ-2205
> URL: https://issues.apache.org/jira/browse/TEZ-2205
> Project: Apache Tez
>  Issue Type: Sub-task
>Affects Versions: 0.6.1
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: TEZ-2205.1.patch, TEZ-2205.wip.patch
>
>
> When yarn.timeline-service.enabled=false is set, Tez still tries to post to ATS 
> but hits an error because the token is not found. This does not fail the job, 
> thanks to the fix that keeps ATS posting errors from failing the job, but Tez 
> should not be trying to post to ATS in the first place.





[jira] [Commented] (TEZ-2227) Tez UI shows empty page under IE11

2015-03-25 Thread Prakash Ramachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380143#comment-14380143
 ] 

Prakash Ramachandran commented on TEZ-2227:
---

Will look into IE support issues.

> Tez UI shows empty page under IE11
> --
>
> Key: TEZ-2227
> URL: https://issues.apache.org/jira/browse/TEZ-2227
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 0.6.0
>Reporter: Fengdong Yu
>Assignee: Prakash Ramachandran
>Priority: Minor
> Attachments: IE11.PNG, chrome.PNG
>
>
> Tez UI works well under Chrome and Firefox, but shows an empty page under IE11.





[jira] [Assigned] (TEZ-2227) Tez UI shows empty page under IE11

2015-03-25 Thread Prakash Ramachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prakash Ramachandran reassigned TEZ-2227:
-

Assignee: Prakash Ramachandran

> Tez UI shows empty page under IE11
> --
>
> Key: TEZ-2227
> URL: https://issues.apache.org/jira/browse/TEZ-2227
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 0.6.0
>Reporter: Fengdong Yu
>Assignee: Prakash Ramachandran
>Priority: Minor
> Attachments: IE11.PNG, chrome.PNG
>
>
> Tez UI works well under Chrome and Firefox, but shows an empty page under IE11.





[jira] [Commented] (TEZ-2229) bower ESUDO Cannot be run with sudo -- during build

2015-03-25 Thread Prakash Ramachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380141#comment-14380141
 ] 

Prakash Ramachandran commented on TEZ-2229:
---

The allow-root option was removed because running bower as root is generally not 
recommended, as the error message itself says (mixing sudo runs with non-root 
runs often causes permission issues). 

It would be required if the build is done as root. 

> bower ESUDO Cannot be run with sudo -- during build
> ---
>
> Key: TEZ-2229
> URL: https://issues.apache.org/jira/browse/TEZ-2229
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Linux x86_64 
>Reporter: Fengdong Yu
>
> I built Tez as root, and had never installed node/npm locally before the build.
> Exception messages then appeared while building the tez-ui module. Maven debug 
> logs:
> {code}
> [DEBUG] env: SSH_TTY=/dev/pts/0
> [DEBUG] env: TERM=xterm
> [DEBUG] env: USER=root
> [DEBUG] env: XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt
> [DEBUG] Toolchains are ignored, 'executable' parameter is set to 
> /root/temp/apache-tez-0.6.0-src/tez-ui/src/main/webapp/node/node
> [DEBUG] Executing command line: 
> [/root/temp/apache-tez-0.6.0-src/tez-ui/src/main/webapp/node/node, 
> node_modules/bower/bin/bower, install, --remove-unnecessary-resolutions=false]
> bower ESUDO Cannot be run with sudo
> Additional error details:
> Since bower is a user command, there is no need to execute it with superuser 
> permissions.
> If you're having permission errors when using bower without sudo, please 
> spend a few minutes learning more about how your system should work and make 
> any necessary repairs.
> http://www.joyent.com/blog/installing-node-and-npm
> https://gist.github.com/isaacs/579814
> You can however run a command with sudo using --allow-root option
> {code}
> {code}
> [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.2:exec 
> (Bower install) on project tez-ui: Command execution failed. Process exited 
> with an error: 1 (Exit value: 1) -> [
> Help 1]org.apache.maven.lifecycle.LifecycleExecutionException: Failed to 
> execute goal org.codehaus.mojo:exec-maven-plugin:1.3.2:exec (Bower install) 
> on project tez-ui: Command execution failed.
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:216)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
>   at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)
>   at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155)
>   at org.apache.maven.cli.MavenCli.execute(MavenCli.java:584)
>   at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216)
>   at org.apache.maven.cli.MavenCli.main(MavenCli.java:160)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoExecutionException: Command execution 
> failed.
>   at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:303)
>   at 
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
>   ... 19 more
> Caused by: org.apache.commons.exec.ExecuteException: Process exited with an 
> error: 1 (Exit value: 1)
>   at 
> org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:402)
>   at 
> org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:164)
>   at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:746)
>   a

[jira] [Commented] (TEZ-2047) Build fails against hadoop-2.2 post TEZ-2018

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380052#comment-14380052
 ] 

Hitesh Shah commented on TEZ-2047:
--

Updated missing fix version.

> Build fails against hadoop-2.2 post TEZ-2018
> 
>
> Key: TEZ-2047
> URL: https://issues.apache.org/jira/browse/TEZ-2047
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Prakash Ramachandran
>Priority: Blocker
> Fix For: 0.6.1
>
> Attachments: TEZ-2047.1.patch, TEZ-2047.1.patch, TEZ-2047.2.patch
>
>
> Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project tez-dag: Compilation failure: Compilation failure:
> [ERROR] /home/jenkins/jenkins-slave/workspace/Tez-Build-Hadoop-2.2/tez-dag/src/main/java/org/apache/tez/dag/app/web/WebUIService.java:[85,13] cannot find symbol
> [ERROR] symbol  : method withHttpPolicy(org.apache.hadoop.conf.Configuration,org.apache.hadoop.http.HttpConfig.Policy)
> [ERROR] location: class org.apache.hadoop.yarn.webapp.WebApps.Builder
> [ERROR] /home/jenkins/jenkins-slave/workspace/Tez-Build-Hadoop-2.2/tez-dag/src/main/java/org/apache/tez/dag/app/web/WebUIService.java:[87,45] cannot find symbol
> [ERROR] symbol  : method getConnectorAddress(int)
> [ERROR] location: class org.apache.hadoop.http.HttpServer





[jira] [Updated] (TEZ-2047) Build fails against hadoop-2.2 post TEZ-2018

2015-03-25 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2047:
-
Fix Version/s: 0.6.1

> Build fails against hadoop-2.2 post TEZ-2018
> 
>
> Key: TEZ-2047
> URL: https://issues.apache.org/jira/browse/TEZ-2047
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Prakash Ramachandran
>Priority: Blocker
> Fix For: 0.6.1
>
> Attachments: TEZ-2047.1.patch, TEZ-2047.1.patch, TEZ-2047.2.patch
>
>
> Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project tez-dag: Compilation failure: Compilation failure:
> [ERROR] /home/jenkins/jenkins-slave/workspace/Tez-Build-Hadoop-2.2/tez-dag/src/main/java/org/apache/tez/dag/app/web/WebUIService.java:[85,13] cannot find symbol
> [ERROR] symbol  : method withHttpPolicy(org.apache.hadoop.conf.Configuration,org.apache.hadoop.http.HttpConfig.Policy)
> [ERROR] location: class org.apache.hadoop.yarn.webapp.WebApps.Builder
> [ERROR] /home/jenkins/jenkins-slave/workspace/Tez-Build-Hadoop-2.2/tez-dag/src/main/java/org/apache/tez/dag/app/web/WebUIService.java:[87,45] cannot find symbol
> [ERROR] symbol  : method getConnectorAddress(int)
> [ERROR] location: class org.apache.hadoop.http.HttpServer





[jira] [Commented] (TEZ-2227) Tez UI shows empty page under IE11

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380050#comment-14380050
 ] 

Hitesh Shah commented on TEZ-2227:
--

\cc [~Sreenath] [~pramachandran] in case they have seen this before

> Tez UI shows empty page under IE11
> --
>
> Key: TEZ-2227
> URL: https://issues.apache.org/jira/browse/TEZ-2227
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 0.6.0
>Reporter: Fengdong Yu
>Priority: Minor
> Attachments: IE11.PNG, chrome.PNG
>
>
> Tez UI works well under Chrome and Firefox, but shows an empty page under IE11.





[jira] [Commented] (TEZ-2229) bower ESUDO Cannot be run with sudo -- during build

2015-03-25 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380047#comment-14380047
 ] 

Hitesh Shah commented on TEZ-2229:
--

Adding a link to TEZ-1838 where allow-root was removed.

\cc [~pramachandran] [~Sreenath] any comments on this?

> bower ESUDO Cannot be run with sudo -- during build
> ---
>
> Key: TEZ-2229
> URL: https://issues.apache.org/jira/browse/TEZ-2229
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Linux x86_64 
>Reporter: Fengdong Yu
>
> I built Tez as root, and had never installed node/npm locally before the build.
> Exception messages then appeared while building the tez-ui module. Maven debug 
> logs:
> {code}
> [DEBUG] env: SSH_TTY=/dev/pts/0
> [DEBUG] env: TERM=xterm
> [DEBUG] env: USER=root
> [DEBUG] env: XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt
> [DEBUG] Toolchains are ignored, 'executable' parameter is set to 
> /root/temp/apache-tez-0.6.0-src/tez-ui/src/main/webapp/node/node
> [DEBUG] Executing command line: 
> [/root/temp/apache-tez-0.6.0-src/tez-ui/src/main/webapp/node/node, 
> node_modules/bower/bin/bower, install, --remove-unnecessary-resolutions=false]
> bower ESUDO Cannot be run with sudo
> Additional error details:
> Since bower is a user command, there is no need to execute it with superuser 
> permissions.
> If you're having permission errors when using bower without sudo, please 
> spend a few minutes learning more about how your system should work and make 
> any necessary repairs.
> http://www.joyent.com/blog/installing-node-and-npm
> https://gist.github.com/isaacs/579814
> You can however run a command with sudo using --allow-root option
> {code}
> {code}
> [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.2:exec 
> (Bower install) on project tez-ui: Command execution failed. Process exited 
> with an error: 1 (Exit value: 1) -> [
> Help 1]org.apache.maven.lifecycle.LifecycleExecutionException: Failed to 
> execute goal org.codehaus.mojo:exec-maven-plugin:1.3.2:exec (Bower install) 
> on project tez-ui: Command execution failed.
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:216)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
>   at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
>   at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)
>   at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)
>   at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155)
>   at org.apache.maven.cli.MavenCli.execute(MavenCli.java:584)
>   at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216)
>   at org.apache.maven.cli.MavenCli.main(MavenCli.java:160)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
>   at 
> org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoExecutionException: Command execution 
> failed.
>   at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:303)
>   at 
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
>   at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
>   ... 19 more
> Caused by: org.apache.commons.exec.ExecuteException: Process exited with an 
> error: 1 (Exit value: 1)
>   at 
> org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:402)
>   at 
> org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:164)
>   at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:746)
>   at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:292)
>   ... 21 more
> [ERROR] 
> [ERROR] 
> [ERROR] For more information abo

[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

2015-03-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380043#comment-14380043
 ] 

Hadoop QA commented on TEZ-714:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12707217/TEZ-714-4.patch
  against master revision d1b4bd4.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.dag.app.dag.impl.TestCommit

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/346//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/346//console

This message is automatically generated.

> OutputCommitters should not run in the main AM dispatcher thread
> 
>
> Key: TEZ-714
> URL: https://issues.apache.org/jira/browse/TEZ-714
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
> TEZ-714-3.patch, TEZ-714-4.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.
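
One way to picture points 1 and 2 is the sketch below, which hands each commit to a thread pool and returns a future so the calling (dispatcher) thread is not blocked. It assumes Java 8 or later for `CompletableFuture`; `ParallelCommitSketch` and `commitAllAsync` are hypothetical names, each committer is modelled as a plain `Runnable`, and none of this reflects the actual TEZ-714 patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

/** Hypothetical helper; committers are modelled as plain Runnables. */
final class ParallelCommitSketch {
    private ParallelCommitSketch() {
    }

    /**
     * Submits every commit to the pool and returns immediately. The future
     * completes with the number of commits that finished without throwing.
     */
    static CompletableFuture<Integer> commitAllAsync(
            List<Runnable> committers, ExecutorService pool) {
        List<CompletableFuture<Boolean>> results = new ArrayList<>();
        for (Runnable committer : committers) {
            results.add(CompletableFuture.runAsync(committer, pool)
                    // Map success to true and any failure to false.
                    .handle((ignored, error) -> error == null));
        }
        return CompletableFuture
                .allOf(results.toArray(new CompletableFuture[0]))
                .thenApply(done -> {
                    int succeeded = 0;
                    for (CompletableFuture<Boolean> result : results) {
                        if (result.join()) {
                            succeeded++;
                        }
                    }
                    return succeeded;
                });
    }
}
```

A caller on the dispatcher thread would attach a callback to the returned future (for example, to enqueue a "commit finished" event) instead of waiting on it, so other events keep flowing while the commits run.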





Failed: TEZ-714 PreCommit Build #346

2015-03-25 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-714
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/346/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2381 lines...]






==
==
Adding comment to Jira.
==
==


Comment added.
d097ae729e94fba50749fa203ae0f8b6d76f0a90 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #343
Archived 44 artifacts
Archive block size is 32768
Received 2 blocks and 2638094 bytes
Compression is 2.4%
Took 1 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
1 tests failed.
REGRESSION:  
org.apache.tez.dag.app.dag.impl.TestCommit.testVertexGroupCommitFinishedEventFail

Error Message:
expected:<0> but was:<1>

Stack Trace:
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.tez.dag.app.dag.impl.TestCommit.testVertexGroupCommitFinishedEventFail(TestCommit.java:1194)




[jira] [Comment Edited] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

2015-03-25 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379904#comment-14379904
 ] 

Jeff Zhang edited comment on TEZ-714 at 3/25/15 3:04 PM:
-

Uploaded a new patch to fix the test failure.


was (Author: zjffdu):
test failed, will check it

> OutputCommitters should not run in the main AM dispatcher thread
> 
>
> Key: TEZ-714
> URL: https://issues.apache.org/jira/browse/TEZ-714
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
> TEZ-714-3.patch, TEZ-714-4.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.





[jira] [Updated] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

2015-03-25 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-714:
---
Attachment: TEZ-714-4.patch

> OutputCommitters should not run in the main AM dispatcher thread
> 
>
> Key: TEZ-714
> URL: https://issues.apache.org/jira/browse/TEZ-714
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
> TEZ-714-3.patch, TEZ-714-4.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.





[jira] [Commented] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false

2015-03-25 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379951#comment-14379951
 ] 

Chang Li commented on TEZ-2205:
---

Thanks a lot for the patient discussion and explanations, [~hitesh], [~jeagles], 
and [~zjshen]! I understand the implementation requirement now and will provide 
a patch soon.

> Tez still tries to post to ATS when yarn.timeline-service.enabled=false
> ---
>
> Key: TEZ-2205
> URL: https://issues.apache.org/jira/browse/TEZ-2205
> Project: Apache Tez
>  Issue Type: Sub-task
>Affects Versions: 0.6.1
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: TEZ-2205.wip.patch
>
>
> When yarn.timeline-service.enabled=false is set, Tez still tries to post to 
> ATS, but hits an error because the token is not found. The job does not fail, 
> because of the fix that keeps a job from failing when posting to ATS errors 
> out. But Tez should not be trying to post to ATS in the first place.
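
The behaviour requested above amounts to checking the setting before any post is attempted. A minimal sketch, assuming a plain Map in place of a Hadoop Configuration; TimelineGuardSketch and shouldPostToTimeline are hypothetical names, and treating a missing key as false is an assumption of this sketch rather than something stated in the issue:

```java
import java.util.HashMap;
import java.util.Map;

class TimelineGuardSketch {
    // The property name comes from the issue title.
    static final String TIMELINE_ENABLED_KEY = "yarn.timeline-service.enabled";

    // Hypothetical helper: conf is a plain map standing in for the job configuration.
    // Returns true only when the timeline service is explicitly enabled.
    static boolean shouldPostToTimeline(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(TIMELINE_ENABLED_KEY, "false"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TIMELINE_ENABLED_KEY, "false");
        if (shouldPostToTimeline(conf)) {
            System.out.println("posting history events to ATS");
        } else {
            System.out.println("timeline service disabled, skipping ATS");
        }
    }
}
```

Placing the check in front of the posting path means no token lookup happens when the service is off, which is the error the reporter describes.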





[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

2015-03-25 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379904#comment-14379904
 ] 

Jeff Zhang commented on TEZ-714:


Test failed; will check it.

> OutputCommitters should not run in the main AM dispatcher thread
> 
>
> Key: TEZ-714
> URL: https://issues.apache.org/jira/browse/TEZ-714
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
> TEZ-714-3.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.





[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

2015-03-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379901#comment-14379901
 ] 

Hadoop QA commented on TEZ-714:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12707199/TEZ-714-3.patch
  against master revision 60ddcba.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestTezJobs
  org.apache.tez.mapreduce.TestMRRJobsDAGApi

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/345//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/345//console

This message is automatically generated.

> OutputCommitters should not run in the main AM dispatcher thread
> 
>
> Key: TEZ-714
> URL: https://issues.apache.org/jira/browse/TEZ-714
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
> TEZ-714-3.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.





Failed: TEZ-714 PreCommit Build #345

2015-03-25 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-714
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/345/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2532 lines...]



{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12707199/TEZ-714-3.patch
  against master revision 60ddcba.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestTezJobs
  org.apache.tez.mapreduce.TestMRRJobsDAGApi

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/345//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/345//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
ebe71b2173110526859381e9bdcee0ec01d52f37 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #343
Archived 44 artifacts
Archive block size is 32768
Received 4 blocks and 2585916 bytes
Compression is 4.8%
Took 9.6 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
2 tests failed.
REGRESSION:  org.apache.tez.mapreduce.TestMRRJobsDAGApi.testVertexGroups

Error Message:
Unsupported value for DAGStatus.State : DAG_COMMITTING

Stack Trace:
org.apache.tez.dag.api.TezUncheckedException: Unsupported value for DAGStatus.State : DAG_COMMITTING
at org.apache.tez.dag.api.client.DAGStatus.getState(DAGStatus.java:83)
at org.apache.tez.dag.api.client.DAGStatus.isCompleted(DAGStatus.java:89)
at org.apache.tez.dag.api.client.DAGClientImpl._waitForCompletionWithStatusUpdates(DAGClientImpl.java:436)
at org.apache.tez.dag.api.client.DAGClientImpl.waitForCompletionWithStatusUpdates(DAGClientImpl.java:298)
at org.apache.tez.mapreduce.examples.UnionExample.run(UnionExample.java:280)
at org.apache.tez.mapreduce.TestMRRJobsDAGApi.testVertexGroups(TestMRRJobsDAGApi.java:850)


REGRESSION:  org.apache.tez.test.TestTezJobs.testHashJoinExamplePipeline

Error Message:
Unsupported value for DAGStatus.State : DAG_COMMITTING

Stack Trace:
org.apache.tez.dag.api.TezUncheckedException: Unsupported value for DAGStatus.State : DAG_COMMITTING
at org.apache.tez.dag.api.client.DAGStatus.getState(DAGStatus.java:83)
at org.apache.tez.dag.api.client.DAGStatus.isCompleted(DAGStatus.java:89)
at org.apache.tez.dag.api.client.DAGClientImpl._waitForCompletionWithStatusUpdates(DAGClientImpl.java:436)
at org.apache.tez.dag.api.client.DAGClientImpl.waitForCompletionWithStatusUpdates(DAGClientImpl.java:298)
at org.apache.tez.examples.TezExampleBase.runDag(TezExampleBase.java:134)
at org.apache.tez.examples.HashJoinExample.runJob(HashJoinExample.java:130)
at org.apache.tez.examples.TezExampleBase._execute(TezExampleBase.java:179)
at org.apache.tez.examples.TezExampleBase.run(TezExampleBase.java:112)
at org.apache.tez.test.TestTezJobs.testHashJoinExamplePipeline(TestTezJobs.java:404)
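
Both failures above report the same error from DAGStatus.getState: a state value, DAG_COMMITTING, that the client-side mapping does not recognise. The sketch below reproduces that general pattern with hypothetical enums (WireState, ClientState) and is not the Tez source. Mapping the new state to RUNNING in mapNew is only one possible repair and an assumption here; the thread does not say how the later patch handled it.

```java
class StateMappingSketch {
    // Hypothetical wire-level states; DAG_COMMITTING stands for a newly added value.
    enum WireState { DAG_RUNNING, DAG_COMMITTING, DAG_SUCCEEDED }

    // Hypothetical client-facing states.
    enum ClientState { RUNNING, SUCCEEDED }

    // Mapping written before DAG_COMMITTING existed: the new value falls
    // through to the default branch and surfaces as an exception.
    static ClientState mapOld(WireState state) {
        switch (state) {
            case DAG_RUNNING:
                return ClientState.RUNNING;
            case DAG_SUCCEEDED:
                return ClientState.SUCCEEDED;
            default:
                throw new IllegalStateException(
                    "Unsupported value for DAGStatus.State : " + state);
        }
    }

    // One possible repair (an assumption): report a committing DAG as still running.
    static ClientState mapNew(WireState state) {
        switch (state) {
            case DAG_RUNNING:
            case DAG_COMMITTING:
                return ClientState.RUNNING;
            case DAG_SUCCEEDED:
                return ClientState.SUCCEEDED;
            default:
                throw new IllegalStateException(
                    "Unsupported value for DAGStatus.State : " + state);
        }
    }
}
```

Any client that polls for completion through such a mapping fails as soon as the server reports the new state, which matches the waitForCompletionWithStatusUpdates frames in both traces.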




[jira] [Resolved] (TEZ-2047) Build fails against hadoop-2.2 post TEZ-2018

2015-03-25 Thread Prakash Ramachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prakash Ramachandran resolved TEZ-2047.
---
Resolution: Fixed

Thanks, Hitesh. Committed to master and branch-0.6.

> Build fails against hadoop-2.2 post TEZ-2018
> 
>
> Key: TEZ-2047
> URL: https://issues.apache.org/jira/browse/TEZ-2047
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Prakash Ramachandran
>Priority: Blocker
> Attachments: TEZ-2047.1.patch, TEZ-2047.1.patch, TEZ-2047.2.patch
>
>
> Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project tez-dag: Compilation failure: Compilation failure:
> [ERROR] 
> /home/jenkins/jenkins-slave/workspace/Tez-Build-Hadoop-2.2/tez-dag/src/main/java/org/apache/tez/dag/app/web/WebUIService.java:[85,13]
>  cannot find symbol
> [ERROR] symbol  : method 
> withHttpPolicy(org.apache.hadoop.conf.Configuration,org.apache.hadoop.http.HttpConfig.Policy)
> [ERROR] location: class 
> org.apache.hadoop.yarn.webapp.WebApps.Builder
> [ERROR] 
> /home/jenkins/jenkins-slave/workspace/Tez-Build-Hadoop-2.2/tez-dag/src/main/java/org/apache/tez/dag/app/web/WebUIService.java:[87,45]
>  cannot find symbol
> [ERROR] symbol  : method getConnectorAddress(int)
> [ERROR] location: class org.apache.hadoop.http.HttpServer





[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

2015-03-25 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379887#comment-14379887
 ] 

Jeff Zhang commented on TEZ-714:


[~bikassaha] Thanks for the suggestion. Uploaded a patch (a unit test is 
included, but the e2e test in MockDAGAppMaster has not been implemented yet).

> OutputCommitters should not run in the main AM dispatcher thread
> 
>
> Key: TEZ-714
> URL: https://issues.apache.org/jira/browse/TEZ-714
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
> TEZ-714-3.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.





[jira] [Updated] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

2015-03-25 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-714:
---
Attachment: TEZ-714-3.patch

> OutputCommitters should not run in the main AM dispatcher thread
> 
>
> Key: TEZ-714
> URL: https://issues.apache.org/jira/browse/TEZ-714
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
> TEZ-714-3.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.


