[jira] [Commented] (TEZ-1560) Invalid state machine transition in recovery

2015-04-03 Thread Carter Shanklin (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395421#comment-14395421
 ] 

Carter Shanklin commented on TEZ-1560:
--

Here's what I did:

* Used the Hortonworks Sandbox based on HDP 2.2.3.
* Kicked off a Hive query and waited for some mappers to start running.
* Ran "tc qdisc add dev lo root netem loss 66%". This causes 66% packet loss on 
loopback, so we can expect a lot of strange failures to start happening.
* Waited about 5 minutes.
* Ran "tc qdisc delete dev lo root netem loss 66%", so there was no more packet 
loss.
* After about a minute the job failed with the error below:

{code}
Status: Failed
Invalid event V_INTERNAL_ERROR on Vertex vertex_1427920581283_0018_12_01
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
{code}
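
The steps above could be scripted roughly as follows. This is only a dry-run sketch: the class and method names are hypothetical, the two tc command lines are copied from the comment, and nothing is executed (actually running tc needs root on the sandbox).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the tc commands from the steps above without running them.
class LoopbackLossPlan {
    // Command that enables packet loss on the loopback device.
    static List<String> enableLoss(int percent) {
        return Arrays.asList("tc", "qdisc", "add", "dev", "lo", "root", "netem", "loss", percent + "%");
    }

    // Command that removes the netem rule again, as written in the comment.
    static List<String> disableLoss(int percent) {
        return Arrays.asList("tc", "qdisc", "delete", "dev", "lo", "root", "netem", "loss", percent + "%");
    }

    public static void main(String[] args) {
        // Print the plan; to really run it, each list could be handed to ProcessBuilder as root.
        System.out.println(String.join(" ", enableLoss(66)));
        System.out.println("(wait about 5 minutes while the Hive query runs)");
        System.out.println(String.join(" ", disableLoss(66)));
    }
}
```

Keeping the commands as argument lists rather than one shell string avoids quoting problems if they are later passed to a process launcher.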

> Invalid state machine transition in recovery
> 
>
> Key: TEZ-1560
> URL: https://issues.apache.org/jira/browse/TEZ-1560
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: failed_tez_job.txt.gz
>
>
> {code}
> 2014-09-04 16:08:25,504 INFO [main] org.apache.tez.dag.app.dag.impl.DAGImpl: 
> dag_1409818083015_0001_1 transitioned from NEW to RUNNING
> 2014-09-04 16:08:25,504 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Recovered Vertex State, 
> vertexId=vertex_1409818083015_0001_1_00 [v1], state=NEW, 
> numInitedSourceVertices=0, numStartedSourceVertices=0, 
> numRecoveredSourceVertices=0, recoveredEvents=0, tasksIsNull=false, numTasks=0
> 2014-09-04 16:08:25,505 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Root Inputs exist for Vertex: v1 
> : {Input={InputName=Input}, 
> {Descriptor=ClassName=org.apache.tez.test.dag.MultiAttemptDAG$NoOpInput, 
> hasPayload=false}, 
> {ControllerDescriptor=ClassName=org.apache.tez.test.dag.MultiAttemptDAG$TestRootInputInitializer,
>  hasPayload=false}}
> 2014-09-04 16:08:25,505 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Starting root input initializer 
> for input: Input, with class: 
> [org.apache.tez.test.dag.MultiAttemptDAG$TestRootInputInitializer]
> 2014-09-04 16:08:25,506 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Setting user vertex manager 
> plugin: 
> org.apache.tez.test.dag.MultiAttemptDAG$FailOnAttemptVertexManagerPlugin on 
> vertex: v1
> 2014-09-04 16:08:25,508 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Creating 2 for vertex: 
> vertex_1409818083015_0001_1_00 [v1]
> 2014-09-04 16:08:25,518 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Starting root input initializers: 
> 1
> 2014-09-04 16:08:25,520 INFO [InputInitializer [v1] #0] 
> org.apache.tez.dag.app.dag.RootInputInitializerManager: Starting 
> InputInitializer for Input: Input on vertex vertex_1409818083015_0001_1_00 
> [v1]
> 2014-09-04 16:08:25,522 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.RootInputInitializerManager: Succeeded 
> InputInitializer for Input: Input on vertex vertex_1409818083015_0001_1_00 
> [v1]
> 2014-09-04 16:08:25,523 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: vertex_1409818083015_0001_1_00 
> [v1] transitioned from NEW to INITIALIZING due to event V_INIT
> 2014-09-04 16:08:25,523 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Recovered Vertex State, 
> vertexId=vertex_1409818083015_0001_1_01 [v2], state=NEW, 
> numInitedSourceVertices0, numStartedSourceVertices=0, 
> numRecoveredSourceVertices=1, tasksIsNull=false, numTasks=0
> 2014-09-04 16:08:25,523 ERROR [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Can't handle Invalid event 
> V_SOURCE_VERTEX_RECOVERED on vertex v2 with vertexId 
> vertex_1409818083015_0001_1_01 at current state NEW
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> V_SOURCE_VERTEX_RECOVERED at NEW
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:388)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1344)
>   at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1)
>   at 
> org.apache.tez.dag.app.DAGAppMaster$Vert

[jira] [Updated] (TEZ-1560) Invalid state machine transition in recovery

2015-04-03 Thread Carter Shanklin (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carter Shanklin updated TEZ-1560:
-
Attachment: failed_tez_job.txt.gz

Logs of the failed job.

> Invalid state machine transition in recovery
> 
>
> Key: TEZ-1560
> URL: https://issues.apache.org/jira/browse/TEZ-1560
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>Priority: Critical
> Attachments: failed_tez_job.txt.gz
>
>
> {code}
> 2014-09-04 16:08:25,504 INFO [main] org.apache.tez.dag.app.dag.impl.DAGImpl: 
> dag_1409818083015_0001_1 transitioned from NEW to RUNNING
> 2014-09-04 16:08:25,504 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Recovered Vertex State, 
> vertexId=vertex_1409818083015_0001_1_00 [v1], state=NEW, 
> numInitedSourceVertices=0, numStartedSourceVertices=0, 
> numRecoveredSourceVertices=0, recoveredEvents=0, tasksIsNull=false, numTasks=0
> 2014-09-04 16:08:25,505 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Root Inputs exist for Vertex: v1 
> : {Input={InputName=Input}, 
> {Descriptor=ClassName=org.apache.tez.test.dag.MultiAttemptDAG$NoOpInput, 
> hasPayload=false}, 
> {ControllerDescriptor=ClassName=org.apache.tez.test.dag.MultiAttemptDAG$TestRootInputInitializer,
>  hasPayload=false}}
> 2014-09-04 16:08:25,505 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Starting root input initializer 
> for input: Input, with class: 
> [org.apache.tez.test.dag.MultiAttemptDAG$TestRootInputInitializer]
> 2014-09-04 16:08:25,506 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Setting user vertex manager 
> plugin: 
> org.apache.tez.test.dag.MultiAttemptDAG$FailOnAttemptVertexManagerPlugin on 
> vertex: v1
> 2014-09-04 16:08:25,508 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Creating 2 for vertex: 
> vertex_1409818083015_0001_1_00 [v1]
> 2014-09-04 16:08:25,518 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Starting root input initializers: 
> 1
> 2014-09-04 16:08:25,520 INFO [InputInitializer [v1] #0] 
> org.apache.tez.dag.app.dag.RootInputInitializerManager: Starting 
> InputInitializer for Input: Input on vertex vertex_1409818083015_0001_1_00 
> [v1]
> 2014-09-04 16:08:25,522 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.RootInputInitializerManager: Succeeded 
> InputInitializer for Input: Input on vertex vertex_1409818083015_0001_1_00 
> [v1]
> 2014-09-04 16:08:25,523 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: vertex_1409818083015_0001_1_00 
> [v1] transitioned from NEW to INITIALIZING due to event V_INIT
> 2014-09-04 16:08:25,523 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Recovered Vertex State, 
> vertexId=vertex_1409818083015_0001_1_01 [v2], state=NEW, 
> numInitedSourceVertices0, numStartedSourceVertices=0, 
> numRecoveredSourceVertices=1, tasksIsNull=false, numTasks=0
> 2014-09-04 16:08:25,523 ERROR [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.VertexImpl: Can't handle Invalid event 
> V_SOURCE_VERTEX_RECOVERED on vertex v2 with vertexId 
> vertex_1409818083015_0001_1_01 at current state NEW
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> V_SOURCE_VERTEX_RECOVERED at NEW
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:388)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1344)
>   at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1)
>   at 
> org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1641)
>   at 
> org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>   at java.lang.Thread.run(Thread.java:745)
> 2014-09-04 16:08:25,524 FATAL [AsyncDispatcher event handler] 
> org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> {code}
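
The failure in the log above is an event (V_SOURCE_VERTEX_RECOVERED) arriving at a vertex whose current state (NEW) has no transition registered for it. A minimal sketch of that mechanism, assuming a simple table-driven state machine: this is a toy, not the YARN StateMachineFactory API, and the class name and the use of IllegalStateException are illustrative choices. Only the NEW to INITIALIZING arc on V_INIT is taken from the log.

```java
import java.util.HashMap;
import java.util.Map;

// Toy table-driven state machine; NOT the YARN StateMachineFactory API.
class ToyVertexStateMachine {
    enum State { NEW, INITIALIZING }
    enum Event { V_INIT, V_SOURCE_VERTEX_RECOVERED }

    // Registered arcs: (current state, event) -> next state.
    private final Map<String, State> arcs = new HashMap<>();
    private State current = State.NEW;

    ToyVertexStateMachine() {
        // The only arc seen in the log above: NEW --V_INIT--> INITIALIZING.
        arcs.put(key(State.NEW, Event.V_INIT), State.INITIALIZING);
    }

    private static String key(State s, Event e) {
        return s + "/" + e;
    }

    State getState() {
        return current;
    }

    // Applies an event, or throws if no arc is registered for the current state.
    State handle(Event event) {
        State next = arcs.get(key(current, event));
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        current = next;
        return current;
    }
}
```

With this table, handle(V_INIT) moves a fresh vertex to INITIALIZING, while handle(V_SOURCE_VERTEX_RECOVERED) on a fresh vertex throws, which mirrors the situation of v2 in the log.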



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-1560) Invalid state machine transition in recovery

2015-04-03 Thread Carter Shanklin (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395398#comment-14395398
 ] 

Carter Shanklin commented on TEZ-1560:
--

I hit this too while simulating a network failure on Tez 0.5.2. Ping me offline 
if you want more details.



[jira] [Commented] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-03 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395285#comment-14395285
 ] 

Siddharth Seth commented on TEZ-2237:
-

I've attached two more patches. The first, noopexample_2237, essentially 
reproduces this hang:
* Run with "hadoop tez-examples.jar noop 3 3 ScatterGather 1 false": doesn't 
start one of the outputs.
* Run with "hadoop tez-examples.jar noop 3 3 ScatterGather 1 true": starts all 
outputs.

The updated patch (TEZ-2237.test*) is just a minor modification of the original 
patch with some additional logging; it fixes the issue, at least for the 
example.

[~cwensel] - From looking at the logs, Cascading is not starting both outputs.
As an example, look for "syslog_attempt_142732418_1908_1_51_33_0" in 
the logs posted by Cyrille.
This attempt has two outputs, but the log shows only one output being started 
(as logged by Cascading; I believe Cascading always logs when it starts an 
output):
{code}
2015-03-31 12:27:37,730 INFO [TezChild] element.TezGroupGate: calling 
OrderedPartitionedKVOutput#start() on: 
GroupBy(_pipe_332+_pipe_333)[by:[{1}:'key']] DEF94DA9BECF4A5BA6C85388B1EAAD41
{code}

This is vertex AF538C3C515642AD98D7283120D61548, which has two 
OrderedPartitionedKVOutputs and two UnorderedKVInputs. One of the outputs not 
being started, combined with Tez generating an empty event list, causes the 
next vertex to hang (it reads the output via OrderedGroupedKVInput, assuming a 
ScatterGather edge).

Note: this is all from application_142732418_1908.red.txt.
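
The failure mode described above can be modeled in a few lines. This is a toy sketch under stated assumptions, not the Tez runtime API: ToyOutput, its event string, and findUnstarted are all hypothetical names, and the rule that an unstarted output yields an empty event list is simply taken from the description in this comment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model of a task output; NOT the Tez LogicalOutput API.
class ToyOutput {
    private final String name;
    private boolean started;

    ToyOutput(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void start() {
        started = true;
    }

    // A started output reports one event on close; an unstarted one reports none.
    List<String> close() {
        if (!started) {
            return Collections.emptyList();
        }
        return Collections.singletonList("data-available:" + name);
    }

    // Returns the names of outputs that were never started.
    static List<String> findUnstarted(List<ToyOutput> outputs) {
        List<String> missing = new ArrayList<>();
        for (ToyOutput output : outputs) {
            if (!output.started) {
                missing.add(output.name);
            }
        }
        return missing;
    }
}
```

A consumer waiting for an event from every output would never hear from the unstarted one, so a check like findUnstarted before closing makes the problem visible instead of leaving the downstream vertex waiting.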


> Complex DAG freezes and fails (was BufferTooSmallException raised in 
> UnorderedPartitionedKVWriter then DAG lingers)
> ---
>
> Key: TEZ-2237
> URL: https://issues.apache.org/jira/browse/TEZ-2237
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Debian Linux "jessie"
> OpenJDK Runtime Environment (build 1.8.0_40-internal-b27)
> OpenJDK 64-Bit Server VM (build 25.40-b25, mixed mode)
> 7 * Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 16/24 GB RAM per node, 1*system 
> disk + 4*1 or 2 TiB HDD for HDFS & local  (on-prem, dedicated hardware)
> Scalding 0.13.1 modified with https://github.com/twitter/scalding/pull/1220 
> to run Cascading 3.0.0-wip-90 with TEZ 0.6.0
>Reporter: Cyrille Chépélov
> Attachments: TEZ-2237-hack.branch6.txt, TEZ-2237-hack.master.txt, 
> TEZ-2237.test.2_branch0.6.txt, all_stacks.lst, alloc_mem.png, 
> alloc_vcores.png, application_142732418_1444.yarn-logs.red.txt.gz, 
> application_142732418_1908.red.txt.bz2, 
> appmastersyslog_dag_1427282048097_0215_1.red.txt.gz, 
> appmastersyslog_dag_1427282048097_0237_1.red.txt.gz, 
> gc_count_MRAppMaster.png, mem_free.png, noopexample_2237.txt, 
> ordered-grouped-kv-input-traces.diff, start_containers.png, 
> stop_containers.png, 
> syslog_attempt_1427282048097_0215_1_21_14_0.red.txt.gz, 
> syslog_attempt_1427282048097_0237_1_70_28_0.red.txt.gz, yarn_rm_flips.png
>
>
> On a specific DAG with many vertices (actually part of a larger meta-DAG), 
> after about an hour of processing, several BufferTooSmallExceptions are raised 
> in UnorderedPartitionedKVWriter (about one every two or three spills).
> Once these exceptions are raised, the DAG remains indefinitely "active", 
> tying up memory and CPU resources as far as YARN is concerned, while little 
> if any actual processing takes place. 
> It seems two separate issues are at hand:
>   1. BufferTooSmallExceptions are raised even though, small as the actually 
> allocated buffers seem to be (around a couple of megabytes were allotted 
> whereas 100MiB were requested), the actual keys and values are never bigger 
> than 24 and 1024 bytes respectively.
>   2. Once BufferTooSmallExceptions are raised, the DAG fails to stop 
> (stop requests appear to be sent 7 hours after the BTSEs are 
> raised, but 9 hours after these stop requests, the DAG was still lingering 
> with all containers present, tying up memory and CPU allocations).
> The emergence of the BTSEs prevents the Cascade from completing, which makes 
> it impossible to validate the results against traditional MR1-based results. 
> Because the DAG never terminates, the cluster queue becomes unavailable.
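
A back-of-the-envelope check of point 1 is sketched below. It assumes 2 MiB for "a couple of megabytes" and ignores any per-record metadata the writer adds, since the report does not quantify either; the class and method names are hypothetical.

```java
// Back-of-the-envelope check; the 2 MiB figure is an assumption for "a couple of megabytes".
class BufferFitEstimate {
    // Largest serialized record per the report: 24-byte key plus 1024-byte value.
    static long maxRecordBytes(long maxKeyBytes, long maxValueBytes) {
        return maxKeyBytes + maxValueBytes;
    }

    // How many worst-case records fit in a buffer, ignoring any per-record overhead.
    static long recordsThatFit(long bufferBytes, long recordBytes) {
        return bufferBytes / recordBytes;
    }

    public static void main(String[] args) {
        long record = maxRecordBytes(24, 1024);
        long allotted = 2L * 1024 * 1024;    // assumed: roughly what was allotted
        long requested = 100L * 1024 * 1024; // 100 MiB, as requested
        System.out.println("worst-case record bytes: " + record);
        System.out.println("records in allotted buffer: " + recordsThatFit(allotted, record));
        System.out.println("records in requested buffer: " + recordsThatFit(requested, record));
    }
}
```

Under these assumptions even the smaller buffer holds roughly two thousand worst-case records, which is why a "buffer too small" error for a single record looks suspicious.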





[jira] [Updated] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-03 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2237:

Attachment: noopexample_2237.txt



[jira] [Updated] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-03 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2237:

Attachment: TEZ-2237.test.2_branch0.6.txt



Failed: TEZ-2234 PreCommit Build #389

2015-04-03 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2234
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/389/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2261 lines...]

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12709340/TEZ-2234.1.patch
  against master revision 5e2a55f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.
See 
https://builds.apache.org/job/PreCommit-TEZ-Build/389//artifact/patchprocess/diffJavadocWarnings.txt
 for details.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   
org.apache.tez.runtime.library.output.TestOnFileUnorderedKVOutput
  org.apache.tez.runtime.library.output.TestOnFileSortedOutput

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/389//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/389//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/389//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
7376f53f1ceaa359de174a43c4b3a3315ca582f0 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #387
Archived 44 artifacts
Archive block size is 32768
Received 0 blocks and 2722206 bytes
Compression is 0.0%
Took 0.83 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
26 tests failed.
REGRESSION:  
org.apache.tez.runtime.library.output.TestOnFileSortedOutput.baseTest[test[false,
 1, -1]]

Error Message:
null

Stack Trace:
java.lang.NullPointerException: null
at 
org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.close(OrderedPartitionedKVOutput.java:169)
at 
org.apache.tez.runtime.library.output.TestOnFileSortedOutput.baseTest(TestOnFileSortedOutput.java:268)


REGRESSION:  
org.apache.tez.runtime.library.output.TestOnFileSortedOutput.testAllEmptyPartition[test[false,
 1, -1]]

Error Message:
null

Stack Trace:
java.lang.NullPointerException: null
at 
org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.close(OrderedPartitionedKVOutput.java:169)
at 
org.apache.tez.runtime.library.output.TestOnFileSortedOutput.testAllEmptyPartition(TestOnFileSortedOutput.java:314)


REGRESSION:  
org.apache.tez.runtime.library.output.TestOnFileSortedOutput.testWithSomeEmptyPartition[test[false,
 1, -1]]

Error Message:
null

Stack Trace:
java.lang.NullPointerException: null
at 
org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.close(OrderedPartitionedKVOutput.java:169)
at 
org.apache.tez.runtime.library.output.TestOnFileSortedOutput.testWithSomeEmptyPartition(TestOnFileSortedOutput.java:297)


REGRESSION:  
org.apache.tez.runtime.library.output.TestOnFileSortedOutput.baseTest[test[false,
 1, 0]]

Error Message:
null

Stack Trace:
java.lang.NullPointerException: null
at 
org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.close(OrderedPartitionedKVOutput.java:169)
at 
org.apache.tez.runtime.library.output.TestOnFileSortedOutput.baseTest(TestOnFileSortedOutput.java:268)


REGRESSION:  
org.apache.tez.runtime.library.output.TestOnFileSortedOutput.testAllEmptyPartition[test[false,
 1, 0]]

Error Message:
null

Stack

[jira] [Commented] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395271#comment-14395271
 ] 

Hadoop QA commented on TEZ-2234:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12709340/TEZ-2234.1.patch
  against master revision 5e2a55f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.
See 
https://builds.apache.org/job/PreCommit-TEZ-Build/389//artifact/patchprocess/diffJavadocWarnings.txt
 for details.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   
org.apache.tez.runtime.library.output.TestOnFileUnorderedKVOutput
  org.apache.tez.runtime.library.output.TestOnFileSortedOutput

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/389//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/389//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/389//console

This message is automatically generated.
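
For readers who want to triage such reports mechanically, the vote lines above follow a regular "{color:...}+1/-1 name{color}" shape. The sketch below extracts the checks voted -1; the class name and the regular expression are assumptions based only on the lines shown in this message, not on any documented report format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that pulls the failed checks out of a QA report like the one above.
class QaReportScan {
    // Matches e.g. "{color:red}-1 javadoc{color}" and captures the vote and the check name.
    private static final Pattern VOTE =
        Pattern.compile("\\{color:(?:red|green)\\}([+-]1) ([^{]+)\\{color\\}");

    // Returns the names of all checks voted -1, in report order.
    static List<String> failedChecks(String report) {
        List<String> failed = new ArrayList<>();
        Matcher m = VOTE.matcher(report);
        while (m.find()) {
            if ("-1".equals(m.group(1))) {
                failed.add(m.group(2));
            }
        }
        return failed;
    }
}
```

Applied to the report in this message, the -1 entries are overall, javadoc, findbugs and core tests.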

> Allow vertex managers to get output size per source vertex
> --
>
> Key: TEZ-2234
> URL: https://issues.apache.org/jira/browse/TEZ-2234
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2234.1.patch
>
>
> Vertex managers may need per source vertex output stats to make 
> reconfiguration decisions.
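
As an illustration of what "per source vertex output stats" could look like, here is a minimal sketch that sums reported output bytes by source vertex name. It is hypothetical throughout: the class and method names are invented for this example and are not the Tez VertexManagerPlugin API or the contents of the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical aggregation of reported output bytes per source vertex; not the Tez API.
class SourceOutputStats {
    private final Map<String, Long> bytesBySource = new HashMap<>();

    // Records the output size reported by one finished task of the given source vertex.
    void onTaskOutput(String sourceVertex, long outputBytes) {
        bytesBySource.merge(sourceVertex, outputBytes, Long::sum);
    }

    // Total output bytes seen so far for a source vertex (0 if none reported).
    long outputBytes(String sourceVertex) {
        return bytesBySource.getOrDefault(sourceVertex, 0L);
    }
}
```

A vertex manager could consult such totals when deciding how to reconfigure a downstream vertex, for example treating a source with little output differently from one with a lot.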





[jira] [Updated] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-03 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2234:

Attachment: TEZ-2234.1.patch



[jira] [Commented] (TEZ-2236) TEZ UI: Support loading of all rows in dag -> tasks table.

2015-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395161#comment-14395161
 ] 

Hadoop QA commented on TEZ-2236:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12709315/TEZ-2236.2.patch
  against master revision 5e2a55f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/388//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/388//console

This message is automatically generated.

> TEZ UI: Support loading of all rows in dag -> tasks table.
> --
>
> Key: TEZ-2236
> URL: https://issues.apache.org/jira/browse/TEZ-2236
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2236.1.patch, TEZ-2236.2.patch
>
>
> 1. ember-table component was replaced with basic-ember-table component. It's 
> lightweight, easy to customize, uses pure CSS for layout, and supports 
> cell-level lazy loading and pagination of the completely loaded data.
> 2. Load all rows in two phases: first load some rows for preview, then load 
> all related records to be displayed.
> 3. Support caching of data across tabs.





Failed: TEZ-2236 PreCommit Build #388

2015-04-03 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2236
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/388/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2782 lines...]




==
==
Adding comment to Jira.
==
==


Comment added.
384d5721422cd8187bdc014447e895726ece5971 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #387
Archived 44 artifacts
Archive block size is 32768
Received 2 blocks and 2694675 bytes
Compression is 2.4%
Took 1.8 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Created] (TEZ-2278) Tez UI start/end time and duration shown are wrong for tasks

2015-04-03 Thread Rohini Palaniswamy (JIRA)
Rohini Palaniswamy created TEZ-2278:
---

 Summary: Tez UI start/end time and duration shown are wrong for 
tasks
 Key: TEZ-2278
 URL: https://issues.apache.org/jira/browse/TEZ-2278
 Project: Apache Tez
  Issue Type: Bug
  Components: UI
Affects Versions: 0.6.0
Reporter: Rohini Palaniswamy


 Observing a lot of time discrepancies between the vertex, task and swimlane views. 





[jira] [Commented] (TEZ-2236) TEZ UI: Support loading of all rows in dag -> tasks table.

2015-04-03 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395104#comment-14395104
 ] 

Sreenath Somarajapuram commented on TEZ-2236:
-

Thanks [~pramachandran], patch 2 includes the changes. Please review.

data-array-loader-mixin
- All find code should have its own error handler, but as a fallback I have 
added a catch at the end of the prototype chain.
- limit: Thanks for pointing this out; all requests will now go with a limit.
- array.length: As we depend only on the number of elements, wouldn't 
array.length be better than array.[]?

general
- columnselector: Was planned for a later patch, but has been pulled into patch 2.
- Column resize: Was planned for a later patch, but has been pulled into patch 2.
- Caching was added.
- Default row count changed to 25.

target version: Aiming for Dal.



[jira] [Updated] (TEZ-2236) TEZ UI: Support loading of all rows in dag -> tasks table.

2015-04-03 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2236:

Attachment: TEZ-2236.2.patch



[jira] [Updated] (TEZ-2232) Allow setParallelism to be called multiple times before tasks get scheduled

2015-04-03 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2232:

Attachment: TEZ-2232.2.patch

Attaching rebased patch. Thanks for the review. Committing.

> Allow setParallelism to be called multiple times before tasks get scheduled
> ---
>
> Key: TEZ-2232
> URL: https://issues.apache.org/jira/browse/TEZ-2232
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2232.1.patch, TEZ-2232.2.patch
>
>
> Currently, this is allowed only once. It is harder to support this 
> after the vertex tasks have already started running. But allowing it before 
> tasks start running is actually trivial. This just allows VertexManagers to 
> change their minds multiple times before they start the vertex processing.
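
The rule described above (reconfigure freely until tasks are scheduled, refuse
afterwards) can be sketched as a small state guard. This is only an
illustration under assumptions: ParallelismGuard, scheduleTasks() and the
exception choices are hypothetical names made up for this sketch, not the Tez
API and not the contents of the attached patch.

```java
// Hypothetical sketch, not Tez code: parallelism may be changed any number
// of times, but only while no task has been scheduled yet.
public class ParallelismGuard {
    private int parallelism;
    private boolean tasksScheduled = false;

    public ParallelismGuard(int initialParallelism) {
        this.parallelism = initialParallelism;
    }

    // May be called repeatedly before scheduleTasks(); rejected afterwards.
    public synchronized void setParallelism(int newParallelism) {
        if (newParallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        if (tasksScheduled) {
            throw new IllegalStateException("tasks already scheduled");
        }
        parallelism = newParallelism;
    }

    // Marks the point after which reconfiguration is no longer accepted.
    public synchronized void scheduleTasks() {
        tasksScheduled = true;
    }

    public synchronized int getParallelism() {
        return parallelism;
    }

    public static void main(String[] args) {
        ParallelismGuard vertex = new ParallelismGuard(10);
        vertex.setParallelism(8);   // first change of mind
        vertex.setParallelism(4);   // second change, still before scheduling
        vertex.scheduleTasks();
        System.out.println(vertex.getParallelism());
    }
}
```

Keeping the check and the update under one lock avoids a window in which
scheduling starts between the check and the assignment.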





[jira] [Commented] (TEZ-2159) Tez UI: download timeline data for offline use.

2015-04-03 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394896#comment-14394896
 ] 

Sreenath Somarajapuram commented on TEZ-2159:
-

The build is failing in the grunt less step. Font Awesome is not accessible 
from inside shared.less.

> Tez UI: download timeline data for offline use.
> ---
>
> Key: TEZ-2159
> URL: https://issues.apache.org/jira/browse/TEZ-2159
> Project: Apache Tez
>  Issue Type: Improvement
>  Components: UI
>Reporter: Prakash Ramachandran
>Assignee: Prakash Ramachandran
> Attachments: TEZ-2159.1.patch, TEZ-2159.wip.1.patch
>
>
> It is useful to be able to download the timeline data for a DAG for 
> offline analysis. For example, TEZ-2076 uses the timeline data to do offline 
> analysis of a Tez application run. 





[jira] [Created] (TEZ-2277) modifyACLsStr in DAGAccessControls does not take effect

2015-04-03 Thread Thejas M Nair (JIRA)
Thejas M Nair created TEZ-2277:
--

 Summary: modifyACLsStr in DAGAccessControls does not take effect
 Key: TEZ-2277
 URL: https://issues.apache.org/jira/browse/TEZ-2277
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Thejas M Nair
Priority: Critical


Even if modifyACLsStr is set in the DAGAccessControls constructor and that 
access control is set for the DAG, it does not actually get set in access 
control at runtime.

See comment in 
[HIVE-10145|https://issues.apache.org/jira/browse/HIVE-10145?focusedCommentId=14393933&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14393933]





[jira] [Commented] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394833#comment-14394833
 ] 

Cyrille Chépélov commented on TEZ-2237:
---

Yes, it was, as is the run in progress (with a little more memory beyond the 
heap). Am away from the cluster at the moment but will post updated logs ASAP.



> Complex DAG freezes and fails (was BufferTooSmallException raised in 
> UnorderedPartitionedKVWriter then DAG lingers)
> ---
>
> Key: TEZ-2237
> URL: https://issues.apache.org/jira/browse/TEZ-2237
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.6.0
> Environment: Debian Linux "jessie"
> OpenJDK Runtime Environment (build 1.8.0_40-internal-b27)
> OpenJDK 64-Bit Server VM (build 25.40-b25, mixed mode)
> 7 * Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 16/24 GB RAM per node, 1*system 
> disk + 4*1 or 2 TiB HDD for HDFS & local  (on-prem, dedicated hardware)
> Scalding 0.13.1 modified with https://github.com/twitter/scalding/pull/1220 
> to run Cascading 3.0.0-wip-90 with TEZ 0.6.0
>Reporter: Cyrille Chépélov
> Attachments: TEZ-2237-hack.branch6.txt, TEZ-2237-hack.master.txt, 
> all_stacks.lst, alloc_mem.png, alloc_vcores.png, 
> application_142732418_1444.yarn-logs.red.txt.gz, 
> application_142732418_1908.red.txt.bz2, 
> appmastersyslog_dag_1427282048097_0215_1.red.txt.gz, 
> appmastersyslog_dag_1427282048097_0237_1.red.txt.gz, 
> gc_count_MRAppMaster.png, mem_free.png, ordered-grouped-kv-input-traces.diff, 
> start_containers.png, stop_containers.png, 
> syslog_attempt_1427282048097_0215_1_21_14_0.red.txt.gz, 
> syslog_attempt_1427282048097_0237_1_70_28_0.red.txt.gz, yarn_rm_flips.png
>
>
> On a specific DAG with many vertices (actually part of a larger meta-DAG), 
> after about an hour of processing, several BufferTooSmallExceptions are raised 
> in UnorderedPartitionedKVWriter (about one every two or three spills).
> Once these exceptions are raised, the DAG remains indefinitely "active", 
> tying up memory and CPU resources as far as YARN is concerned, while little 
> if any actual processing takes place. 
> It seems two separate issues are at hand:
>   1. BufferTooSmallExceptions are raised even though, small as the actually 
> allocated buffers seem to be (around a couple of megabytes were allotted 
> whereas 100MiB were requested), the actual keys and values are never bigger 
> than 24 and 1024 bytes respectively.
>   2. Once BufferTooSmallExceptions are raised, the DAG fails to stop 
> (stop requests appear to be sent 7 hours after the BTSEs are 
> raised, but 9 hours after these stop requests, the DAG was still lingering 
> with all containers present, tying up memory and CPU allocations).
> The emergence of the BTSEs prevents the Cascade from completing, which in 
> turn prevents validating the results against traditional MR1-based results. 
> Because the DAG never concludes, the cluster queue remains unavailable.





[jira] [Commented] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-03 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394735#comment-14394735
 ] 

Siddharth Seth commented on TEZ-2237:
-

[~cchepelov] - quick note.
Was the run with one of the attached patches?
Do you see the "Attempting to close output ... " message in the logs? Could 
you attach the logs again, please?
Will look more a little later in the day.



[jira] [Commented] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394697#comment-14394697
 ] 

Cyrille Chépélov commented on TEZ-2237:
---

No success. At one point, activity subsided while the DAGMaster was busy 
creating containers which kept dying somehow. Last breath from one of the 
containers was in the middle of spills from UnorderedPartitionedKVWriter… No 
explicit message from the NodeManager, except that the container got preempted.

Retrying with
   "tez.task.resource.memory.mb" -> "1170", // default 1024
   "tez.container.max.java.heap.fraction" -> "0.7", // default 0.8
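
As a rough check of what these two settings change, here is a small
arithmetic sketch. It assumes the JVM heap is sized as container memory
multiplied by the heap fraction, rounded down; that formula, the class name
and the method names are assumptions made for illustration, and percentages
are used instead of fractions to keep the arithmetic in integers.

```java
// Illustrative arithmetic only; assumes heap = container memory * fraction.
public class HeapHeadroom {
    // Heap size in MB for a container, with the fraction given in percent.
    static int heapMb(int containerMb, int heapPercent) {
        return containerMb * heapPercent / 100;
    }

    // Memory left in the container outside the Java heap, in MB.
    static int headroomMb(int containerMb, int heapPercent) {
        return containerMb - heapMb(containerMb, heapPercent);
    }

    public static void main(String[] args) {
        // Defaults quoted above: 1024 MB container, 0.8 heap fraction.
        System.out.println(heapMb(1024, 80) + " MB heap, "
                + headroomMb(1024, 80) + " MB outside the heap");
        // Retry settings quoted above: 1170 MB container, 0.7 heap fraction.
        System.out.println(heapMb(1170, 70) + " MB heap, "
                + headroomMb(1170, 70) + " MB outside the heap");
    }
}
```

Under that assumption the heap stays at about 819 MB in both cases, while the
memory available outside the heap grows from about 205 MB to about 351 MB.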





[jira] [Commented] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-03 Thread Chris K Wensel (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394600#comment-14394600
 ] 

Chris K Wensel commented on TEZ-2237:
-

Actually, looking at the code, outputs are started before inputs; then 
the wait for inputs starts.

https://github.com/cwensel/cascading/blob/wip-3.0/cascading-hadoop2-tez/src/main/java/cascading/flow/tez/FlowProcessor.java#L131-131

The pipeline graph is walked in reverse topological order and all stages are 
initialized: sinks first, then sources (and any intermediate resource that 
must be initialized before execution).

So, to the point above, all outputs are started before any work begins. In 
particular, they are started even before the inputs, and before the inputs' 
ready status.
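
The initialization order described above can be sketched with a generic
reverse topological walk. This is not the Cascading code linked in the
comment: the class name, the string node names and the adjacency-map
representation are assumptions chosen for illustration, and the ordering is
computed with Kahn's algorithm followed by a reversal.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: order stages so that sinks come first, sources last.
public class ReverseTopoInit {
    // edges maps each stage to its downstream stages.
    static List<String> reverseTopologicalOrder(Map<String, List<String>> edges) {
        // Count incoming edges for every stage.
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String node : edges.keySet()) {
            inDegree.put(node, 0);
        }
        for (List<String> targets : edges.values()) {
            for (String target : targets) {
                inDegree.merge(target, 1, Integer::sum);
            }
        }
        // Kahn's algorithm: start from stages with no upstream (the sources).
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            order.add(node);
            for (String target : edges.getOrDefault(node, Collections.emptyList())) {
                if (inDegree.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }
        if (order.size() != inDegree.size()) {
            throw new IllegalStateException("graph contains a cycle");
        }
        // Reverse so that sinks are initialized before sources.
        Collections.reverse(order);
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = new LinkedHashMap<>();
        edges.put("source", Arrays.asList("map"));
        edges.put("map", Arrays.asList("sink"));
        edges.put("sink", Collections.emptyList());
        for (String stage : reverseTopologicalOrder(edges)) {
            System.out.println("initialize " + stage);
        }
    }
}
```

For the three-stage chain in main, the stages are initialized in the order
sink, map, source, which matches the "sinks first, then sources" description.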



[jira] [Commented] (TEZ-2237) Complex DAG freezes and fails (was BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers)

2015-04-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394472#comment-14394472
 ] 

Cyrille Chépélov commented on TEZ-2237:
---

Thanks [~sseth]. Trying that at the moment (on branch-0.6 as of 
366583aade76901f93e15e33111ad7586326ce1e + my TEZ-2256 patch). The first DAGs 
of the cascade just completed successfully, awaiting the hard part.




[jira] [Created] (TEZ-2276) TEZ UI: Move sorting to web worker for making the UI responsive while searching.

2015-04-03 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2276:
---

 Summary: TEZ UI: Move sorting to web worker for making the UI 
responsive while searching.
 Key: TEZ-2276
 URL: https://issues.apache.org/jira/browse/TEZ-2276
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram
Priority: Minor








[jira] [Created] (TEZ-2275) TEZ UI: Remove counter serialization for all entities to make loading faster.

2015-04-03 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2275:
---

 Summary: TEZ UI: Remove counter serialization for all entities to 
make loading faster.
 Key: TEZ-2275
 URL: https://issues.apache.org/jira/browse/TEZ-2275
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram








[jira] [Created] (TEZ-2274) TEZ UI: Make TEZ-2236 & TEZ-2273 available for all pages, except 'All Dags'.

2015-04-03 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2274:
---

 Summary: TEZ UI: Make TEZ-2236 & TEZ-2273 available for all pages, 
except 'All Dags'.
 Key: TEZ-2274
 URL: https://issues.apache.org/jira/browse/TEZ-2274
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram


1. Make all tables use ember-table component
2. Support loading of all rows with caching
3. Support searching & sorting





[jira] [Updated] (TEZ-2273) TEZ UI: Support client side, searching & sorting.

2015-04-03 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2273:

Summary: TEZ UI: Support client side, searching & sorting.  (was: Support 
client side, searching & sorting.)

> TEZ UI: Support client side, searching & sorting.
> -
>
> Key: TEZ-2273
> URL: https://issues.apache.org/jira/browse/TEZ-2273
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>






[jira] [Updated] (TEZ-2236) TEZ UI: Support loading of all rows in dag -> tasks table.

2015-04-03 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2236:

Summary: TEZ UI: Support loading of all rows in dag -> tasks table.  (was: 
Support loading of all rows in dag -> tasks table.)

> TEZ UI: Support loading of all rows in dag -> tasks table.
> --
>
> Key: TEZ-2236
> URL: https://issues.apache.org/jira/browse/TEZ-2236
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2236.1.patch
>
>
> 1. ember-table component was replaced with basic-ember-table component. It's 
> lightweight, easy to customize, uses pure CSS for layout, and supports 
> cell-level lazy loading and pagination of the completely loaded data.
> 2. Load all rows in two phases: first load some rows for preview, then load 
> all related records to be displayed.
> 3. Support caching of data across tabs.





[jira] [Created] (TEZ-2273) Support client side, searching & sorting.

2015-04-03 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2273:
---

 Summary: Support client side, searching & sorting.
 Key: TEZ-2273
 URL: https://issues.apache.org/jira/browse/TEZ-2273
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram








[jira] [Updated] (TEZ-2236) Support loading of all rows in dag -> tasks table.

2015-04-03 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2236:

Description: 
1. The ember-table component was replaced with the basic-ember-table 
component. It's lightweight, easy to customize, uses pure CSS for layout, and 
supports cell-level lazy loading and pagination of the fully loaded data.
2. Load all rows in two phases: first load some rows for preview, then load 
all related records to be displayed.
3. Support caching of data across tabs.

  was:
1. The ember-table component was replaced with the basic-ember-table 
component. It's lightweight, easy to customize, uses pure CSS for layout, and 
supports cell-level lazy loading and pagination of the fully loaded data.
2. Load all rows in two phases: first load some rows for preview, then load 
all related records to be displayed.


> Support loading of all rows in dag -> tasks table.
> --
>
> Key: TEZ-2236
> URL: https://issues.apache.org/jira/browse/TEZ-2236
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2236.1.patch
>
>
> 1. The ember-table component was replaced with the basic-ember-table 
> component. It's lightweight, easy to customize, uses pure CSS for layout, and 
> supports cell-level lazy loading and pagination of the fully loaded data.
> 2. Load all rows in two phases: first load some rows for preview, then load 
> all related records to be displayed.
> 3. Support caching of data across tabs.
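
The two-phase load and the caching described in items 2 and 3 can be 
illustrated with a small sketch. This is not the code from TEZ-2236.1.patch 
(the Tez UI is an Ember application); it is a language-neutral illustration 
written in Java. The class name, the load() method, the cacheKey parameter, 
and the backendFetches counter are all hypothetical, and the in-memory list 
stands in for whatever the UI actually queries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class TwoPhaseLoadSketch {
    // Hypothetical cache: full row sets keyed by an identifier such as a DAG id,
    // so that revisiting a tab does not trigger another full load.
    private final Map<String, List<String>> cache = new HashMap<>();

    // Counts how often the simulated backend was actually read (illustration only).
    int backendFetches = 0;

    /**
     * Loads rows in two phases and hands each intermediate result to render.
     * Phase 1 delivers a short preview; phase 2 delivers every row.
     * A cached result is rendered once and returned without touching the backend.
     */
    public List<String> load(String cacheKey, List<String> backend,
                             int previewSize, Consumer<List<String>> render) {
        List<String> cached = cache.get(cacheKey);
        if (cached != null) {
            render.accept(cached);
            return cached;
        }

        backendFetches++;
        int n = Math.min(previewSize, backend.size());

        // Phase 1: only the first n rows, so something is visible quickly.
        List<String> rows = new ArrayList<>(backend.subList(0, n));
        render.accept(new ArrayList<>(rows));

        // Phase 2: the remaining rows, then render the complete set.
        rows.addAll(backend.subList(n, backend.size()));
        render.accept(new ArrayList<>(rows));

        cache.put(cacheKey, rows);
        return rows;
    }

    public static void main(String[] args) {
        TwoPhaseLoadSketch loader = new TwoPhaseLoadSketch();
        List<String> backend = List.of("task_1", "task_2", "task_3", "task_4");
        loader.load("dag_1", backend, 2, r -> System.out.println("render " + r.size() + " rows"));
        loader.load("dag_1", backend, 2, r -> System.out.println("render " + r.size() + " rows"));
    }
}
```

In this sketch the first call renders twice (preview, then all rows), while 
the second call for the same key renders once from the cache.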





[jira] [Updated] (TEZ-2269) DAGAppMaster becomes unresponsive

2015-04-03 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2269:
--
Attachment: TEZ-2269.test.patch

The stack trace doesn't reveal much about the deadlock, and I haven't been 
able to determine which thread is holding the lock.

I ran the test patch attached here several times. It uses "tryLock" with a 
timeout in DAGImpl.getDAGStatus(). With the patch, the hang no longer 
reproduces. [~sseth] - Thoughts?
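
For readers unfamiliar with the pattern, here is a minimal sketch of a status 
read guarded by "tryLock" with a timeout. It is not the contents of 
TEZ-2269.test.patch. The class, its fields, and the choice of a 
ReentrantReadWriteLock are assumptions made for illustration; only the 
java.util.concurrent calls are real APIs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class TryLockStatusSketch {
    // Hypothetical stand-in for the lock that guards DAG state.
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private String state = "RUNNING";

    /**
     * Returns the state, or null if the read lock could not be acquired
     * within timeoutMs. The caller gives up instead of blocking forever.
     */
    public String getStatus(long timeoutMs) {
        boolean acquired = false;
        try {
            acquired = lock.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
            return acquired ? state : null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return null;
        } finally {
            if (acquired) {
                lock.readLock().unlock();
            }
        }
    }

    /** Demo helper: reads the status while another thread holds the write lock. */
    public static String statusWhileWriterHoldsLock(long timeoutMs) {
        TryLockStatusSketch dag = new TryLockStatusSketch();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);
        Thread writer = new Thread(() -> {
            dag.lock.writeLock().lock();
            try {
                held.countDown();
                done.await(); // keep the write lock until the reader has given up
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                dag.lock.writeLock().unlock();
            }
        });
        writer.start();
        try {
            held.await();
            return dag.getStatus(timeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            done.countDown();
        }
    }

    public static void main(String[] args) {
        System.out.println(new TryLockStatusSketch().getStatus(100)); // uncontended read
        System.out.println(statusWhileWriterHoldsLock(100));          // writer holds the lock
    }
}
```

The trade-off is that a timed-out call returns no status, so the caller has 
to treat "no answer" as a valid outcome. The timeout avoids the hang but does 
not explain which thread was holding the lock.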

> DAGAppMaster becomes unresponsive
> -
>
> Key: TEZ-2269
> URL: https://issues.apache.org/jira/browse/TEZ-2269
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Rajesh Balamohan
> Attachments: TEZ-2269.test.patch, 
> app_master_application_1428021179455_0001_jstack.txt, client_jstack.txt
>
>
> Scenario:
> - Run TPCH query20 @ 1 TB scale
> - Tez master branch, Hive trunk
> - auto-reduce parallelism is not an issue (happens with/without auto-reduce 
> parallelism)
> In 1 or 2 of every 10 runs, DAGAppMaster freezes unexpectedly. No pattern 
> has been observed in which vertex it happens on. When it happens, the only 
> option is to kill the application. I will attach the jstack soon, but it 
> doesn't seem to reveal much.
> Needs more debugging; creating this JIRA for tracking purposes.





[jira] [Updated] (TEZ-2272) Should display committer for Output (Sink)

2015-04-03 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2272:

Attachment: initializer_committer.png

> Should display committer for Output (Sink)
> --
>
> Key: TEZ-2272
> URL: https://issues.apache.org/jira/browse/TEZ-2272
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Reporter: Jeff Zhang
> Attachments: initializer_committer.png
>
>
> On the Source & Sink page, the initializer for Input is displayed, but no 
> committer is shown for Output. The page should also display the committer.





[jira] [Updated] (TEZ-2272) Should display committer for Output (Sink)

2015-04-03 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2272:

Summary: Should display committer for Output (Sink)  (was: Should add 
committer for Output (Sink))

> Should display committer for Output (Sink)
> --
>
> Key: TEZ-2272
> URL: https://issues.apache.org/jira/browse/TEZ-2272
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Reporter: Jeff Zhang
>
> On the Source & Sink page, the initializer for Input is displayed, but no 
> committer is shown for Output. The page should also display the committer.





[jira] [Created] (TEZ-2272) Should add committer for Output (Sink)

2015-04-03 Thread Jeff Zhang (JIRA)
Jeff Zhang created TEZ-2272:
---

 Summary: Should add committer for Output (Sink)
 Key: TEZ-2272
 URL: https://issues.apache.org/jira/browse/TEZ-2272
 Project: Apache Tez
  Issue Type: Bug
  Components: UI
Reporter: Jeff Zhang


On the Source & Sink page, the initializer for Input is displayed, but no 
committer is shown for Output. The page should also display the committer.


