[jira] [Comment Edited] (TEZ-3154) Debuggability : Add an option to take threaddump from a specific vertex/task

2016-03-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175078#comment-15175078
 ] 

Hitesh Shah edited comment on TEZ-3154 at 3/2/16 5:47 AM:
--

bq. Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

This seems like a debugging-only feature that has to be set up in advance and
would be a lot of noise if enabled by default. The current description seems to
imply that the DAG would need to be configured before it is executed, unless
there is a way to do this while a DAG is running.

What would be more useful is to trigger a thread dump when certain conditions
are hit:
   - if there is a task timeout - this would trigger a thread dump before
killing the container
   - if the user defines a policy with certain thresholds ( GC counters, task
runtime exceeding other tasks by a large factor, etc. ) - trigger it when a
threshold is hit
   - a command-line tool to trigger thread dumps on demand if the user wants to
see what is going on within a task

Also, please take a look at the support YARN is introducing or has introduced
for sending a signal to a container - this would be useful to trigger a jstack
dump if needed.
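
To make the timeout case above concrete, here is a minimal sketch that uses only the JDK. It is not Tez code: the class name, the timedOut helper and the timeout values are hypothetical, and the place where a real task runner would call this is an assumption. It only shows that a JVM can produce a jstack-like dump of itself just before a container is killed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpOnTimeout {

    /** Formats the stack of every live thread in this JVM, roughly like jstack. */
    public static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false/false: skip monitor and synchronizer details to keep the call cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Hypothetical check: true once the task has run longer than its timeout. */
    public static boolean timedOut(long startMillis, long nowMillis, long timeoutMillis) {
        return nowMillis - startMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        // Pretend the task started 5 seconds ago and the timeout is 1 second.
        long start = System.currentTimeMillis() - 5_000;
        if (timedOut(start, System.currentTimeMillis(), 1_000)) {
            // Dump first; killing the container would happen after this point.
            System.err.print(dumpAllThreads());
        }
    }
}
```

The same dumpAllThreads() call could sit behind a threshold policy or an on-demand trigger; only the condition that guards it would change.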




was (Author: hitesh):
bq. Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

This seems like a debugging only feature which needs to be pre-setup and would 
be a lot of noise if enabled by default. The current description seems to imply 
that the dag would need to be configured before it is executed unless there is 
a way to do this while a dag is running.

What would be more useful is to trigger a thread dump when certain conditions 
are hit: 
   - if there is a task timeout - this would trigger a thread dump before 
killing the container
   - if the user defines a policy that if certain thresholds are hit ( GC 
counters, task runtime exceeds other tasks by a large factor, etc ) then 
trigger it.
   - command-line tool to trigger thread dumps on an demand basis if the user 
wants to see what is going on within a task 

Also, please take a look at the support yarn is introducing or has introduced 
for sending a signal to a container - this would be useful to trigger a jstack 
dump if needed. 



> Debuggability : Add an option to take threaddump from a specific vertex/task
> 
>
> Key: TEZ-3154
> URL: https://issues.apache.org/jira/browse/TEZ-3154
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
>
> tez.task-specific.launch.cmd-opts and tez.task-specific.launch.cmd-opts.list
> (e.g. "Map 1[10]", 10th task in Map 1) options are available to add certain
> parameters to task-specific command line options. They have been useful for
> launching profilers on specific tasks.
> There are scenarios in which taking thread dumps on a periodic basis on
> specific tasks could be helpful, e.g.:
> - In certain clusters it could be difficult to add profilers.
> - There could be scenarios where the tasks are slow due to apps using Tez (but
> the counters might indicate no issues in Tez), e.g. parsing using
> SimpleDateFormat for every record could be time consuming.
> - In certain clusters, access might not be available to take thread dumps of
> tasks from the NM. YARN's threadstack (in the RM UI) is mainly for the NM and
> doesn't work at the task level.
> Creating this ticket to explore the possibility of adding thread-dump on
> periodic basis for specific tasks.
> High-level example: --hiveconf
> tez.task-specific.launch.cmd-opts="-DthreadDumpInterval=5" --hiveconf
> tez.task-specific.launch.cmd-opts.list="Map 1[10,15]" - This should print
> thread dumps in tasks 10 and 15 of Map 1 every 5 seconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TEZ-3154) Debuggability : Add an option to take threaddump from a specific vertex/task

2016-03-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175078#comment-15175078
 ] 

Hitesh Shah edited comment on TEZ-3154 at 3/2/16 5:47 AM:
--

bq. Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

This seems like a debugging only feature which needs to be pre-setup and would 
be a lot of noise if enabled by default. The current description seems to imply 
that the dag would need to be configured before it is executed unless there is 
a way to do this while a dag is running.

What would be more useful is to trigger a thread dump when certain conditions 
are hit: 
   - if there is a task timeout - this would trigger a thread dump before 
killing the container
   - if the user defines a policy that if certain thresholds are hit ( GC 
counters, task runtime exceeds other tasks by a large factor, etc ) then 
trigger it.
   - command-line tool to trigger thread dumps on an demand basis if the user 
wants to see what is going on within a task 

Also, please take a look at the support yarn is introducing or has introduced 
for sending a signal to a container - this would be useful to trigger a jstack 
dump if needed. 




was (Author: hitesh):
bq. Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

This seems like a debugging only feature which needs to be pre-setup and would 
be a lot of noise if enabled by default. The current description seems to imply 
that the dag would need to be configured before it is executed unless there is 
a way to do this while a dag is running.

What would be more useful is to trigger a thread dump when certain conditions 
are hit: 
   - if there is a task timeout - this would trigger a thread dump before 
killing the container
   - if the user defines a policy that if certain counter thresholds are hit ( 
GC counters, etc ) then trigger it.
   - command-line tool to trigger thread dumps on an demand basis if the user 
wants to see what is going on within a task 

Also, please take a look at the support yarn is introducing or has introduced 
for sending a signal to a container - this would be useful to trigger a jstack 
dump if needed. 





[jira] [Comment Edited] (TEZ-3154) Debuggability : Add an option to take threaddump from a specific vertex/task

2016-03-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175078#comment-15175078
 ] 

Hitesh Shah edited comment on TEZ-3154 at 3/2/16 5:46 AM:
--

bq. Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

This seems like a debugging only feature which needs to be pre-setup and would 
be a lot of noise if enabled by default. The current description seems to imply 
that the dag would need to be configured before it is executed unless there is 
a way to do this while a dag is running.

What would be more useful is to trigger a thread dump when certain conditions 
are hit: 
   - if there is a task timeout - this would trigger a thread dump before 
killing the container
   - if the user defines a policy that if certain counter thresholds are hit ( 
GC counters, etc ) then trigger it.
   - command-line tool to trigger thread dumps on an demand basis if the user 
wants to see what is going on within a task 

Also, please take a look at the support yarn is introducing or has introduced 
for sending a signal to a container - this would be useful to trigger a jstack 
dump if needed. 




was (Author: hitesh):
bq. Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

This seems like a debugging only feature which needs to be pre-setup and would 
be a lot of noise if enabled by default. The current description seems to imply 
that the dag would need to be configured before it is executed unless there is 
a way to do this while a dag is running.

What would be more useful is to trigger a thread dump when certain conditions 
are hit: 
   - if there is a task timeout - this would trigger a thread dump before 
killing the container
   - if the user defines a policy that if certain counter thresholds are hit ( 
GC counters, etc ) 
   - command-line tool to trigger thread dumps on an demand basis if the user 
wants to see what is going on within a task 

Also, please take a look at the support yarn is introducing or has introduced 
for sending a signal to a container - this would be useful to trigger a jstack 
dump if needed. 





[jira] [Comment Edited] (TEZ-3154) Debuggability : Add an option to take threaddump from a specific vertex/task

2016-03-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175078#comment-15175078
 ] 

Hitesh Shah edited comment on TEZ-3154 at 3/2/16 5:46 AM:
--

bq. Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

This seems like a debugging only feature which needs to be pre-setup and would 
be a lot of noise if enabled by default. The current description seems to imply 
that the dag would need to be configured before it is executed unless there is 
a way to do this while a dag is running.

What would be more useful is to trigger a thread dump when certain conditions 
are hit: 
   - if there is a task timeout - this would trigger a thread dump before 
killing the container
   - if the user defines a policy that if certain counter thresholds are hit ( 
GC counters, etc ) 
   - command-line tool to trigger thread dumps on an demand basis if the user 
wants to see what is going on within a task 

Also, please take a look at the support yarn is introducing or has introduced 
for sending a signal to a container - this would be useful to trigger a jstack 
dump if needed. 




was (Author: hitesh):
bq. Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

This seems like a debugging only feature which needs to be pre-setup. What 
would be more useful is to trigger a thread dump when certain conditions are 
hit: 
   - if there is a task timeout - this would trigger a thread dump before 
killing the container
   - if the user defines a policy that if certain counter thresholds are hit ( 
GC counters, etc ) 
   - command-line tool to trigger thread dumps on an demand basis if the user 
wants to see what is going on within a task 

Also, please take a look at the support yarn is introducing or has introduced 
for sending a signal to a container - this would be useful to trigger a jstack 
dump if needed. 





[jira] [Commented] (TEZ-3154) Debuggability : Add an option to take threaddump from a specific vertex/task

2016-03-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175078#comment-15175078
 ] 

Hitesh Shah commented on TEZ-3154:
--

bq. Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

This seems like a debugging only feature which needs to be pre-setup. What 
would be more useful is to trigger a thread dump when certain conditions are 
hit: 
   - if there is a task timeout - this would trigger a thread dump before 
killing the container
   - if the user defines a policy that if certain counter thresholds are hit ( 
GC counters, etc ) 
   - command-line tool to trigger thread dumps on an demand basis if the user 
wants to see what is going on within a task 

Also, please take a look at the support yarn is introducing or has introduced 
for sending a signal to a container - this would be useful to trigger a jstack 
dump if needed. 





[jira] [Updated] (TEZ-3154) Debuggability : Add an option to take threaddump from a specific vertex/task

2016-03-01 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3154:
--
Description: 
tez.task-specific.launch.cmd-opts and tez.task-specific.launch.cmd-opts.list 
(e.g "Map 1[10]", 10th task in map 1) options are available to add certain 
parameters to task specific command line options. It has been useful for 
launching profilers on specific tasks.

There are scenarios in which taking threaddumps on periodic basis on specific 
tasks could be helpful. E.g
- In certain clusters it could be difficult to add profilers. 
- There could be scenarios where the tasks are slow due apps using Tez (but the 
counters might indicate no issues in Tez).  (e.g Parsing using SimpleDateFormat 
for every record could be time consuming)
- In certain clusters, access might not be there to take threaddumps of tasks 
from NM. YARN's threadstack  (in RM UI) is mainly for NM and doesn't work on 
task level.

Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

High level e.g: "--hiveconf tez.task-specific.launch.cmd-opts=" 
-DthreadDumpInterval=5 " --hiveconf tez.task-specific.launch.cmd-opts.list="Map 
1[10,15]" - This should print thread-dumps in tasks 10, 15 in Map-1 every 5 
seconds.

  was:
tez.task-specific.launch.cmd-opts and tez.task-specific.launch.cmd-opts.list 
(e.g "Map 1[10]", 10th task in map 1) options are available to add certain 
parameters to task specific command line options. It has been useful for 
launching profilers on specific tasks.

There are scenarios in which taking threaddumps on periodic basis on specific 
tasks could be helpfule. E.g
- In certain clusters it could be difficult to add profilers. 
- There could be scenarios where the tasks are slow due apps using Tez (but the 
counters might indicate no issues in Tez).  (e.g Parsing using SimpleDateFormat 
for every record could be time consuming)
- In certain clusters, access might not be there to take threaddumps of tasks 
from NM. YARN's threadstack  (in RM UI) is mainly for NM and doesn't work on 
task level.

Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

High level e.g: "--hiveconf tez.task-specific.launch.cmd-opts=" 
-DthreadDumpInterval=5 " --hiveconf tez.task-specific.launch.cmd-opts.list="Map 
1[10,15]" - This should print thread-dumps in tasks 10, 15 in Map-1 every 5 
seconds.




[jira] [Updated] (TEZ-3154) Debuggability : Add an option to take threaddump from a specific vertex/task

2016-03-01 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3154:
--
Description: 
tez.task-specific.launch.cmd-opts and tez.task-specific.launch.cmd-opts.list 
(e.g "Map 1[10]", 10th task in map 1) options are available to add certain 
parameters to task specific command line options. It has been useful for 
launching profilers on specific tasks.

There are scenarios in which taking threaddumps on periodic basis on specific 
tasks could be helpfule. E.g
- In certain clusters it could be difficult to add profilers. 
- There could be scenarios where the tasks are slow due apps using Tez (but the 
counters might indicate no issues in Tez).  (e.g Parsing using SimpleDateFormat 
for every record could be time consuming)
- In certain clusters, access might not be there to take threaddumps of tasks 
from NM. YARN's threadstack  (in RM UI) is mainly for NM and doesn't work on 
task level.

Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.

High level e.g: "--hiveconf tez.task-specific.launch.cmd-opts=" 
-DthreadDumpInterval=5 " --hiveconf tez.task-specific.launch.cmd-opts.list="Map 
1[10,15]" - This should print thread-dumps in tasks 10, 15 in Map-1 every 5 
seconds.

  was:
tez.task-specific.launch.cmd-opts and tez.task-specific.launch.cmd-opts.list 
(e.g "Map 1[10]", 10th task in map 1) options are available to add certain 
parameters to task specific command line options. It has been useful for 
launching profilers on specific tasks.

There are scenarios in which taking threaddumps on periodic basis on specific 
tasks could be helpfule. E.g
- In certain clusters it could be difficult to add profilers. 
- There could be scenarios where the tasks are slow due apps using Tez (but the 
counters might indicate no issues in Tez).  (e.g Parsing using SimpleDateFormat 
for every record could be time consuming)
- In certain clusters, access might not be there to take threaddumps of tasks 
from NM. YARN's threadstack  (in RM UI) is mainly for NM and doesn't work on 
task level.

Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.




[jira] [Created] (TEZ-3154) Debuggability : Add an option to take threaddump from a specific vertex/task

2016-03-01 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created TEZ-3154:
-

 Summary: Debuggability : Add an option to take threaddump from a 
specific vertex/task
 Key: TEZ-3154
 URL: https://issues.apache.org/jira/browse/TEZ-3154
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Rajesh Balamohan


tez.task-specific.launch.cmd-opts and tez.task-specific.launch.cmd-opts.list 
(e.g "Map 1[10]", 10th task in map 1) options are available to add certain 
parameters to task specific command line options. It has been useful for 
launching profilers on specific tasks.

There are scenarios in which taking threaddumps on periodic basis on specific 
tasks could be helpfule. E.g
- In certain clusters it could be difficult to add profilers. 
- There could be scenarios where the tasks are slow due apps using Tez (but the 
counters might indicate no issues in Tez).  (e.g Parsing using SimpleDateFormat 
for every record could be time consuming)
- In certain clusters, access might not be there to take threaddumps of tasks 
from NM. YARN's threadstack  (in RM UI) is mainly for NM and doesn't work on 
task level.

Creating this ticket to explore the possibility of adding thread-dump on 
periodic basis for specific tasks.
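
As a rough illustration of what is being proposed, the sketch below starts a daemon thread that prints a thread dump at a fixed interval. It uses only the JDK and is not an existing Tez feature. The threadDumpInterval system property is taken from the example in the later revision of this description and is only a proposed name; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicThreadDump {

    /** Reads the interval in seconds from a system property; 0 means disabled. */
    public static long intervalSeconds(String property) {
        String value = System.getProperty(property);
        if (value == null) {
            return 0;
        }
        try {
            return Math.max(0, Long.parseLong(value.trim()));
        } catch (NumberFormatException e) {
            return 0; // treat malformed values as "disabled"
        }
    }

    /** Builds a plain-text dump of all live threads in this JVM. */
    public static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            sb.append('"').append(e.getKey().getName()).append("\" ")
              .append(e.getKey().getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /** Starts a daemon scheduler, or returns null when the interval is not positive. */
    public static ScheduledExecutorService start(long seconds) {
        if (seconds <= 0) {
            return null;
        }
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "thread-dumper");
            t.setDaemon(true); // never keep the task JVM alive just for dumps
            return t;
        });
        ses.scheduleAtFixedRate(() -> System.err.println(dump()),
                seconds, seconds, TimeUnit.SECONDS);
        return ses;
    }

    public static void main(String[] args) {
        // With -DthreadDumpInterval=5 this would dump every 5 seconds while the JVM runs.
        start(intervalSeconds("threadDumpInterval"));
    }
}
```

Because the dumper is a daemon thread and does nothing when the property is absent, it would stay silent unless the option is passed to the selected tasks.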





[jira] [Commented] (TEZ-3140) Reduce AM memory usage while serialization

2016-03-01 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174892#comment-15174892
 ] 

Siddharth Seth commented on TEZ-3140:
-

I think it's better to set this up in its own test class - not very costly,
and the test is really only testing how EntityDescriptor handles serialization.
However, go ahead if you don't think a new class is warranted.

> Reduce AM memory usage while serialization
> --
>
> Key: TEZ-3140
> URL: https://issues.apache.org/jira/browse/TEZ-3140
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.7.1, 0.8.3
>
> Attachments: TEZ-3140-1.patch, TEZ-3140-2.patch
>
>
> There is an unnecessary copy of userpayload byte array during
> serialization.
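
For readers unfamiliar with why an extra copy matters, the following JDK-only sketch contrasts copying a payload with sharing its backing array. It illustrates the general idea only and is not the TEZ-3140 patch; the class and method names are hypothetical, and nothing here uses Tez or protobuf APIs.

```java
import java.nio.ByteBuffer;

public class PayloadCopyDemo {

    /** Copies the payload: a second array of the same size is allocated. */
    public static byte[] copying(ByteBuffer payload) {
        byte[] out = new byte[payload.remaining()];
        // duplicate() leaves the position of the caller's buffer unchanged.
        payload.duplicate().get(out);
        return out;
    }

    /** Shares the backing array: no extra allocation; the view is read-only. */
    public static ByteBuffer sharing(byte[] payload) {
        return ByteBuffer.wrap(payload).asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        byte[] payload = {1, 2, 3};
        byte[] copy = copying(ByteBuffer.wrap(payload));
        ByteBuffer view = sharing(payload);
        payload[0] = 9;
        // The copy keeps the old value; the shared view sees the new one.
        System.out.println(copy[0] + " " + view.get(0));
    }
}
```

For a large payload, the copying path temporarily doubles the memory held for it, which is the kind of overhead the issue title refers to.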





[jira] [Commented] (TEZ-3152) Tez UI 2: Build fails when run by multiple users on the same system

2016-03-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174502#comment-15174502
 ] 

Hitesh Shah commented on TEZ-3152:
--

https://github.com/ember-cli/ember-cli/issues/4813 

> Tez UI 2: Build fails when run by multiple users on the same system
> ---
>
> Key: TEZ-3152
> URL: https://issues.apache.org/jira/browse/TEZ-3152
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Sreenath Somarajapuram
> Assignee: Sreenath Somarajapuram
> Attachments: TEZ-3152.wip.1.patch
>
>
> - The async-disk-cache package creates files in tmpDir (/tmp). When the build
> is run by a different user, it fails because of the user permissions on these
> files.





[jira] [Resolved] (TEZ-3153) build uses tmp, that is problematic on a shared machine

2016-03-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved TEZ-3153.
--
Resolution: Duplicate

> build uses tmp, that is problematic on a shared machine
> ---
>
> Key: TEZ-3153
> URL: https://issues.apache.org/jira/browse/TEZ-3153
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Sergey Shelukhin
>
> {noformat}
> INFO] bower@1.7.7 node_modules/bower
> [INFO] 
> [INFO] --- exec-maven-plugin:1.3.2:exec (ember build) @ tez-ui2 ---
> version: 1.13.13
> Could not find watchman, falling back to NodeWatcher for file system events.
> Visit http://www.ember-cli.com/user-guide/#watchman for more info.
> BuildingBuilding.Build failed.
> File: modules/ember-wormhole/components/ember-wormhole.js
> EACCES, mkdir '/tmp/async-disk-cache/3b95d6f55686e4f2ba8e38923a59b8dd'
> Error: EACCES, mkdir '/tmp/async-disk-cache/3b95d6f55686e4f2ba8e38923a59b8dd'
> {noformat}
> Looks like some stuff during the build writes to tmp; some other user has 
> already created /tmp/async-disk-cache so I get an access error.





[jira] [Commented] (TEZ-3153) build uses tmp, that is problematic on a shared machine

2016-03-01 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174412#comment-15174412
 ] 

Siddharth Seth commented on TEZ-3153:
-

cc [~Sreenath]



[jira] [Created] (TEZ-3153) build uses tmp, that is problematic on a shared machine

2016-03-01 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created TEZ-3153:
-

 Summary: build uses tmp, that is problematic on a shared machine
 Key: TEZ-3153
 URL: https://issues.apache.org/jira/browse/TEZ-3153
 Project: Apache Tez
  Issue Type: Bug
Reporter: Sergey Shelukhin


{noformat}
INFO] bower@1.7.7 node_modules/bower
[INFO] 
[INFO] --- exec-maven-plugin:1.3.2:exec (ember build) @ tez-ui2 ---
version: 1.13.13
Could not find watchman, falling back to NodeWatcher for file system events.
Visit http://www.ember-cli.com/user-guide/#watchman for more info.
BuildingBuilding.Build failed.
File: modules/ember-wormhole/components/ember-wormhole.js
EACCES, mkdir '/tmp/async-disk-cache/3b95d6f55686e4f2ba8e38923a59b8dd'
Error: EACCES, mkdir '/tmp/async-disk-cache/3b95d6f55686e4f2ba8e38923a59b8dd'
{noformat}
Looks like some stuff during the build writes to tmp; some other user has 
already created /tmp/async-disk-cache so I get an access error.
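
One common way to avoid this kind of collision on a shared machine is to give each user a separate cache directory under the temp root. The Java sketch below only illustrates that idea; it is not how async-disk-cache or the Tez build resolves the issue, and the class name, the helper name, and the `async-disk-cache-<user>` naming scheme are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PerUserTmpSketch {
    // Hypothetical helper: derive a cache directory that is unique per user,
    // so a directory created by one user never blocks another user's mkdir.
    static Path perUserCacheDir(String tmpRoot, String user) {
        return Paths.get(tmpRoot, "async-disk-cache-" + user);
    }

    public static void main(String[] args) throws IOException {
        Path dir = perUserCacheDir(System.getProperty("java.io.tmpdir"),
                                   System.getProperty("user.name"));
        // createDirectories does not fail if the directory already exists.
        Files.createDirectories(dir);
        System.out.println("cache dir: " + dir + ", writable: " + Files.isWritable(dir));
    }
}
```

Because the directory name includes the user name, two users building on the same machine would each create and own their own directory instead of competing for a single shared one.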





[jira] [Commented] (TEZ-3140) Reduce AM memory usage while serialization

2016-03-01 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174372#comment-15174372
 ] 

Rohini Palaniswamy commented on TEZ-3140:
-

bq. The test doesn't have too much to do with DagTypeConverters.
  Yes. I could not find an appropriate test class to add it to, and it looked 
like overkill to create a new class for that one test, considering that a JVM 
is forked for each test class. I added it to TestDagTypeConverters because it 
already had testTezEntityDescriptorSerialization, which tests the protobuf 
serialization for EntityDescriptor, and I started writing this test by copying 
from that one. So I think it is not that bad to leave it there, since the two 
tests are related to each other, though not exactly to the test class. I can 
move it to a separate class if you still think that would be cleaner.

> Reduce AM memory usage while serialization
> --
>
> Key: TEZ-3140
> URL: https://issues.apache.org/jira/browse/TEZ-3140
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.7.1, 0.8.3
>
> Attachments: TEZ-3140-1.patch, TEZ-3140-2.patch
>
>
>There is an unnecessary copy of userpayload byte array during 
> serialization.
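
As a generic illustration of what an extra copy of a payload byte array costs, the Java sketch below contrasts a defensive copy with a read-only view over the same array. It is not taken from the TEZ-3140 patches; the class and method names are hypothetical, and plain `java.nio.ByteBuffer` stands in for whatever types the real serialization path uses.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class PayloadCopySketch {
    // Defensive copy: allocates a second array the same size as the payload.
    static ByteBuffer copying(byte[] payload) {
        return ByteBuffer.wrap(Arrays.copyOf(payload, payload.length));
    }

    // Zero-copy view: the buffer reads straight from the caller's array.
    static ByteBuffer sharing(byte[] payload) {
        return ByteBuffer.wrap(payload).asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        byte[] payload = {1, 2, 3};
        ByteBuffer copy = copying(payload);
        ByteBuffer view = sharing(payload);
        payload[0] = 9; // mutate the source to reveal which buffer shares it
        System.out.println(copy.get(0)); // the copy keeps the old value
        System.out.println(view.get(0)); // the view reflects the change
    }
}
```

The trade-off is that the shared view saves the allocation but is only safe when the caller does not modify the array afterwards.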





[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead

2016-03-01 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174281#comment-15174281
 ] 

Siddharth Seth commented on TEZ-3115:
-

I think the interning needs to be done via StringInterner.weakIntern() ?

The rest looks good. Minor: could you please add a toString method on the new 
classes - HostPort, PathPartition, HostPortPartition?
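
A rough Java sketch of the two suggestions above, under stated assumptions: it does not call Hadoop's StringInterner, but uses a small WeakHashMap-based stand-in to show what weak interning does, and the `HostPort` class here is a hypothetical shape (a host and a port) rather than the class from the patch.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakInternSketch {
    // Canonical strings are held only weakly, so entries that nothing else
    // references can be garbage collected instead of accumulating forever.
    private static final Map<String, WeakReference<String>> POOL = new WeakHashMap<>();

    // Stand-in for a weak interner: returns one shared instance per distinct value.
    static synchronized String weakIntern(String s) {
        if (s == null) {
            return null;
        }
        WeakReference<String> ref = POOL.get(s);
        String canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            POOL.put(s, new WeakReference<>(s));
            canonical = s;
        }
        return canonical;
    }

    // Hypothetical value class with the kind of toString requested in review.
    static final class HostPort {
        private final String host;
        private final int port;

        HostPort(String host, int port) {
            this.host = weakIntern(host); // equal host names share one String
            this.port = port;
        }

        @Override
        public String toString() {
            return "HostPort[host=" + host + ", port=" + port + "]";
        }
    }

    public static void main(String[] args) {
        String a = new String("node1.example.com");
        String b = new String("node1.example.com");
        System.out.println(weakIntern(a) == weakIntern(b)); // same instance
        System.out.println(new HostPort(b, 13562));
    }
}
```

With many shuffle objects naming the same few hosts, interning means each distinct host name is stored once, and the toString method makes those objects readable in logs.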

> Shuffle string handling adds significant memory overhead
> 
>
> Key: TEZ-3115
> URL: https://issues.apache.org/jira/browse/TEZ-3115
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Jason Lowe
>Assignee: Jonathan Eagles
> Attachments: TEZ-3115.1.patch, TEZ-3115.2.patch, 
> TEZ-3115.3-branch-0.7.patch, TEZ-3115.3.patch
>
>
> While investigating the OOM heap dump from TEZ-3114 I noticed that the 
> ShuffleManager and other shuffle-related objects were holding onto many 
> strings that added up to over a hundred megabytes of memory.





[jira] [Updated] (TEZ-3152) Tez UI 2: Build fails when run by multiple users on the same system

2016-03-01 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-3152:

Summary: Tez UI 2: Build fails when run by multiple users on the same 
system  (was: Tez UI 2: Build fails when tried from multiple users)

> Tez UI 2: Build fails when run by multiple users on the same system
> ---
>
> Key: TEZ-3152
> URL: https://issues.apache.org/jira/browse/TEZ-3152
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-3152.wip.1.patch
>
>
> - async-disk-cache package creates files in tmpDir. When run by a different 
> user, the build fails because of the user permissions on those files.





[jira] [Updated] (TEZ-3152) Tez UI 2: Build fails when run by multiple users on the same system

2016-03-01 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-3152:

Description: - async-disk-cache package creates files in tmpDir (/tmp). 
When run by a different user, the build fails because of the user permissions 
on those files.  (was: - async-disk-cache package creates files in tmpDir. When 
run by a different user, the build fails because of the user permissions on 
those files.)

> Tez UI 2: Build fails when run by multiple users on the same system
> ---
>
> Key: TEZ-3152
> URL: https://issues.apache.org/jira/browse/TEZ-3152
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-3152.wip.1.patch
>
>
> - async-disk-cache package creates files in tmpDir (/tmp). When run by a 
> different user, the build fails because of the user permissions on those files.





[jira] [Updated] (TEZ-3152) Tez UI 2: Build fails when tried from multiple users

2016-03-01 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-3152:

Attachment: TEZ-3152.wip.1.patch

- Attaching a WIP patch.
- Will make it final after more testing.

> Tez UI 2: Build fails when tried from multiple users
> 
>
> Key: TEZ-3152
> URL: https://issues.apache.org/jira/browse/TEZ-3152
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-3152.wip.1.patch
>
>
> - async-disk-cache package creates files in tmpDir. When run by a different 
> user, the build fails because of the user permissions on those files.





[jira] [Created] (TEZ-3152) Tez UI 2: Build fails when tried from multiple users

2016-03-01 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-3152:
---

 Summary: Tez UI 2: Build fails when tried from multiple users
 Key: TEZ-3152
 URL: https://issues.apache.org/jira/browse/TEZ-3152
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram
Assignee: Sreenath Somarajapuram


- async-disk-cache package creates files in tmpDir. When run by a different 
user, the build fails because of the user permissions on those files.





[jira] [Commented] (TEZ-3140) Reduce AM memory usage while serialization

2016-03-01 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174181#comment-15174181
 ] 

Siddharth Seth commented on TEZ-3140:
-

+1. Could you please move the test into a new file, TestEntityDescriptor, 
before commit? The test doesn't have too much to do with 
DagTypeConverters.

> Reduce AM memory usage while serialization
> --
>
> Key: TEZ-3140
> URL: https://issues.apache.org/jira/browse/TEZ-3140
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.7.1, 0.8.3
>
> Attachments: TEZ-3140-1.patch, TEZ-3140-2.patch
>
>
>There is an unnecessary copy of userpayload byte array during 
> serialization.





[jira] [Updated] (TEZ-3148) Invalid event TA_TEZ_EVENT_UPDATE on TaskAttempt

2016-03-01 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3148:
--
Description: 
Got the following when executing one of the DAGs. 

Tez details:
versionInfo=[ component=tez-dag, version=0.8.3-SNAPSHOT, 
revision=3e409ae0ee7233b4cf631cac1bc366679a08b7d1, 
SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, 
buildTime=20160227-1912]

{noformat}
Invalid event TA_TEZ_EVENT_UPDATE on TaskAttempt 
attempt_1455662455106_2317_27_02_000339_0
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.tez.TezTask. Invalid event TA_TEZ_EVENT_UPDATE 
on TaskAttempt attempt_1455662455106_2317_27_02_000339_0
Exception in thread "75b0f971-7f89-461a-b432-45e1ac6e374b main" 
java.lang.AbstractMethodError: 
org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager.close()V
at org.apache.tez.client.TezClient.stop(TezClient.java:562)
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.closeClient(TezSessionState.java:474)
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:436)
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeIfNotDefault(TezSessionPoolManager.java:338)
at 
org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1469)
at 
org.apache.hadoop.hive.cli.CliSessionState.close(CliSessionState.java:66)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:719)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}

Additional note for later reference: Q51 in tpcds can possibly be used to 
reproduce this at 10 TB scale.

  was:
Got the following when executing one of the DAGs. 

Tez details:
versionInfo=[ component=tez-dag, version=0.8.3-SNAPSHOT, 
revision=3e409ae0ee7233b4cf631cac1bc366679a08b7d1, 
SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, 
buildTime=20160227-1912]

{noformat}
Invalid event TA_TEZ_EVENT_UPDATE on TaskAttempt 
attempt_1455662455106_2317_27_02_000339_0
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.tez.TezTask. Invalid event TA_TEZ_EVENT_UPDATE 
on TaskAttempt attempt_1455662455106_2317_27_02_000339_0
Exception in thread "75b0f971-7f89-461a-b432-45e1ac6e374b main" 
java.lang.AbstractMethodError: 
org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager.close()V
at org.apache.tez.client.TezClient.stop(TezClient.java:562)
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.closeClient(TezSessionState.java:474)
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:436)
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeIfNotDefault(TezSessionPoolManager.java:338)
at 
org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1469)
at 
org.apache.hadoop.hive.cli.CliSessionState.close(CliSessionState.java:66)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:719)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}


> Invalid event TA_TEZ_EVENT_UPDATE on TaskAttempt
> 
>
> Key: TEZ-3148
> URL: https://issues.apache.org/jira/browse/TEZ-3148
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
> Attachments: am.log.gz, dag.dot
>
>
> Got the following when executing one of the DAGs. 
> Tez details:
> versionInfo=[ component=tez-dag, version=0.8.3-SNAPSHOT, 
> revision=3e409ae0ee7233b4cf631cac1bc366679a08b7d1, 
> SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, 
> buildTime=20160227-1912]
> {noformat}
> Invalid event TA_TEZ_EVENT_UPDATE on TaskAttempt 
> attempt_1455662455106_2317_27_02_000339_0
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask. Invalid event TA_TEZ_EVENT_UPDATE 
> on TaskAttempt