[jira] Commented: (HIVE-1671) multithreading on Context.pathToCS

2010-09-27 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915508#action_12915508
 ] 

Namit Jain commented on HIVE-1671:
--

OK, I can now see the problem.

+1


> multithreading on Context.pathToCS
> --
>
> Key: HIVE-1671
> URL: https://issues.apache.org/jira/browse/HIVE-1671
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Bennie Schut
> Assignee: Bennie Schut
> Fix For: 0.7.0
>
> Attachments: HIVE-1671-1.patch
>
>
> We have 2 threads running at 100% CPU,
> with a stack trace like this:
> "Thread-16725" prio=10 tid=0x7ff410662000 nid=0x497d runnable [0x442eb000]
>    java.lang.Thread.State: RUNNABLE
>     at java.util.HashMap.get(HashMap.java:303)
>     at org.apache.hadoop.hive.ql.Context.getCS(Context.java:524)
>     at org.apache.hadoop.hive.ql.exec.Utilities.getInputSummary(Utilities.java:1369)
>     at org.apache.hadoop.hive.ql.exec.MapRedTask.estimateNumberOfReducers(MapRedTask.java:329)
>     at org.apache.hadoop.hive.ql.exec.MapRedTask.setNumberOfReducers(MapRedTask.java:297)
>     at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:84)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:108)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1671) multithreading on Context.pathToCS

2010-09-27 Thread Bennie Schut (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915468#action_12915468
 ] 

Bennie Schut commented on HIVE-1671:


Sorry, I was a bit short on the description. I'm running HiveServer with 
hive.exec.parallel set to true. It ran many jobs each day for about a week 
after startup. Then I noticed 2 threads stuck at 100% CPU for about 3 days. I 
used jstack to look at both threads, and they showed the same stack trace.




[jira] Commented: (HIVE-1671) multithreading on Context.pathToCS

2010-09-27 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915440#action_12915440
 ] 

Namit Jain commented on HIVE-1671:
--

Are you using HiveServer?

bq. We have 2 threads running at 100%

What do you mean by the above? Are you setting hive.exec.parallel to true? In 
that case, I can see the problem happening.




[jira] Commented: (HIVE-1671) multithreading on Context.pathToCS

2010-09-27 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915275#action_12915275
 ] 

HBase Review Board commented on HIVE-1671:
--

Message from: "Bennie Schut" 

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/909/
---

Review request for Hive Developers.


Summary
---

Simple change: replace the HashMap with a ConcurrentHashMap.


This addresses bug HIVE-1671.
http://issues.apache.org/jira/browse/HIVE-1671


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java 1001658 

Diff: http://review.cloudera.org/r/909/diff


Testing
---


Thanks,

Bennie







[jira] Commented: (HIVE-1671) multithreading on Context.pathToCS

2010-09-27 Thread Bennie Schut (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915262#action_12915262
 ] 

Bennie Schut commented on HIVE-1671:


Perhaps, because we now have multiple subqueries running in Hive for the same 
overall query, this map can be used concurrently?
We could fix this simply by using a ConcurrentHashMap:


  private Map pathToCS = new ConcurrentHashMap();
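
For context: java.util.HashMap is documented as unsynchronized, and 
unsynchronized concurrent writes can corrupt its internal structure so that a 
later get() never returns, which would match the stack trace. Below is a minimal 
sketch of the proposed idea, not the actual patch. PathSummaryCache, addCS, 
fillConcurrently and the paths are hypothetical names, and Long stands in for 
Hadoop's ContentSummary so the example has no dependencies.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the path-to-summary cache held by Context.
public class PathSummaryCache {
    // ConcurrentHashMap tolerates concurrent get/put without external locking.
    private final Map<String, Long> pathToCS = new ConcurrentHashMap<String, Long>();

    public Long getCS(String path) {
        return pathToCS.get(path);
    }

    public void addCS(String path, long length) {
        pathToCS.put(path, length);
    }

    public int size() {
        return pathToCS.size();
    }

    // Writes and reads the cache from several threads at once, the way
    // parallel tasks of one query might, and returns the filled cache.
    public static PathSummaryCache fillConcurrently(int threads, final int pathsPerThread) {
        final PathSummaryCache cache = new PathSummaryCache();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(new Runnable() {
                public void run() {
                    for (int i = 0; i < pathsPerThread; i++) {
                        String path = "/warehouse/t" + id + "/part-" + i; // made-up path
                        cache.addCS(path, i);
                        cache.getCS(path);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache;
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 10000).size());
    }
}
```

Note that ConcurrentHashMap only makes individual calls safe. A check-then-put 
sequence would still need putIfAbsent or similar if duplicate computation matters.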

