Re: Review Request 35107: HIVE-6791 Support variable substitution for Beeline shell command

2015-06-09 Thread cheng xu


> On June 9, 2015, 12:41 p.m., Xuefu Zhang wrote:
> > Besides the two minor issues I found in the patch, I was wondering if the 
> > approach we are taking is the best. Variable substitution is a server (HS2) 
> > behavior, and on this ground I think this should happen in HS2 instead of 
> > beeline. Please note that JDBC client may also submit queries with $var in 
> > it, and such a case should be also supported.
> > 
> > I also noticed that in Driver class, there is code handling variable 
> > substitution. I'm wondering why it's not effective.
> > 
> > Shell commands (starting with !sh) are executed in the client (Beeline). I 
> > think we are fine if variable substitution doesn't work for shell commands. 
> > We can address that as a follow-up task if desirable.
> 
> cheng xu wrote:
> Thanks for your comments. 
> 
> `I also noticed that in Driver class, there is code handling variable 
> substitution. I'm wondering why it's not effective.`
> 
> The substitution currently works well in HS2.
> I have two reasons for adding an API that gets the conf from HS2. One is 
> to support substitution in the sh and source commands. In the old CLI, the 
> source and sh commands worked well with substitution, so this part of the 
> patch addresses that. The other is 
> https://issues.apache.org/jira/browse/HIVE-10847, which requires some 
> configuration from hive-site.xml.
> 
> Xuefu Zhang wrote:
> Yeah. It's a little trickier than I thought. Shell commands are executed on 
> the client side (Beeline), and it doesn't seem to make sense to use 
> server-specific variables (such as env, sys, hiveconf, hivevar) in shell 
> commands. More importantly, Beeline can connect to multiple servers at the 
> same time, so which configuration should be used to substitute the 
> variables? Users should be able to execute shell commands w/o any server 
> connection.
> 
> For the CLI, server and client live together, so none of this matters. But 
> for a beeline + HS2 deployment, the story will be different.
> 
> I don't know what's best; all I'm saying is that we need to be very 
> careful about what we are doing. Before we decide what to do, we need to 
> clearly define the problem we are trying to solve.
> 
> cheng xu wrote:
> Thank you for your prompt reply.
> `Shell commands are executed on the client side (Beeline), and it doesn't 
> seem to make sense to use server-specific variables (such as env, sys, 
> hiveconf, hivevar) in shell commands.`
> I'm not sure whether substitution for sh and source is useful. For beeline, 
> we can enable substitution only after a connection is established. For the 
> new CLI, which uses an embedded connection, it should be supported for 
> backwards compatibility.
> 
> I am a little confused about `connect to multiple servers at the same 
> time`. Does it mean you can use beeline to connect to any server in one 
> connection and you can have multiple beeline instances running? (It's the 
> case that user executes the command 
> */opt/apache-hive-1.2.0-SNAPSHOT-bin/bin/hive --service beeline* with specify 
> any hostname) 
> If so, I think it may cause some errors if no connection is available, since 
> the current implementation relies on a connection by using **SetProcessor**. 
> AFAIK, it's safe to get the configuration from HS2 via **SetProcessor**, 
> which is what beeline actually does after a connection is established. 
> A connection (session) should be associated with only one server. If the 
> user hasn't connected to any HS2, substitution for *sh* and *source* should 
> be disabled. To be honest, it will have some negative impact on 
> performance, since it requires executing a set command. For performance 
> reasons, we can make this support configurable.
> 
> In summary, for backwards compatibility, substitution for the source and sh 
> commands is enabled only once a connection is established. We can disable 
> the support for beeline if it is not reasonable or hurts performance. 
> For HIVE-10847, I think we still need a way to access the configuration 
> from the server side, but it is only needed when starting a connection.
> 
> Any thoughts?

Sorry for the typo below.
I am a little confused about connecting to multiple servers at the same time. 
Does it mean you can use beeline to connect to any server in one connection and 
you can have multiple beeline instances running? (This is the case where the 
user executes the command 
/opt/apache-hive-1.2.0-SNAPSHOT-bin/bin/hive --service beeline 
**without** specifying any hostname.)


- cheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35107/#review87090
---


On June 5, 2015, 10:09 a.m., cheng xu wrote:
> 
> ---
> This is an automatically generated 

Re: Review Request 35107: HIVE-6791 Support variable substitution for Beeline shell command

2015-06-09 Thread cheng xu


> On June 9, 2015, 12:41 p.m., Xuefu Zhang wrote:
> > Besides the two minor issues I found in the patch, I was wondering if the 
> > approach we are taking is the best. Variable substitution is a server (HS2) 
> > behavior, and on this ground I think this should happen in HS2 instead of 
> > beeline. Please note that JDBC client may also submit queries with $var in 
> > it, and such a case should be also supported.
> > 
> > I also noticed that in Driver class, there is code handling variable 
> > substitution. I'm wondering why it's not effective.
> > 
> > Shell commands (starting with !sh) are executed in the client (Beeline). I 
> > think we are fine if variable substitution doesn't work for shell commands. 
> > We can address that as a follow-up task if desirable.
> 
> cheng xu wrote:
> Thanks for your comments. 
> 
> `I also noticed that in Driver class, there is code handling variable 
> substitution. I'm wondering why it's not effective.`
> 
> The substitution currently works well in HS2.
> I have two reasons for adding an API that gets the conf from HS2. One is 
> to support substitution in the sh and source commands. In the old CLI, the 
> source and sh commands worked well with substitution, so this part of the 
> patch addresses that. The other is 
> https://issues.apache.org/jira/browse/HIVE-10847, which requires some 
> configuration from hive-site.xml.
> 
> Xuefu Zhang wrote:
> Yeah. It's a little trickier than I thought. Shell commands are executed on 
> the client side (Beeline), and it doesn't seem to make sense to use 
> server-specific variables (such as env, sys, hiveconf, hivevar) in shell 
> commands. More importantly, Beeline can connect to multiple servers at the 
> same time, so which configuration should be used to substitute the 
> variables? Users should be able to execute shell commands w/o any server 
> connection.
> 
> For the CLI, server and client live together, so none of this matters. But 
> for a beeline + HS2 deployment, the story will be different.
> 
> I don't know what's best; all I'm saying is that we need to be very 
> careful about what we are doing. Before we decide what to do, we need to 
> clearly define the problem we are trying to solve.

Thank you for your prompt reply.
`Shell commands are executed on the client side (Beeline), and it doesn't seem 
to make sense to use server-specific variables (such as env, sys, hiveconf, 
hivevar) in shell commands.`
I'm not sure whether substitution for sh and source is useful. For beeline, we 
can enable substitution only after a connection is established. For the new 
CLI, which uses an embedded connection, it should be supported for backwards 
compatibility.

I am a little confused about `connect to multiple servers at the same time`. 
Does it mean you can use beeline to connect to any server in one connection and 
you can have multiple beeline instances running? (It's the case that user 
executes the command */opt/apache-hive-1.2.0-SNAPSHOT-bin/bin/hive --service 
beeline* with specify any hostname) 
If so, I think it may cause some errors if no connection is available, since the 
current implementation relies on a connection by using **SetProcessor**. AFAIK, 
it's safe to get the configuration from HS2 via **SetProcessor**, which is what 
beeline actually does after a connection is established. A connection (session) 
should be associated with only one server. If the user hasn't connected to any 
HS2, substitution for *sh* and *source* should be disabled. To be honest, it 
will have some negative impact on performance, since it requires executing a 
set command. For performance reasons, we can make this support configurable.

In summary, for backwards compatibility, substitution for the source and sh 
commands is enabled only once a connection is established. We can disable the 
support for beeline if it is not reasonable or hurts performance. For 
HIVE-10847, I think we still need a way to access the configuration from the 
server side, but it is only needed when starting a connection.

Any thoughts?
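
To make the proposal above easier to discuss, here is a minimal sketch of the 
behavior being described: substitution in a sh/source command uses the 
configuration of the active connection and is skipped when there is none. This 
is not the code from the patch. The class name, the method, and the choice to 
handle only `${hiveconf:...}` are assumptions made for illustration, and the 
map stands in for whatever beeline would fetch from HS2 via the set command.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not the BeelineVariableSubstitution class from the patch. */
final class ShellCommandSubstitution {
  // Matches ${hiveconf:name}; everything else is left untouched.
  private static final Pattern VAR = Pattern.compile("\\$\\{hiveconf:([^}]+)\\}");

  /**
   * @param command    the text following !sh or source
   * @param serverConf variables of the active connection, or null when
   *                   beeline is not connected to any HS2 (assumed convention)
   */
  static String substitute(String command, Map<String, String> serverConf) {
    if (serverConf == null) {
      return command; // no connection: substitution is disabled
    }
    Matcher m = VAR.matcher(command);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String value = serverConf.get(m.group(1));
      // Unknown variables are kept verbatim instead of becoming "".
      String replacement = (value != null) ? value : m.group();
      m.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(out);
    return out.toString();
  }

  public static void main(String[] args) {
    // Map.of needs Java 9 or later.
    Map<String, String> conf = Map.of("hive.exec.scratchdir", "/tmp/hive");
    // prints: ls /tmp/hive
    System.out.println(substitute("ls ${hiveconf:hive.exec.scratchdir}", conf));
    // prints: ls ${hiveconf:hive.exec.scratchdir}
    System.out.println(substitute("ls ${hiveconf:hive.exec.scratchdir}", null));
  }
}
```

Passing the configuration in explicitly also makes the open question visible: 
with several connections, the caller has to decide which connection's map to 
hand over.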


- cheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35107/#review87090
---


On June 5, 2015, 10:09 a.m., cheng xu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35107/
> ---
> 
> (Updated June 5, 2015, 10:09 a.m.)
> 
> 
> Review request for hive, chinna and Xuefu Zhang.
> 
> 
> Bugs: HIVE-6791
> https://issues.apache.org/jira/browse/HIVE-6791
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Summary:
> 1) move the beeline-cli convertor to the place where cli is executed(class 
> **Commands**)
> 2) support substitution for

Re: Review Request 35107: HIVE-6791 Support variable substitution for Beeline shell command

2015-06-09 Thread Xuefu Zhang


> On June 9, 2015, 4:41 a.m., Xuefu Zhang wrote:
> > Besides the two minor issues I found in the patch, I was wondering if the 
> > approach we are taking is the best. Variable substitution is a server (HS2) 
> > behavior, and on this ground I think this should happen in HS2 instead of 
> > beeline. Please note that JDBC client may also submit queries with $var in 
> > it, and such a case should be also supported.
> > 
> > I also noticed that in Driver class, there is code handling variable 
> > substitution. I'm wondering why it's not effective.
> > 
> > Shell commands (starting with !sh) are executed in the client (Beeline). I 
> > think we are fine if variable substitution doesn't work for shell commands. 
> > We can address that as a follow-up task if desirable.
> 
> cheng xu wrote:
> Thanks for your comments. 
> 
> `I also noticed that in Driver class, there is code handling variable 
> substitution. I'm wondering why it's not effective.`
> 
> The substitution currently works well in HS2.
> I have two reasons for adding an API that gets the conf from HS2. One is 
> to support substitution in the sh and source commands. In the old CLI, the 
> source and sh commands worked well with substitution, so this part of the 
> patch addresses that. The other is 
> https://issues.apache.org/jira/browse/HIVE-10847, which requires some 
> configuration from hive-site.xml.

Yeah. It's a little trickier than I thought. Shell commands are executed on the 
client side (Beeline), and it doesn't seem to make sense to use server-specific 
variables (such as env, sys, hiveconf, hivevar) in shell commands. More 
importantly, Beeline can connect to multiple servers at the same time, so which 
configuration should be used to substitute the variables? Users should be able 
to execute shell commands w/o any server connection.

For the CLI, server and client live together, so none of this matters. But for 
a beeline + HS2 deployment, the story will be different.

I don't know what's best; all I'm saying is that we need to be very careful 
about what we are doing. Before we decide what to do, we need to clearly 
define the problem we are trying to solve.


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35107/#review87090
---


On June 5, 2015, 2:09 a.m., cheng xu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35107/
> ---
> 
> (Updated June 5, 2015, 2:09 a.m.)
> 
> 
> Review request for hive, chinna and Xuefu Zhang.
> 
> 
> Bugs: HIVE-6791
> https://issues.apache.org/jira/browse/HIVE-6791
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Summary:
> 1) move the beeline-cli convertor to the place where cli is executed(class 
> **Commands**)
> 2) support substitution for source command
> 3) add some unit test for substitution
> 4) add one way to get the configuration from HS2
> 
> 
> Diffs
> -
> 
>   beeline/src/java/org/apache/hive/beeline/BeeLine.java 45a7e87 
>   beeline/src/java/org/apache/hive/beeline/BeelineVariableSubstitution.java 
> PRE-CREATION 
>   beeline/src/java/org/apache/hive/beeline/Commands.java a42baa3 
>   beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java 6cbb030 
> 
> Diff: https://reviews.apache.org/r/35107/diff/
> 
> 
> Testing
> ---
> 
> Unit test passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>



Re: Review Request 35107: HIVE-6791 Support variable substitution for Beeline shell command

2015-06-09 Thread cheng xu


> On June 9, 2015, 12:41 p.m., Xuefu Zhang wrote:
> > Besides the two minor issues I found in the patch, I was wondering if the 
> > approach we are taking is the best. Variable substitution is a server (HS2) 
> > behavior, and on this ground I think this should happen in HS2 instead of 
> > beeline. Please note that JDBC client may also submit queries with $var in 
> > it, and such a case should be also supported.
> > 
> > I also noticed that in Driver class, there is code handling variable 
> > substitution. I'm wondering why it's not effective.
> > 
> > Shell commands (starting with !sh) are executed in the client (Beeline). I 
> > think we are fine if variable substitution doesn't work for shell commands. 
> > We can address that as a follow-up task if desirable.

Thanks for your comments. 

`I also noticed that in Driver class, there is code handling variable 
substitution. I'm wondering why it's not effective.`

The substitution currently works well in HS2.
I have two reasons for adding an API that gets the conf from HS2. One is to 
support substitution in the sh and source commands. In the old CLI, the source 
and sh commands worked well with substitution, so this part of the patch 
addresses that. The other is 
https://issues.apache.org/jira/browse/HIVE-10847, which requires some 
configuration from hive-site.xml.


- cheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35107/#review87090
---


On June 5, 2015, 10:09 a.m., cheng xu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35107/
> ---
> 
> (Updated June 5, 2015, 10:09 a.m.)
> 
> 
> Review request for hive, chinna and Xuefu Zhang.
> 
> 
> Bugs: HIVE-6791
> https://issues.apache.org/jira/browse/HIVE-6791
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Summary:
> 1) move the beeline-cli convertor to the place where cli is executed(class 
> **Commands**)
> 2) support substitution for source command
> 3) add some unit test for substitution
> 4) add one way to get the configuration from HS2
> 
> 
> Diffs
> -
> 
>   beeline/src/java/org/apache/hive/beeline/BeeLine.java 45a7e87 
>   beeline/src/java/org/apache/hive/beeline/BeelineVariableSubstitution.java 
> PRE-CREATION 
>   beeline/src/java/org/apache/hive/beeline/Commands.java a42baa3 
>   beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java 6cbb030 
> 
> Diff: https://reviews.apache.org/r/35107/diff/
> 
> 
> Testing
> ---
> 
> Unit test passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>



[jira] [Created] (HIVE-10974) Use Configuration::getRaw() for the Base64 data

2015-06-09 Thread Gopal V (JIRA)
Gopal V created HIVE-10974:
--

 Summary: Use Configuration::getRaw() for the Base64 data
 Key: HIVE-10974
 URL: https://issues.apache.org/jira/browse/HIVE-10974
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Gopal V


Inspired by the Twitter HadoopSummit talk

{code}
if (HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) {
  LOG.debug("Loading plan from string: " + path.toUri().getPath());
  String planString = conf.get(path.toUri().getPath());
{code}

Use getRaw() in other places where Base64 data is present.
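
For readers unfamiliar with the distinction: Hadoop's Configuration.get() 
expands ${...} references in the stored value, while getRaw() returns the value 
as stored. The report does not spell out the motivation; a plausible reading is 
that Base64 text can never contain a ${...} reference (the standard alphabet 
has no '$'), so the expansion pass is wasted work on a large plan string. The 
sketch below illustrates the difference with a tiny stand-in class. TinyConf 
and its single-pass expansion are assumptions for illustration, not Hadoop 
code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a configuration object; not Hadoop's Configuration. */
final class TinyConf {
  private static final Pattern VAR = Pattern.compile("\\$\\{([^}$]+)\\}");
  private final Map<String, String> props = new HashMap<>();

  void set(String name, String value) {
    props.put(name, value);
  }

  /** Returns the stored value untouched. */
  String getRaw(String name) {
    return props.get(name);
  }

  /** Returns the value after expanding ${other.key} references (single pass). */
  String get(String name) {
    String raw = props.get(name);
    if (raw == null) {
      return null;
    }
    Matcher m = VAR.matcher(raw);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String v = props.get(m.group(1));
      // Unresolvable references are left as written.
      m.appendReplacement(out, Matcher.quoteReplacement(v != null ? v : m.group()));
    }
    m.appendTail(out);
    return out.toString();
  }

  public static void main(String[] args) {
    TinyConf conf = new TinyConf();
    conf.set("user", "hive");
    conf.set("dir", "/tmp/${user}");
    conf.set("plan", Base64.getEncoder()
        .encodeToString("serialized plan".getBytes(StandardCharsets.UTF_8)));

    System.out.println(conf.get("dir"));     // prints: /tmp/hive
    System.out.println(conf.getRaw("dir"));  // prints: /tmp/${user}
    // For Base64 data both calls return the same string.
    System.out.println(conf.get("plan").equals(conf.getRaw("plan"))); // prints: true
  }
}
```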



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-10973) JvmPauseMonitor.incrementMetricsCounter NPE while starting HiveServer2

2015-06-09 Thread Hari Sankar Sivarama Subramaniyan (JIRA)
Hari Sankar Sivarama Subramaniyan created HIVE-10973:


 Summary: JvmPauseMonitor.incrementMetricsCounter NPE while 
starting HiveServer2
 Key: HIVE-10973
 URL: https://issues.apache.org/jira/browse/HIVE-10973
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan


I build and start HS2 in http mode as follows:
 ./hive --service hiveserver2 --hiveconf hive.server2.transport.mode=http 
--hiveconf hive.root.logger=DEBUG,console --hiveconf 
hive.server2.thrift.http.path=cliservice --hiveconf 
hive.server2.thrift.port=10001

I am hitting a NullPointerException around line 203, as follows:
{code}
15/06/09 13:46:01 
[org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor@5d648bfd]: WARN 
common.JvmPauseMonitor: Error Reporting JvmPauseMonitor to Metrics system
java.lang.NullPointerException
at 
org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor.incrementMetricsCounter(JvmPauseMonitor.java:203)
at 
org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:195)
at java.lang.Thread.run(Thread.java:745)
{code}





Re: Review Request 35256: HIVE-10956 HS2 leaks HMS connections

2015-06-09 Thread Jimmy Xiang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35256/
---

(Updated June 9, 2015, 8:54 p.m.)


Review request for hive and Xuefu Zhang.


Repository: hive-git


Description
---

Share the HMS connection per HS2 session, and close it when the session is 
closed.
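
As a rough illustration of the lifecycle described above (one connection per 
session, created on first use and closed together with the session), here is a 
minimal sketch. It is not the code in the diff. SessionSketch and FakeClient 
are hypothetical names, and FakeClient only counts open instances in place of 
a real metastore client.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical session that owns at most one client for its lifetime. */
final class SessionSketch implements AutoCloseable {

  /** Hypothetical stand-in for a metastore client; only tracks open/close. */
  static final class FakeClient implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger();

    FakeClient() {
      OPEN.incrementAndGet();
    }

    @Override
    public void close() {
      OPEN.decrementAndGet();
    }
  }

  private FakeClient client; // guarded by "this"
  private boolean closed;    // guarded by "this"

  /** Lazily creates the client on first use and reuses it afterwards. */
  synchronized FakeClient getClient() {
    if (closed) {
      throw new IllegalStateException("session is closed");
    }
    if (client == null) {
      client = new FakeClient();
    }
    return client;
  }

  /** Closes the shared client exactly once, together with the session. */
  @Override
  public synchronized void close() {
    if (closed) {
      return;
    }
    closed = true;
    if (client != null) {
      client.close();
      client = null;
    }
  }

  public static void main(String[] args) {
    try (SessionSketch session = new SessionSketch()) {
      session.getClient();
      session.getClient(); // reuses the same client
      System.out.println("open during session: " + FakeClient.OPEN.get()); // 1
    }
    System.out.println("open after close: " + FakeClient.OPEN.get()); // 0
  }
}
```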


Diffs (updated)
-

  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
f5816a0 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8c948a9 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
33ee16b 
  service/src/java/org/apache/hive/service/cli/session/HiveSession.java 65f9b29 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
343c68e 
  
service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
 a29e5d1 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
5a0f1c8 

Diff: https://reviews.apache.org/r/35256/diff/


Testing
---

Live cluster


Thanks,

Jimmy Xiang



Re: Review Request 35256: HIVE-10956 HS2 leaks HMS connections

2015-06-09 Thread Jimmy Xiang


> On June 9, 2015, 8:52 p.m., Xuefu Zhang wrote:
> > service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java, 
> > line 596
> > 
> >
> > If this requires synchronization, I guess the whole method will need to 
> > be synchronized. I can see the code above accessing shared member variables.

Sure. Will synchronize the whole method, to be safe.


- Jimmy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35256/#review87291
---


On June 9, 2015, 5 p.m., Jimmy Xiang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35256/
> ---
> 
> (Updated June 9, 2015, 5 p.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Share the HMS connection per HS2 session, and close it when the session is 
> closed.
> 
> 
> Diffs
> -
> 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
> f5816a0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8c948a9 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
> 33ee16b 
>   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
> 65f9b29 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> 343c68e 
>   
> service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
>  a29e5d1 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
> 5a0f1c8 
> 
> Diff: https://reviews.apache.org/r/35256/diff/
> 
> 
> Testing
> ---
> 
> Live cluster
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>



Re: Review Request 35256: HIVE-10956 HS2 leaks HMS connections

2015-06-09 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35256/#review87291
---



service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java


If this requires synchronization, I guess the whole method will need to be 
synchronized. I can see the code above accessing shared member variables.
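
To illustrate the concern, here is a small sketch of the two shapes. It is not 
taken from HiveSessionImpl; the class, fields, and methods are hypothetical. In 
the first method only the tail is locked, so the earlier write to a shared 
field is unprotected. In the second the whole method holds the lock.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical session-like class; names do not come from HiveSessionImpl. */
final class SyncSketch {
  private final List<String> handles = new ArrayList<>(); // guarded by "this"
  private long lastAccessTime;                             // guarded by "this"

  // Questioned shape: only the tail is locked, so the write to
  // lastAccessTime races with other threads.
  void addPartiallyLocked(String handle, long now) {
    lastAccessTime = now;
    synchronized (this) {
      handles.add(handle);
    }
  }

  // Suggested shape: every access to shared state happens under the lock.
  synchronized void add(String handle, long now) {
    lastAccessTime = now;
    handles.add(handle);
  }

  synchronized int size() {
    return handles.size();
  }

  synchronized long lastAccessTime() {
    return lastAccessTime;
  }

  /** Runs several writer threads against one instance; returns the final size. */
  static int runConcurrently(int threads, int perThread) {
    SyncSketch s = new SyncSketch();
    List<Thread> workers = new ArrayList<>();
    for (int t = 0; t < threads; t++) {
      Thread w = new Thread(() -> {
        for (int i = 0; i < perThread; i++) {
          s.add("op-" + i, System.nanoTime());
        }
      });
      workers.add(w);
      w.start();
    }
    for (Thread w : workers) {
      try {
        w.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return s.size();
  }

  public static void main(String[] args) {
    System.out.println(runConcurrently(4, 1000)); // prints: 4000
  }
}
```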


- Xuefu Zhang


On June 9, 2015, 5 p.m., Jimmy Xiang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35256/
> ---
> 
> (Updated June 9, 2015, 5 p.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Share the HMS connection per HS2 session, and close it when the session is 
> closed.
> 
> 
> Diffs
> -
> 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
> f5816a0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8c948a9 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
> 33ee16b 
>   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
> 65f9b29 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> 343c68e 
>   
> service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
>  a29e5d1 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
> 5a0f1c8 
> 
> Diff: https://reviews.apache.org/r/35256/diff/
> 
> 
> Testing
> ---
> 
> Live cluster
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>



[jira] [Created] (HIVE-10972) DummyTxnManager always locks the current database in shared mode, which is incorrect.

2015-06-09 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-10972:
---

 Summary: DummyTxnManager always locks the current database in 
shared mode, which is incorrect.
 Key: HIVE-10972
 URL: https://issues.apache.org/jira/browse/HIVE-10972
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


In DummyTxnManager [line 163 | 
http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163],
 it always locks the current database. 

That is not correct, since the current database can be "db1" while the query is 
"select * from db2.tb1", which will lock db1 unnecessarily.
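
One possible direction, sketched below, is to derive the databases to lock from 
the tables the query actually references and to fall back to the current 
database only for unqualified table names. This is an illustration, not the fix 
adopted for this issue; LockTargets and databasesToLock are hypothetical names, 
and table references are assumed to be plain "db.table" or "table" strings.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of DummyTxnManager. */
final class LockTargets {
  /**
   * @param currentDb database selected for the session
   * @param tableRefs table references as written in the query,
   *                  for example "db2.tb1" or "tb1"
   * @return the databases the query touches, in first-seen order
   */
  static Set<String> databasesToLock(String currentDb, List<String> tableRefs) {
    Set<String> dbs = new LinkedHashSet<>();
    for (String ref : tableRefs) {
      int dot = ref.indexOf('.');
      // Qualified names carry their own database; others use the current one.
      dbs.add(dot >= 0 ? ref.substring(0, dot) : currentDb);
    }
    return dbs;
  }

  public static void main(String[] args) {
    // The case from the report: current database db1, query reads db2.tb1.
    System.out.println(databasesToLock("db1", List.of("db2.tb1")));        // [db2]
    System.out.println(databasesToLock("db1", List.of("tb1", "db2.tb1"))); // [db1, db2]
  }
}
```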





Review Request 35256: HIVE-10956 HS2 leaks HMS connections

2015-06-09 Thread Jimmy Xiang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35256/
---

Review request for hive and Xuefu Zhang.


Repository: hive-git


Description
---

Share the HMS connection per HS2 session, and close it when the session is 
closed.


Diffs
-

  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
f5816a0 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8c948a9 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
33ee16b 
  service/src/java/org/apache/hive/service/cli/session/HiveSession.java 65f9b29 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
343c68e 
  
service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
 a29e5d1 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
5a0f1c8 

Diff: https://reviews.apache.org/r/35256/diff/


Testing
---

Live cluster


Thanks,

Jimmy Xiang



Re: Review Request 35218: HIVE-10963 Hive throws NPE rather than meaningful error message when window is missing

2015-06-09 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35218/
---

(Updated June 9, 2015, 2:40 p.m.)


Review request for hive.


Repository: hive-git


Description
---

HIVE-10963 Hive throws NPE rather than meaningful error message when window is 
missing


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 48e3cc7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/PTFInvocationSpec.java 06d3f4b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingSpec.java 953f3ae 
  ql/src/test/queries/clientnegative/ptf_negative_NoWindowDefn.q PRE-CREATION 
  ql/src/test/results/clientnegative/ptf_negative_NoWindowDefn.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/35218/diff/


Testing
---


Thanks,

Aihua Xu



Re: Review Request 35181: HIVE-10944 : Fix HS2 for Metrics

2015-06-09 Thread Lenni Kuff

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35181/#review87169
---

Ship it!


- Lenni Kuff


On June 7, 2015, 11:47 p.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35181/
> ---
> 
> (Updated June 7, 2015, 11:47 p.m.)
> 
> 
> Review request for hive and Sergey Shelukhin.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Eliminated the redundant conf checks and the synchronization in the 
> code path by making the static Metrics instance a static volatile 
> variable. Achieved this by removing the Metrics init() method and moving 
> its logic directly into the constructor.
> 
> Left some of the synchronization in the old LegacyMetrics the same.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java c3949f2 
>   common/src/java/org/apache/hadoop/hive/common/metrics/LegacyMetrics.java 
> 14f7afb 
>   common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
> 13a5336 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
>  12a309d 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java
>  e59da99 
>   
> common/src/test/org/apache/hadoop/hive/common/metrics/TestLegacyMetrics.java 
> c14c7ee 
>   
> common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java
>  8749349 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> 85a734c 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 7820ed5 
> 
> Diff: https://reviews.apache.org/r/35181/diff/
> 
> 
> Testing
> ---
> 
> Ran affected tests, ran HS2 with and without metrics enabled.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>



Re: Review Request 35181: HIVE-10944 : Fix HS2 for Metrics

2015-06-09 Thread Lenni Kuff

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35181/#review87168
---



service/src/java/org/apache/hive/service/server/HiveServer2.java


nit: do you need the getInstance() == null check here? It seems like 
MetricsFactory.init() handles this.
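
A small sketch of what I mean, assuming init() is idempotent: if the factory 
only builds the instance when none exists, callers can invoke it without their 
own null check, and the volatile field keeps reads lock-free. MetricsHolder and 
its Metrics interface are hypothetical names used for illustration, not the 
classes in the patch.

```java
import java.util.function.Supplier;

/** Hypothetical holder; not the MetricsFactory from the patch. */
final class MetricsHolder {
  /** Hypothetical minimal metrics interface. */
  interface Metrics {
    void increment(String name);
  }

  // volatile: readers see a fully constructed instance without locking.
  private static volatile Metrics instance;

  /** Idempotent: builds the instance only if none exists yet. */
  static synchronized Metrics init(Supplier<Metrics> factory) {
    if (instance == null) {
      instance = factory.get();
    }
    return instance;
  }

  /** Lock-free read on the hot path; returns null before init(). */
  static Metrics getInstance() {
    return instance;
  }

  static synchronized void shutdown() {
    instance = null;
  }

  public static void main(String[] args) {
    Metrics first = init(() -> name -> { });
    Metrics second = init(() -> name -> { }); // no null check needed by the caller
    System.out.println(first == second);      // prints: true
  }
}
```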


- Lenni Kuff


On June 7, 2015, 11:47 p.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35181/
> ---
> 
> (Updated June 7, 2015, 11:47 p.m.)
> 
> 
> Review request for hive and Sergey Shelukhin.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Eliminated the redundant conf checks and the synchronization in the 
> code path by making the static Metrics instance a static volatile 
> variable. Achieved this by removing the Metrics init() method and moving 
> its logic directly into the constructor.
> 
> Left some of the synchronization in the old LegacyMetrics the same.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java c3949f2 
>   common/src/java/org/apache/hadoop/hive/common/metrics/LegacyMetrics.java 
> 14f7afb 
>   common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
> 13a5336 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
>  12a309d 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java
>  e59da99 
>   
> common/src/test/org/apache/hadoop/hive/common/metrics/TestLegacyMetrics.java 
> c14c7ee 
>   
> common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java
>  8749349 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> 85a734c 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 7820ed5 
> 
> Diff: https://reviews.apache.org/r/35181/diff/
> 
> 
> Testing
> ---
> 
> Ran affected tests, ran HS2 with and without metrics enabled.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>



[jira] [Created] (HIVE-10971) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true

2015-06-09 Thread wangmeng (JIRA)
wangmeng created HIVE-10971:
---

 Summary: count(*) with count(distinct) gives wrong results when 
hive.groupby.skewindata=true
 Key: HIVE-10971
 URL: https://issues.apache.org/jira/browse/HIVE-10971
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: wangmeng
Assignee: wangmeng


When hive.groupby.skewindata=true, the following query based on TPC-H gives 
wrong results:

{code}
set hive.groupby.skewindata=true;

select l_returnflag, count(*), count(distinct l_linestatus)
from lineitem
group by l_returnflag
limit 10;
{code}

The query plan shows that it generates only one MapReduce job instead of the 
two that hive.groupby.skewindata=true should dictate.

The problem arises only when {noformat}count(*){noformat} and 
{noformat}count(distinct){noformat} exist together.


