Re: Review Request 34776: HIVE-4239 : Remove lock on compilation stage

2015-06-04 Thread Thejas Nair


 On June 4, 2015, 10:31 p.m., Thejas Nair wrote:
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java,
   line 55
  https://reviews.apache.org/r/34776/diff/5/?file=976068#file976068line55
 
  HiveSessionProxy is the one that does the doAs().
  
  I looked some more, and it looks like this should not cause a problem with 
  the safeguards added in HIVE-7890 (but I think we should treat that change 
  as a safeguard, as I am not sure of the performance implications of that 
  additional check every time).
  
  However, with this approach, we would end up creating a new Hive object 
  in most cases if there are a large number of users. Also, the checks done in 
  Hive.get(conf) to see whether a new object needs to be created are not trivial.
  
  Cases where sessions are used in parallel are rare, so paying this 
  performance penalty for the more common case is not really justified IMO.
  
  I think we should just use locks around the metastore client class for 
  that. I think that's something we should pursue in a different jira.
 
 Sergey Shelukhin wrote:
 Does it require anything in this JIRA?

Yes, remove the changes to HiveSessionImplwithUGI.java from this jira.


- Thejas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34776/#review86727
---


On June 2, 2015, 7:07 p.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34776/
 ---
 
 (Updated June 2, 2015, 7:07 p.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d733d71 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 5dac29f 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/RemoveDynamicPruningBySize.java
  5d01311 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java adc31ae 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 0edfc5d 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 6db8220 
   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 56707af 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 37b6d6f 
   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
 343c68e 
   
 service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
  a29e5d1 
   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java b4d517f 
 
 Diff: https://reviews.apache.org/r/34776/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




[jira] [Created] (HIVE-10939) Make TestFileDump robust

2015-06-04 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-10939:
---

 Summary: Make TestFileDump robust
 Key: HIVE-10939
 URL: https://issues.apache.org/jira/browse/HIVE-10939
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


It fails on Windows OS currently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 35107: HIVE-6791 Support variable substition for Beeline shell command

2015-06-04 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35107/
---

(Updated June 5, 2015, 10:09 a.m.)


Review request for hive, chinna and Xuefu Zhang.


Bugs: HIVE-6791
https://issues.apache.org/jira/browse/HIVE-6791


Repository: hive-git


Description
---

Summary:
1) move the beeline-cli converter to the place where the CLI is executed (class 
**Commands**)
2) support substitution for the source command
3) add unit tests for substitution
4) add a way to get the configuration from HS2
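
For readers unfamiliar with the feature, the substitution mentioned in items 2) and 3) can be sketched roughly as follows. This is only an illustration under assumptions: the class name `SubstitutionSketch`, the plain `${name}` syntax, and the variable map are hypothetical and are not taken from BeelineVariableSubstitution.java.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the class added by this patch.
final class SubstitutionSketch {
    // Matches ${name}; the name may contain letters, digits, '.', ':' and '_'.
    private static final Pattern VAR = Pattern.compile("\\$\\{([A-Za-z0-9_.:]+)\\}");

    // Replaces each ${name} with its value from vars; unknown names are left untouched.
    static String substitute(String command, Map<String, String> vars) {
        Matcher m = VAR.matcher(command);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement stops '$' and '\' in the value from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = Map.of("script.dir", "/tmp/scripts");
        // prints "source /tmp/scripts/init.sql"
        System.out.println(substitute("source ${script.dir}/init.sql", vars));
    }
}
```

Leaving unknown variables untouched is a design choice made for the sketch; the patch may well treat them differently.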


Diffs (updated)
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 45a7e87 
  beeline/src/java/org/apache/hive/beeline/BeelineVariableSubstitution.java 
PRE-CREATION 
  beeline/src/java/org/apache/hive/beeline/Commands.java a42baa3 
  beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java 6cbb030 

Diff: https://reviews.apache.org/r/35107/diff/


Testing
---

Unit test passed


Thanks,

cheng xu



[jira] [Created] (HIVE-10943) Beeline-cli: Enable precommit for beelie-cli branch

2015-06-04 Thread Ferdinand Xu (JIRA)
Ferdinand Xu created HIVE-10943:
---

 Summary: Beeline-cli: Enable precommit for beelie-cli branch 
 Key: HIVE-10943
 URL: https://issues.apache.org/jira/browse/HIVE-10943
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor


NO PRECOMMIT TESTS






[jira] [Created] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

2015-06-04 Thread Gopal V (JIRA)
Gopal V created HIVE-10940:
--

 Summary: HiveInputFormat::pushFilters serializes PPD objects for 
each getRecordReader call
 Key: HIVE-10940
 URL: https://issues.apache.org/jira/browse/HIVE-10940
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Sergey Shelukhin


{code}
String filterText = filterExpr.getExprString();
String filterExprSerialized = Utilities.serializeExpression(filterExpr);
{code}

serializeExpression initializes Kryo and produces a new packed object for 
every split.

The call path is HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters.

Kryo is very slow at this for a large filter clause.
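
One possible direction, sketched under assumptions, is to serialize each distinct filter once and reuse the result across splits. `FilterSerializationCache` and the `serializer` function are hypothetical stand-ins (the real call is Utilities.serializeExpression(filterExpr)), and keying by the filter text is an assumption that may not hold if two different expressions print the same text. This is not necessarily how the issue will be fixed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical cache keyed by the filter's text form.
final class FilterSerializationCache {
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> serializer;

    // serializer stands in for the expensive Kryo-based serialization step.
    FilterSerializationCache(Function<String, String> serializer) {
        this.serializer = serializer;
    }

    // Runs the serializer at most once per distinct filter text.
    String serialized(String filterText) {
        return cache.computeIfAbsent(filterText, serializer);
    }
}
```

With a cache like this, repeated getRecordReader calls for the same filter would pay the serialization cost only on the first call.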






[jira] [Created] (HIVE-10941) Provide option to disable spark tests outside itests

2015-06-04 Thread Hari Sankar Sivarama Subramaniyan (JIRA)
Hari Sankar Sivarama Subramaniyan created HIVE-10941:


 Summary: Provide option to disable spark tests outside itests
 Key: HIVE-10941
 URL: https://issues.apache.org/jira/browse/HIVE-10941
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan


HIVE-10477 provided an option to disable the spark module; however, we missed the 
following tests, which are outside the itests directory. That is, the 
option also needs to disable the following tests:
{code}
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{code}





[jira] [Created] (HIVE-10942) LLAP: expose what's running on the daemon thru JMX

2015-06-04 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-10942:
---

 Summary: LLAP: expose what's running on the daemon thru JMX
 Key: HIVE-10942
 URL: https://issues.apache.org/jira/browse/HIVE-10942
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap








Review Request 35107: HIVE-6791 Support variable substition for Beeline shell command

2015-06-04 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35107/
---

Review request for hive, chinna and Xuefu Zhang.


Bugs: HIVE-6791
https://issues.apache.org/jira/browse/HIVE-6791


Repository: hive-git


Description
---

Summary:
1) move the beeline-cli converter to the place where the CLI is executed (class 
**Commands**)
2) support substitution for the source command
3) add unit tests for substitution
4) add a way to get the configuration from HS2


Diffs
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 45a7e87 
  beeline/src/java/org/apache/hive/beeline/BeelineVariableSubstitution.java 
PRE-CREATION 
  beeline/src/java/org/apache/hive/beeline/Commands.java a42baa3 
  beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java 6cbb030 

Diff: https://reviews.apache.org/r/35107/diff/


Testing
---

Unit test passed


Thanks,

cheng xu



[jira] [Created] (HIVE-10938) All the analyze table statements are failing on encryption testing framework

2015-06-04 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-10938:
--

 Summary: All the analyze table statements are failing on 
encryption testing framework
 Key: HIVE-10938
 URL: https://issues.apache.org/jira/browse/HIVE-10938
 Project: Hive
  Issue Type: Bug
Reporter: Pengcheng Xiong


To reproduce, in a recent q-test environment, create a q file:
{code}
drop table IF EXISTS unencryptedTable;
create table unencryptedTable(key string, value string);
insert into table unencryptedTable values
('501', 'val_501'),
('502', 'val_502');
analyze table unencryptedTable compute statistics;
explain select * from unencryptedTable;
{code}
Then run it with TestEncryptedHDFSCliDriver.
analyze table generates a MapRed task and a StatsTask. The MapRed task 
fails silently without generating the stats (e.g., numRows for the table), so 
the following StatsTask cannot read any results. This fails not only for 
encrypted tables but also for non-encrypted ones, as shown above.





[jira] [Created] (HIVE-10936) incorrect result set when hive.vectorized.execution.enabled = true with predicate casting to CHAR or VARCHAR

2015-06-04 Thread N Campbell (JIRA)
N Campbell created HIVE-10936:
-

 Summary: incorrect result set when 
hive.vectorized.execution.enabled = true with predicate casting to CHAR or 
VARCHAR
 Key: HIVE-10936
 URL: https://issues.apache.org/jira/browse/HIVE-10936
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 0.14.0
 Environment: In this case using HDP install of Hive - 0.14.0.2.2.4.2-2


Reporter: N Campbell


The query returns data when hive.vectorized.execution.enabled = false, or when 
the target of the CAST is STRING rather than CHAR/VARCHAR.

set hive.vectorized.execution.enabled = true;

select 
`GO_TIME_DIM`.`day_key`
from 
`gosalesdw1021`.`go_time_dim` `GO_TIME_DIM` 
where 
CAST(`GO_TIME_DIM`.`current_year` AS CHAR(4)) = '2010' 
group by 
`GO_TIME_DIM`.`day_key`;


create table GO_TIME_DIM ( DAY_KEY int , DAY_DATE timestamp , MONTH_KEY int , 
CURRENT_MONTH smallint , MONTH_NUMBER int , QUARTER_KEY int , CURRENT_QUARTER 
smallint , CURRENT_YEAR smallint , DAY_OF_WEEK smallint , DAY_OF_MONTH smallint 
, DAYS_IN_MONTH smallint , DAY_OF_YEAR smallint , WEEK_OF_MONTH smallint , 
WEEK_OF_QUARTER smallint , WEEK_OF_YEAR smallint , MONTH_EN string , WEEKDAY_EN 
string , MONTH_DE string , WEEKDAY_DE string , MONTH_FR string , WEEKDAY_FR 
string , MONTH_JA string , WEEKDAY_JA string , MONTH_AR string , WEEKDAY_AR 
string , MONTH_CS string , WEEKDAY_CS string , MONTH_DA string , WEEKDAY_DA 
string , MONTH_EL string , WEEKDAY_EL string , MONTH_ES string , WEEKDAY_ES 
string , MONTH_FI string , WEEKDAY_FI string , MONTH_HR string , WEEKDAY_HR 
string , MONTH_HU string , WEEKDAY_HU string , MONTH_ID string , WEEKDAY_ID 
string , MONTH_IT string , WEEKDAY_IT string , MONTH_KK string , WEEKDAY_KK 
string , MONTH_KO string , WEEKDAY_KO string , MONTH_MS string , WEEKDAY_MS 
string , MONTH_NL string , WEEKDAY_NL string , MONTH_NO string , WEEKDAY_NO 
string , MONTH_PL string , WEEKDAY_PL string , MONTH_PT string , WEEKDAY_PT 
string , MONTH_RO string , WEEKDAY_RO string , MONTH_RU string , WEEKDAY_RU 
string , MONTH_SC string , WEEKDAY_SC string , MONTH_SL string , WEEKDAY_SL 
string , MONTH_SV string , WEEKDAY_SV string , MONTH_TC string , WEEKDAY_TC 
string , MONTH_TH string , WEEKDAY_TH string , MONTH_TR string , WEEKDAY_TR 
string )
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
 STORED AS TEXTFILE
LOCATION '../GO_TIME_DIM';

Then create an equivalent ORC table and load it:

insert overwrite table 
GO_TIME_DIM
select * from TEXT.GO_TIME_DIM
;






[jira] [Created] (HIVE-10937) LLAP: make ObjectCache for plans work properly in the daemon

2015-06-04 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-10937:
---

 Summary: LLAP: make ObjectCache for plans work properly in the 
daemon
 Key: HIVE-10937
 URL: https://issues.apache.org/jira/browse/HIVE-10937
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


There's a perf hit otherwise, especially when the planner creates 1009 reducers of 
4Mb each.





Re: Review Request 34776: HIVE-4239 : Remove lock on compilation stage

2015-06-04 Thread Sergey Shelukhin


 On June 4, 2015, 10:31 p.m., Thejas Nair wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java, line 184
  https://reviews.apache.org/r/34776/diff/5-6/?file=976065#file976065line184
 
  Should this also be static?

No, see the class comment: tests override it.


 On June 4, 2015, 10:31 p.m., Thejas Nair wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java, line 209
  https://reviews.apache.org/r/34776/diff/5-6/?file=976065#file976065line209
 
  Also make static?

Same as above.


 On June 4, 2015, 10:31 p.m., Thejas Nair wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java, line 321
  https://reviews.apache.org/r/34776/diff/6/?file=976761#file976761line321
 
  Can't everything in GenTezUtils be made static, so that we don't 
  need to create any instances of it?
  Maybe make the constructor of GenTezUtils private.

See above?


 On June 4, 2015, 10:31 p.m., Thejas Nair wrote:
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java,
   line 55
  https://reviews.apache.org/r/34776/diff/5/?file=976068#file976068line55
 
  HiveSessionProxy is the one that does the doAs().
  
  I looked some more, and it looks like this should not cause a problem with 
  the safeguards added in HIVE-7890 (but I think we should treat that change 
  as a safeguard, as I am not sure of the performance implications of that 
  additional check every time).
  
  However, with this approach, we would end up creating a new Hive object 
  in most cases if there are a large number of users. Also, the checks done in 
  Hive.get(conf) to see whether a new object needs to be created are not trivial.
  
  Cases where sessions are used in parallel are rare, so paying this 
  performance penalty for the more common case is not really justified IMO.
  
  I think we should just use locks around the metastore client class for 
  that. I think that's something we should pursue in a different jira.

Does it require anything in this JIRA?


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34776/#review86727
---






IMPORTANT - fix version

2015-06-04 Thread Thejas Nair
As branch-1 has been created, if the patch is committed to master,
please set the fix version to 2.0.0. If the patch is also committed to
branch-1, please add 1.3.0 as the fix version.

For now, I will do a bulk update to add 2.0.0 to any issue that has 1.3.0 as
the fix version. I am assuming that so far, anything that has gone
into 1.3.0 (branch-1) has also gone into 2.0.0 (master).

If something has gone into master only and has 1.3.0 as fix version,
we need to correct that (either by using 'git cherry-pick' to bring it into
branch-1 or by removing 1.3.0 from the fix version).

Thanks,
Thejas


Re: Review Request 34776: HIVE-4239 : Remove lock on compilation stage

2015-06-04 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34776/#review86727
---



ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java
https://reviews.apache.org/r/34776/#comment138788

Should this also be static?



ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java
https://reviews.apache.org/r/34776/#comment138789

Also make static?



service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
https://reviews.apache.org/r/34776/#comment138791

HiveSessionProxy is the one that does the doAs().

I looked some more, and it looks like this should not cause a problem with the 
safeguards added in HIVE-7890 (but I think we should treat that change as a 
safeguard, as I am not sure of the performance implications of that additional 
check every time).

However, with this approach, we would end up creating a new Hive object in 
most cases if there are a large number of users. Also, the checks done in 
Hive.get(conf) to see whether a new object needs to be created are not trivial.

Cases where sessions are used in parallel are rare, so paying this 
performance penalty for the more common case is not really justified IMO.

I think we should just use locks around the metastore client class for that. I 
think that's something we should pursue in a different jira.
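
To make the locking suggestion concrete, here is a minimal sketch under assumptions. `LockedClient` is a hypothetical wrapper and the type parameter `C` stands in for a client that is not thread-safe; nothing here is the actual metastore client API or a proposed patch.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical wrapper: C stands in for a client type that is not thread-safe.
final class LockedClient<C> {
    private final C client;
    private final ReentrantLock lock = new ReentrantLock();

    LockedClient(C client) {
        this.client = client;
    }

    // Runs one call against the client while holding the lock, so parallel
    // operations that share this wrapper are serialized on the client.
    <R> R call(Function<C, R> op) {
        lock.lock();
        try {
            return op.apply(client);
        } finally {
            lock.unlock();
        }
    }
}
```

In the common case of one operation at a time per session the lock is uncontended, which is cheap compared with creating a new Hive object per call.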



ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java
https://reviews.apache.org/r/34776/#comment138792

Thanks for cleaning up the utils and threadlocal use!



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
https://reviews.apache.org/r/34776/#comment138790

Can't everything in GenTezUtils be made static, so that we don't need 
to create any instances of it?
Maybe make the constructor of GenTezUtils private.
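
For context, a minimal sketch of the two shapes this comment weighs. `StaticUtils`, `OverridableUtils`, and `describe` are hypothetical names for illustration, not members of GenTezUtils. Static methods cannot be overridden, so a class that tests need to subclass has to keep instance methods.

```java
// Shape 1: everything static, private constructor, no instances.
final class StaticUtils {
    private StaticUtils() { }

    static String describe(String name) {
        return "work:" + name;
    }
}

// Shape 2: instance methods, so a test subclass can replace behavior.
class OverridableUtils {
    String describe(String name) {
        return "work:" + name;
    }
}

final class UtilsDemo {
    public static void main(String[] args) {
        // A test can override the instance method; this is impossible with StaticUtils.
        OverridableUtils forTest = new OverridableUtils() {
            @Override
            String describe(String name) {
                return "test:" + name;
            }
        };
        System.out.println(StaticUtils.describe("a")); // prints "work:a"
        System.out.println(forTest.describe("a"));     // prints "test:a"
    }
}
```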


- Thejas Nair






Re: Review Request 34393: HIVE-10427 - collect_list() and collect_set() should accept struct types as argument

2015-06-04 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34393/#review86692
---

Ship it!


Ship It!

- Alexander Pivovarov


On June 1, 2015, 4:19 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34393/
 ---
 
 (Updated June 1, 2015, 4:19 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-10427
 https://issues.apache.org/jira/browse/HIVE-10427
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Currently for collect_list() and collect_set(), only primitive types are 
 supported. This patch adds support for struct, list and map types as well.
 
 It turned out that all I need to do is loosen the type checking.
 
 
 Diffs
 -
 
   data/files/customers.txt PRE-CREATION 
   data/files/nested_orders.txt PRE-CREATION 
   data/files/orders.txt PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectList.java 
 536c4a7 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectSet.java 
 6dc424a 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMkCollectionEvaluator.java
  efcc8f5 
   ql/src/test/queries/clientnegative/udaf_collect_set_unsupported.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/udaf_collect_set_2.q PRE-CREATION 
   ql/src/test/results/clientnegative/udaf_collect_set_unsupported.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/udaf_collect_set_2.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/34393/diff/
 
 
 Testing
 ---
 
 All but one test (which seems unrelated) are passing.
 I also added a test: udaf_collect_list_set_2.q
 
 
 Thanks,
 
 Chao Sun
 




[jira] [Created] (HIVE-10935) LLAP: merge master to branch

2015-06-04 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-10935:
---

 Summary: LLAP: merge master to branch
 Key: HIVE-10935
 URL: https://issues.apache.org/jira/browse/HIVE-10935
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin








[jira] [Created] (HIVE-10933) Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()

2015-06-04 Thread Son Nguyen (JIRA)
Son Nguyen created HIVE-10933:
-

 Summary: Hive 0.13 returns precision 0 for varchar(32) from 
DatabaseMetadata.getColumns()
 Key: HIVE-10933
 URL: https://issues.apache.org/jira/browse/HIVE-10933
 Project: Hive
  Issue Type: Bug
  Components: API
Affects Versions: 0.13.0
Reporter: Son Nguyen


DatabaseMetaData.getColumns() returns COLUMN_SIZE as 0 for a column defined 
as varchar(32) or char(32), while ResultSetMetaData.getPrecision() returns the 
correct value, 32.

Here is the program segment that reproduces the issue.

try {
    statement = connection.createStatement();

    statement.execute("drop table if exists son_table");

    statement.execute("create table son_table( col1 varchar(32) )");

    statement.close();

} catch ( Exception e) {
    return;
}

// get column info using metadata
try {
    DatabaseMetaData dmd = null;
    ResultSet resultSet = null;

    dmd = connection.getMetaData();

    resultSet = dmd.getColumns(null, null, "son_table", "col1");

    if ( resultSet.next() ) {
        String tabName = resultSet.getString("TABLE_NAME");
        String colName = resultSet.getString("COLUMN_NAME");
        String dataType = resultSet.getString("DATA_TYPE");
        String typeName = resultSet.getString("TYPE_NAME");
        int precision = resultSet.getInt("COLUMN_SIZE");

        // output is: colName = col1, dataType = 12, typeName = VARCHAR, precision = 0.
        System.out.format("colName = %s, dataType = %s, typeName = %s, precision = %d.",
            colName, dataType, typeName, precision);
    }

} catch ( Exception e) {
    return;
}







[jira] [Created] (HIVE-10927) Add number of HMS connection metrics

2015-06-04 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-10927:


 Summary: Add number of HMS connection metrics
 Key: HIVE-10927
 URL: https://issues.apache.org/jira/browse/HIVE-10927
 Project: Hive
  Issue Type: Sub-task
  Components: Diagnosability
Reporter: Szehon Ho








[jira] [Created] (HIVE-10926) Create Json-servlet reporter for metrics

2015-06-04 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-10926:


 Summary: Create Json-servlet reporter for metrics
 Key: HIVE-10926
 URL: https://issues.apache.org/jira/browse/HIVE-10926
 Project: Hive
  Issue Type: Sub-task
Reporter: Szehon Ho


Codahale comes with a json-servlet reporter, whose output would be easier for 
monitoring tools to consume than other formats.

We could leverage this by creating an HTTP server for HS2 and HMS and adding 
this servlet.





[jira] [Created] (HIVE-10928) Concurrent Beeline Connections can not work different databases

2015-06-04 Thread chirag aggarwal (JIRA)
chirag aggarwal created HIVE-10928:
--

 Summary: Concurrent Beeline Connections can not work different 
databases
 Key: HIVE-10928
 URL: https://issues.apache.org/jira/browse/HIVE-10928
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 0.14.0
Reporter: chirag aggarwal


Concurrent Beeline connections are not able to work on different databases. 
If one connection calls 'use abc', then all the connections start working on 
database 'abc'.





[jira] [Created] (HIVE-10931) Wrong columns selected on multiple joins

2015-06-04 Thread Furcy Pin (JIRA)
Furcy Pin created HIVE-10931:


 Summary: Wrong columns selected on multiple joins
 Key: HIVE-10931
 URL: https://issues.apache.org/jira/browse/HIVE-10931
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0
 Environment: Cloudera cdh5.4.2
Reporter: Furcy Pin


The following set of queries :

```
DROP TABLE IF EXISTS test1 ;
DROP TABLE IF EXISTS test2 ;
DROP TABLE IF EXISTS test3 ;

CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 
STRING, col6 STRING) ;
INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,"A") ;

CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 
STRING, col6 STRING) ;
INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,"X") ;

CREATE TABLE test3 (coL1 STRING) ;
INSERT INTO TABLE test3 VALUES ("A") ;

SELECT
  T2.val
FROM test1 T1
LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,"") as val FROM 
test2) T2
ON T2.col1 = T1.col1
LEFT JOIN test3 T3  
ON T3.col1 = T1.col6 
;
```

will return this :

```
+---------+--+
| t2.val  |
+---------+--+
| A       |
+---------+--+
```

Obviously, this result is wrong, as table `test2` contains an X and no A.

This is the most minimal example we found of this issue. In particular,
having fewer than 6 columns will work, for instance:

```
SELECT
  T2.val
FROM test1 T1
LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,"") as val FROM test2) 
T2
ON T2.col1 = T1.col1
LEFT JOIN test3 T3  
ON T3.col1 = T1.col6 
;
```

(same query as before, but `col5` was removed from the select)
will return :

```
+---------+--+
| t2.val  |
+---------+--+
| X       |
+---------+--+
```

Removing the `COALESCE` also removes the bug...








[jira] [Created] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values

2015-06-04 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created HIVE-10929:
-

 Summary: In Tez mode,dynamic partitioning query with union all 
fails at moveTask,Invalid partition key  values
 Key: HIVE-10929
 URL: https://issues.apache.org/jira/browse/HIVE-10929
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K


{code}
create table dummy(i int);
insert into table dummy values (1);
select * from dummy;

create table partunion1(id1 int) partitioned by (part1 string);

set hive.exec.dynamic.partition.mode=nonstrict;
set hive.execution.engine=tez;

explain insert into table partunion1 partition(part1)
select temps.* from (
select 1 as id1, '2014' as part1 from dummy 
union all 
select 2 as id1, '2014' as part1 from dummy ) temps;

insert into table partunion1 partition(part1)
select temps.* from (
select 1 as id1, '2014' as part1 from dummy 
union all 
select 2 as id1, '2014' as part1 from dummy ) temps;

select * from partunion1;
{code}

fails.





[jira] [Created] (HIVE-10930) LLAP: Set java.io.tmpdir correctly for LLAP Slider instance

2015-06-04 Thread Gopal V (JIRA)
Gopal V created HIVE-10930:
--

 Summary: LLAP: Set java.io.tmpdir correctly for LLAP Slider 
instance
 Key: HIVE-10930
 URL: https://issues.apache.org/jira/browse/HIVE-10930
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Gopal V
Assignee: Gopal V
 Fix For: llap


LLAP's Hybrid Grace Hash is IO bound writing to /tmp.

Use the yarn local dirs instead of /tmp so that createTempFile works 
correctly.
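
As a rough illustration of the idea (not the change proposed for the Slider package, which would presumably set java.io.tmpdir when the JVM is launched), temp files can also be created under an explicit directory. `TempDirSketch`, `createSpillFile`, and the "spill-" prefix are hypothetical names, and the directory used in main only stands in for a YARN local dir.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class TempDirSketch {
    // Creates the file under an explicit directory instead of relying on
    // the JVM-wide java.io.tmpdir default (typically /tmp on Linux).
    static Path createSpillFile(Path localDir) throws IOException {
        Files.createDirectories(localDir);
        return Files.createTempFile(localDir, "spill-", ".tmp");
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a YARN local dir.
        Path dir = Files.createTempDirectory("yarn-local-sketch");
        Path file = createSpillFile(dir);
        System.out.println(file.getParent().equals(dir)); // prints "true"
        Files.delete(file);
        Files.delete(dir);
    }
}
```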





Build failed in Jenkins: HIVE-TRUNK-JAVA8 #77

2015-06-04 Thread hiveqa
See 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-TRUNK-JAVA8/77/

--
Started by timer
Building in workspace 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-TRUNK-JAVA8/ws/
  git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
  git config remote.origin.url 
  https://git-wip-us.apache.org/repos/asf/hive.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/hive.git
  git --version # timeout=10
  git fetch --tags --progress https://git-wip-us.apache.org/repos/asf/hive.git 
  +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
ERROR: Error fetching remote repo 'origin'
Archiving artifacts
Recording test results


[jira] [Created] (HIVE-10934) Restore support for DROP PARTITION PURGE

2015-06-04 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-10934:
-

 Summary: Restore support for DROP PARTITION PURGE
 Key: HIVE-10934
 URL: https://issues.apache.org/jira/browse/HIVE-10934
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 1.2.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


HIVE-9086 added support for PURGE in 
{noformat}
ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = "sayonara") 
IGNORE PROTECTION PURGE;
{noformat}
It looks like this was accidentally lost in HIVE-10228.





[jira] [Created] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728

2015-06-04 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-10932:
---

 Summary: Unit test udf_nondeterministic failure due to HIVE-10728
 Key: HIVE-10932
 URL: https://issues.apache.org/jira/browse/HIVE-10932
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Aihua Xu
Assignee: Aihua Xu


The test udf_nondeterministic.q failed due to the change in HIVE-10728, in 
which unix_timestamp() is now marked as deterministic.


