[jira] [Commented] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2015-07-22 Thread Olaf Flebbe (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638331#comment-14638331
 ] 

Olaf Flebbe commented on HIVE-9941:
---

Just verified it happens on 1.2.0 too

> sql std authorization on partitioned table: truncate and insert
> ---
>
> Key: HIVE-9941
> URL: https://issues.apache.org/jira/browse/HIVE-9941
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Olaf Flebbe
>
> sql std authorization works as expected.
> However, if a table is partitioned, any user can truncate it.
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> #.. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
> +---+++-+-+-++---++--+--+
> | database  | table  | partition  | column  | principal_name  | 
> principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +---+++-+-+-++---++--+--+
> | default   | bla|| | foo | USER  
>   | DELETE | true  | 1426158997000  | foo  |
> | default   | bla|| | foo | USER  
>   | INSERT | true  | 1426158997000  | foo  |
> | default   | bla|| | foo | USER  
>   | SELECT | true  | 1426158997000  | foo  |
> | default   | bla|| | foo | USER  
>   | UPDATE | true  | 1426158997000  | foo  |
> +---+++-+-+-++---++--+--+
> {code}
> now user olaf
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: Principal [name=olaf, type=USER] does not have following 
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, 
> name=default.bla]] (state=42000,code=4)
> {code}
> works as expected.
> _BUT_
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And the table is empty afterwards_.
> Similarly, {{insert into table}} works, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2015-07-22 Thread Olaf Flebbe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olaf Flebbe updated HIVE-9941:
--
Affects Version/s: (was: 0.14.0)
   1.0.0
   1.2.0

> sql std authorization on partitioned table: truncate and insert
> ---
>
> Key: HIVE-9941
> URL: https://issues.apache.org/jira/browse/HIVE-9941
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Olaf Flebbe
>
> sql std authorization works as expected.
> However, if a table is partitioned, any user can truncate it.
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> #.. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
> +---+++-+-+-++---++--+--+
> | database  | table  | partition  | column  | principal_name  | 
> principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +---+++-+-+-++---++--+--+
> | default   | bla|| | foo | USER  
>   | DELETE | true  | 1426158997000  | foo  |
> | default   | bla|| | foo | USER  
>   | INSERT | true  | 1426158997000  | foo  |
> | default   | bla|| | foo | USER  
>   | SELECT | true  | 1426158997000  | foo  |
> | default   | bla|| | foo | USER  
>   | UPDATE | true  | 1426158997000  | foo  |
> +---+++-+-+-++---++--+--+
> {code}
> now user olaf
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: Principal [name=olaf, type=USER] does not have following 
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, 
> name=default.bla]] (state=42000,code=4)
> {code}
> works as expected.
> _BUT_
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And the table is empty afterwards_.
> Similarly, {{insert into table}} works, too.





[jira] [Commented] (HIVE-11254) Process result sets returned by a stored procedure

2015-07-22 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638317#comment-14638317
 ] 

Dmitry Tolpeko commented on HIVE-11254:
---

Yes, and it is documented at http://www.plhql.org/allocate-cursor. Sorry, I will 
start porting it to Apache Confluence soon.
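For readers who have not followed the link, the pattern is roughly the SQL-standard ALLOCATE CURSOR idiom sketched below. This is a hypothetical sketch only: the procedure and variable names are invented, and the exact HPL/SQL syntax is the one documented on the linked page.

```sql
-- Hypothetical sketch of a caller processing a result set returned
-- by a stored procedure, in the SQL-standard ALLOCATE CURSOR style.
CALL spResultSet();                              -- procedure that returns a result set
ALLOCATE cur CURSOR FOR PROCEDURE spResultSet;   -- bind a cursor to that result set
FETCH cur INTO v1;                               -- v1 is a hypothetical variable
CLOSE cur;
```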

> Process result sets returned by a stored procedure
> --
>
> Key: HIVE-11254
> URL: https://issues.apache.org/jira/browse/HIVE-11254
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Fix For: 2.0.0
>
> Attachments: HIVE-11254.1.patch, HIVE-11254.2.patch, 
> HIVE-11254.3.patch, HIVE-11254.4.patch
>
>
> A stored procedure can return one or more result sets. A caller should be able 
> to process them.
>  





[jira] [Commented] (HIVE-11347) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix CTAS

2015-07-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638316#comment-14638316
 ] 

Hive QA commented on HIVE-11347:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746661/HIVE-11347.01.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9256 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join0
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4699/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4699/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4699/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746661 - PreCommit-HIVE-TRUNK-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix CTAS
> --
>
> Key: HIVE-11347
> URL: https://issues.apache.org/jira/browse/HIVE-11347
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11347.01.patch
>
>
> need to add a project on the final project.





[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-07-22 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638310#comment-14638310
 ] 

Dmitry Tolpeko commented on HIVE-11055:
---

Not yet, sorry. This functionality needs support from Hive core.

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> ---
>
> Key: HIVE-11055
> URL: https://issues.apache.org/jira/browse/HIVE-11055
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Fix For: 2.0.0
>
> Attachments: HIVE-11055.1.patch, HIVE-11055.2.patch, 
> HIVE-11055.3.patch, HIVE-11055.4.patch, hplsql-site.xml
>
>
> There is a PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
> (actually for any SQL-on-Hadoop implementation and any JDBC source).
> Alan Gates offered to contribute it to Hive under the HPL/SQL name 
> (org.apache.hive.hplsql package). This JIRA is to create a patch to 
> contribute the PL/HQL code.





[jira] [Commented] (HIVE-11335) Multi-Join Inner Query producing incorrect results

2015-07-22 Thread fatkun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638229#comment-14638229
 ] 

fatkun commented on HIVE-11335:
---

Thanks, I tested the patch on 1.1.0; it's OK now.

> Multi-Join Inner Query producing incorrect results
> --
>
> Key: HIVE-11335
> URL: https://issues.apache.org/jira/browse/HIVE-11335
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Jesus Camacho Rodriguez
> Attachments: query1.txt, query2.txt
>
>
> test step
> {code}
> create table log (uid string, uid2 string);
> insert into log values ('1', '1');
> create table user (uid string, name string);
> insert into user values ('1', "test1");
> {code}
> (Query1)
> {code}
> select b.name, c.name from log a
>  left outer join (select uid, name from user) b on (a.uid=b.uid)
>  left outer join user c on (a.uid2=c.uid);
> {code}
> returns the wrong result:
> 1 test1
> It should return test1 for both.
> (Query2) While trying to find the error, I noticed that this query returns the 
> right result (the join key is different):
> {code}
> select b.name, c.name from log a
>  left outer join (select uid, name from user) b on (a.uid=b.uid)
>  left outer join user c on (a.uid=c.uid);
> {code}
> The explain output is different: Query1 selects only one column. It should 
> select both uid and name.
> {code}
> b:user 
>   TableScan
> alias: user
> Statistics: Num rows: 1 Data size: 7 Basic stats: COMPLETE Column 
> stats: NONE
> Select Operator
>   expressions: uid (type: string)
>   outputColumnNames: _col0
> {code}
> It may be related to HIVE-10996.
> =UPDATE1===
> (Query3) this query return correct result
> {code}
> select b.name, c.name from log a
>  left outer join (select user.uid, user.name from user) b on (a.uid=b.uid)
>  left outer join user c on (a.uid2=c.uid);
> {code}
> the operator tree
> TS[0]-SEL[1]-RS[5]-JOIN[6]-RS[7]-JOIN[9]-SEL[10]-FS[11]
> TS[2]-RS[4]-JOIN[6]
> TS[3]-RS[8]-JOIN[9]
> In Query1, the SEL[1] rowSchema is wrong: it cannot detect the tabAlias.





[jira] [Updated] (HIVE-11350) LLAP: Fix API usage to work with evolving Tez APIs - TEZ-2005

2015-07-22 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-11350:
--
Attachment: HIVE-11350.1.TEZ2005.txt

> LLAP: Fix API usage to work with evolving Tez APIs - TEZ-2005
> -
>
> Key: HIVE-11350
> URL: https://issues.apache.org/jira/browse/HIVE-11350
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: llap
>
> Attachments: HIVE-11350.1.TEZ2005.txt
>
>






[jira] [Commented] (HIVE-11344) HIVE-9845 makes HCatSplit.write modify the split so that PartInfo objects are unusable after it

2015-07-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638172#comment-14638172
 ] 

Hive QA commented on HIVE-11344:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746652/HIVE-11344.patch

{color:green}SUCCESS:{color} +1 9257 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4698/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4698/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4698/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746652 - PreCommit-HIVE-TRUNK-Build

> HIVE-9845 makes HCatSplit.write modify the split so that PartInfo objects are 
> unusable after it
> ---
>
> Key: HIVE-11344
> URL: https://issues.apache.org/jira/browse/HIVE-11344
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-11344.patch
>
>
> HIVE-9845 introduced a notion of compression for HCatSplits so that when 
> serializing, it finds commonalities between PartInfo and TableInfo objects, 
> and if the two are identical, it nulls out that field in PartInfo, thus 
> making sure that when PartInfo is then serialized, info is not repeated.
> This, however, has the side effect of making the PartInfo object unusable if 
> HCatSplit.write has been called.
> While this does not affect M/R directly (M/R does not know about the PartInfo 
> objects, and once serialized, the HCatSplit object is recreated by 
> deserialization on the backend, which restores the split and its PartInfo 
> objects), it does affect framework users of HCat that try to mimic M/R and 
> then use the PartInfo objects to instantiate distinct readers.
> Thus, we need to make it so that PartInfo is still usable after 
> HCatSplit.write is called.





[jira] [Comment Edited] (HIVE-11294) Use HBase to cache aggregated stats

2015-07-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638169#comment-14638169
 ] 

Lefty Leverenz edited comment on HIVE-11294 at 7/23/15 5:14 AM:


Doc note:  This creates four configuration parameters, so they will need to be 
documented in the wiki after hbase-metastore-branch gets merged to trunk.  In 
the meantime, I'm linking this issue to HIVE-9752 (Documentation for HBase 
metastore).

The new parameters are:

*  hive.metastore.hbase.aggr.stats.cache.entries
*  hive.metastore.hbase.aggr.stats.memory.ttl
*  hive.metastore.hbase.aggr.stats.invalidator.frequency
*  hive.metastore.hbase.aggr.stats.hbase.ttl
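Until the wiki documentation lands, a hypothetical hive-site.xml fragment setting these parameters might look as follows. The values shown are illustrative assumptions, not documented defaults, and the value units are likewise assumed:

```xml
<!-- Illustrative values only; consult the wiki once these are documented. -->
<property>
  <name>hive.metastore.hbase.aggr.stats.cache.entries</name>
  <value>10000</value>
</property>
<property>
  <name>hive.metastore.hbase.aggr.stats.memory.ttl</name>
  <value>60s</value>
</property>
<property>
  <name>hive.metastore.hbase.aggr.stats.invalidator.frequency</name>
  <value>5s</value>
</property>
<property>
  <name>hive.metastore.hbase.aggr.stats.hbase.ttl</name>
  <value>604800s</value>
</property>
```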


was (Author: le...@hortonworks.com):
Doc note:  This creates four configuration parameters, so they will need to be 
documented in the wiki after hbase-metastore-branch gets merged to trunk.  In 
the meantime, I'm linking this issue to HIVE-9752 (Documentation for HBase 
metastore).

> Use HBase to cache aggregated stats
> ---
>
> Key: HIVE-11294
> URL: https://issues.apache.org/jira/browse/HIVE-11294
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11294.2.patch, HIVE-11294.patch
>
>
> Currently stats are cached only in the memory of the client.  Given that 
> HBase can easily manage the scale of caching aggregated stats we should be 
> using it to do so.





[jira] [Commented] (HIVE-10516) Measure Hive CLI's performance difference before and after implementation is switched

2015-07-22 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638170#comment-14638170
 ] 

Ferdinand Xu commented on HIVE-10516:
-

Yes, exactly. And we can spend some time improving the performance.

> Measure Hive CLI's performance difference before and after implementation is 
> switched
> -
>
> Key: HIVE-10516
> URL: https://issues.apache.org/jira/browse/HIVE-10516
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: 0.10.0
>Reporter: Xuefu Zhang
>Assignee: Ferdinand Xu
> Attachments: HIVE-10516-beeline-cli.patch, HIVE-10516.patch
>
>






[jira] [Commented] (HIVE-11294) Use HBase to cache aggregated stats

2015-07-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638169#comment-14638169
 ] 

Lefty Leverenz commented on HIVE-11294:
---

Doc note:  This creates four configuration parameters, so they will need to be 
documented in the wiki after hbase-metastore-branch gets merged to trunk.  In 
the meantime, I'm linking this issue to HIVE-9752 (Documentation for HBase 
metastore).

> Use HBase to cache aggregated stats
> ---
>
> Key: HIVE-11294
> URL: https://issues.apache.org/jira/browse/HIVE-11294
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11294.2.patch, HIVE-11294.patch
>
>
> Currently stats are cached only in the memory of the client.  Given that 
> HBase can easily manage the scale of caching aggregated stats we should be 
> using it to do so.





[jira] [Commented] (HIVE-11271) java.lang.IndexOutOfBoundsException when union all with if function

2015-07-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638168#comment-14638168
 ] 

Ashutosh Chauhan commented on HIVE-11271:
-

A plan which is broken at compile time but patched up at runtime to work 
correctly is a *bad* idea, because the notion of brokenness is known only to 
this piece of code and to the runtime, and is opaque to everything in between. 
So any subsequent code that mutates the plan (e.g., logical optimizer rules or 
the physical compiler (MR/Tez/Spark compiler)) has to accommodate this special 
condition.
In general, the plan should at all times be fully self-describing and not rely 
on subsequent patching.

> java.lang.IndexOutOfBoundsException when union all with if function
> ---
>
> Key: HIVE-11271
> URL: https://issues.apache.org/jira/browse/HIVE-11271
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11271.1.patch
>
>
> Some queries with Union all as subquery fail in MapReduce task with 
> stacktrace:
> {noformat}
> 15/07/15 14:19:30 [pool-13-thread-1]: INFO exec.UnionOperator: Initializing 
> operator UNION[104]
> 15/07/15 14:19:30 [Thread-72]: INFO mapred.LocalJobRunner: Map task executor 
> complete.
> 15/07/15 14:19:30 [Thread-72]: WARN mapred.LocalJobRunner: 
> job_local826862759_0005
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>   ... 10 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>   ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>   ... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140)
>   ... 21 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apach

[jira] [Commented] (HIVE-11254) Process result sets returned by a stored procedure

2015-07-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638162#comment-14638162
 ] 

Lefty Leverenz commented on HIVE-11254:
---

Does this need documentation?

> Process result sets returned by a stored procedure
> --
>
> Key: HIVE-11254
> URL: https://issues.apache.org/jira/browse/HIVE-11254
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Fix For: 2.0.0
>
> Attachments: HIVE-11254.1.patch, HIVE-11254.2.patch, 
> HIVE-11254.3.patch, HIVE-11254.4.patch
>
>
> A stored procedure can return one or more result sets. A caller should be able 
> to process them.
>  





[jira] [Commented] (HIVE-10516) Measure Hive CLI's performance difference before and after implementation is switched

2015-07-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638114#comment-14638114
 ] 

Xuefu Zhang commented on HIVE-10516:


So, embedded beeline is 85% slower than hive CLI?

> Measure Hive CLI's performance difference before and after implementation is 
> switched
> -
>
> Key: HIVE-10516
> URL: https://issues.apache.org/jira/browse/HIVE-10516
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: 0.10.0
>Reporter: Xuefu Zhang
>Assignee: Ferdinand Xu
> Attachments: HIVE-10516-beeline-cli.patch, HIVE-10516.patch
>
>






[jira] [Commented] (HIVE-11321) Move OrcFile.OrcTableProperties from OrcFile into OrcConf.

2015-07-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638106#comment-14638106
 ] 

Hive QA commented on HIVE-11321:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746435/HIVE-11321.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9253 tests executed
*Failed tests:*
{noformat}
TestSchedulerQueue - did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4697/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4697/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4697/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746435 - PreCommit-HIVE-TRUNK-Build

> Move OrcFile.OrcTableProperties from OrcFile into OrcConf.
> --
>
> Key: HIVE-11321
> URL: https://issues.apache.org/jira/browse/HIVE-11321
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.0.0
>
> Attachments: HIVE-11321.patch
>
>
> We should pull all of the configuration/table property knobs into a single 
> list.





[jira] [Commented] (HIVE-10516) Measure Hive CLI's performance difference before and after implementation is switched

2015-07-22 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638098#comment-14638098
 ] 

Ferdinand Xu commented on HIVE-10516:
-

Hi [~xuefuz], is this what you have in mind? 

> Measure Hive CLI's performance difference before and after implementation is 
> switched
> -
>
> Key: HIVE-10516
> URL: https://issues.apache.org/jira/browse/HIVE-10516
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: 0.10.0
>Reporter: Xuefu Zhang
>Assignee: Ferdinand Xu
> Attachments: HIVE-10516-beeline-cli.patch, HIVE-10516.patch
>
>






[jira] [Updated] (HIVE-11336) Support initial file option for new CLI [beeline-cli branch]

2015-07-22 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-11336:

Attachment: HIVE-11336-beeline-cli.1.patch

Thanks [~xuefuz] for your review. It's reasonable to have a space in the path, 
especially on Windows (e.g., "Personal Data"). Updated the patch to address this.
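The quoting concern can be illustrated with a plain shell sketch. The path and file names here are hypothetical, not taken from the patch:

```shell
# Hypothetical init file in a directory whose name contains a space.
mkdir -p "/tmp/Personal Data"
echo "select 1;" > "/tmp/Personal Data/init.sql"

# The quotes are required: an unquoted path would be word-split by the
# shell into two arguments, "/tmp/Personal" and "Data/init.sql".
wc -l < "/tmp/Personal Data/init.sql"
```

The same reasoning applies to any CLI option that takes a file path, such as the initial-file option discussed here.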

> Support initial file option for new CLI [beeline-cli branch]
> 
>
> Key: HIVE-11336
> URL: https://issues.apache.org/jira/browse/HIVE-11336
> Project: Hive
>  Issue Type: Sub-task
>  Components: Beeline
>Affects Versions: beeline-cli-branch
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-11336-beeline-cli.1.patch, 
> HIVE-11336-beeline-cli.patch
>
>
> Option 'i' needs to be enabled in the new CLI, and it should support multiple 
> initial files.





[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-07-22 Thread wangchangchun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638072#comment-14638072
 ] 

wangchangchun commented on HIVE-11055:
--

Hello, I want to ask a question.
Does HPL/SQL support TRANSACTION?
If so, which isolation level: read uncommitted, read committed, repeatable 
read, or serializable?
Can HPL/SQL support SAVEPOINT?

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> ---
>
> Key: HIVE-11055
> URL: https://issues.apache.org/jira/browse/HIVE-11055
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Fix For: 2.0.0
>
> Attachments: HIVE-11055.1.patch, HIVE-11055.2.patch, 
> HIVE-11055.3.patch, HIVE-11055.4.patch, hplsql-site.xml
>
>
> There is a PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
> (actually for any SQL-on-Hadoop implementation and any JDBC source).
> Alan Gates offered to contribute it to Hive under the HPL/SQL name 
> (org.apache.hive.hplsql package). This JIRA is to create a patch to 
> contribute the PL/HQL code.





[jira] [Commented] (HIVE-11333) CBO: Calcite Operator To Hive Operator (Calcite Return Path): ColumnPruner prunes columns of UnionOperator that should be kept

2015-07-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638059#comment-14638059
 ] 

Hive QA commented on HIVE-11333:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746636/HIVE-11333.02.patch

{color:green}SUCCESS:{color} +1 9257 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4696/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4696/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4696/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746636 - PreCommit-HIVE-TRUNK-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): ColumnPruner 
> prunes columns of UnionOperator that should be kept
> --
>
> Key: HIVE-11333
> URL: https://issues.apache.org/jira/browse/HIVE-11333
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11333.01.patch, HIVE-11333.02.patch
>
>
> UnionOperator takes its schema from the operator in the first branch. Because 
> ColumnPruner prunes columns based on the internal name, a column in the other 
> branches may be pruned because its internal name differs from the one in the 
> first branch. To repro, run rcfile_union.q with the return path turned on.





[jira] [Commented] (HIVE-8339) Job status not found after 100% succeded map&reduce

2015-07-22 Thread xy7 (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638055#comment-14638055
 ] 

xy7 commented on HIVE-8339:
---

My cluster runs Hadoop 2.6.0 and Hive 1.1.0 and has the same bug. I tried 
recompiling the code from 
https://github.com/radimk/hive/commit/bf4d047274fb3fddd9bcfe8432154cda222e6582 
and replacing Hive's jar (hive-exec-1.1.0.jar) with the output, but this did 
not work. I don't know why:

2015-07-23 02:08:04,992 Stage-1 map = 100%,  reduce = 98%, Cumulative CPU 
38797.19 sec
2015-07-23 02:08:11,090 Stage-1 map = 100%,  reduce = 99%, Cumulative CPU 
38804.08 sec
2015-07-23 02:08:21,319 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
38815.55 sec
java.io.IOException: Could not find status of job:job_1437009840203_3838
at 
org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:295)
at 
org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:557)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:434)
at 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75)

> Job status not found after 100% succeded map&reduce
> ---
>
> Key: HIVE-8339
> URL: https://issues.apache.org/jira/browse/HIVE-8339
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
> Environment: Hadoop 2.4.0, Hive 0.13.1.
> Amazon EMR cluster of 9 i2.4xlarge nodes.
> 800+GB of data in HDFS.
>Reporter: Valera Chevtaev
>
> According to the logs, the job reached 100% for both map and reduce, but Hive 
> was then unable to get the status of the job from the job history server.
> Hive logs:
> 2014-10-03 07:57:26,593 INFO  [main]: exec.Task 
> (SessionState.java:printInfo(536)) - 2014-10-03 07:57:26,593 Stage-1 map = 
> 100%, reduce = 99%, Cumulative CPU 872541.02 sec
> 2014-10-03 07:57:47,447 INFO  [main]: exec.Task 
> (SessionState.java:printInfo(536)) - 2014-10-03 07:57:47,446 Stage-1 map = 
> 100%, reduce = 100%, Cumulative CPU 872566.55 sec
> 2014-10-03 07:57:48,710 INFO  [main]: mapred.ClientServiceDelegate 
> (ClientServiceDelegate.java:getProxy(273)) - Application state is completed. 
> FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
> 2014-10-03 07:57:48,716 ERROR [main]: exec.Task 
> (SessionState.java:printError(545)) - Ended Job = job_1412263771568_0002 with 
> exception 'java.io.IOException(Could not find status of 
> job:job_1412263771568_0002)'
> java.io.IOException: Could not find status of job:job_1412263771568_0002
>at 
> org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:294)
>at 
> org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
>at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
>at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
>at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
>at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
>at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
>at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
>at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
>at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:275)
>at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:227)
>at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:430)
>at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:366)
>at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:463)
>at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:479)
>at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:759)
>at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:697)
>at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:636)
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>at java.lang.reflect.Method.invoke(Method.java:606)
>at org.apache.hadoop.util.RunJar.main(RunJa

[jira] [Updated] (HIVE-11316) Use datastructure that doesnt duplicate any part of string for ASTNode::toStringTree()

2015-07-22 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11316:
-
Attachment: HIVE-11316.4.patch

> Use datastructure that doesnt duplicate any part of string for 
> ASTNode::toStringTree()
> --
>
> Key: HIVE-11316
> URL: https://issues.apache.org/jira/browse/HIVE-11316
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11316-branch-1.0.patch, 
> HIVE-11316-branch-1.2.patch, HIVE-11316.1.patch, HIVE-11316.2.patch, 
> HIVE-11316.3.patch, HIVE-11316.4.patch
>
>
> HIVE-11281 uses an approach to memoize toStringTree() for ASTNode. This jira 
> is supposed to alter the string memoization to use a different data structure 
> that doesn't duplicate any part of the string so that we do not run into OOM.
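The intended approach can be sketched with a toy tree (an illustration of the idea only, not Hive's ASTNode code): the whole tree is written into one shared StringBuilder, so no substring of the result is ever stored twice.

```java
import java.util.ArrayList;
import java.util.List;

// Toy tree: the string form is built once into a shared StringBuilder instead
// of memoizing a separate (partly duplicated) String per node.
public class TreeNode {
    final String text;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String text) { this.text = text; }

    String toStringTree() {
        StringBuilder sb = new StringBuilder();
        write(sb);
        return sb.toString();
    }

    private void write(StringBuilder sb) {
        if (children.isEmpty()) { sb.append(text); return; }
        sb.append('(').append(text);
        for (TreeNode c : children) { sb.append(' '); c.write(sb); }
        sb.append(')');
    }

    public static void main(String[] args) {
        TreeNode plus = new TreeNode("+");
        plus.children.add(new TreeNode("1"));
        plus.children.add(new TreeNode("2"));
        System.out.println(plus.toStringTree()); // prints (+ 1 2)
    }
}
```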





[jira] [Updated] (HIVE-9900) LLAP: Integrate MiniLLAPCluster into tests

2015-07-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-9900:
---
Assignee: Vikram Dixit K

> LLAP: Integrate MiniLLAPCluster into tests
> --
>
> Key: HIVE-9900
> URL: https://issues.apache.org/jira/browse/HIVE-9900
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Vikram Dixit K
> Fix For: llap
>
>






[jira] [Commented] (HIVE-10117) LLAP: Use task number, attempt number to cache plans

2015-07-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638014#comment-14638014
 ] 

Sergey Shelukhin commented on HIVE-10117:
-

Is this different from ObjectCache?

> LLAP: Use task number, attempt number to cache plans
> 
>
> Key: HIVE-10117
> URL: https://issues.apache.org/jira/browse/HIVE-10117
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
> Fix For: llap
>
>
> Instead of relying on thread locals only. This can be used to share the work 
> between Inputs / Processor / Outputs in Tez.





[jira] [Commented] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2015-07-22 Thread Michael McLellan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638012#comment-14638012
 ] 

Michael McLellan commented on HIVE-8678:


I no longer have access to the system where this was an issue. This was about 9 
months ago, and we ended up working around it by just using Strings.

I don't remember anything more - sorry I didn't write down steps to reproduce 
when I created this.

> Pig fails to correctly load DATE fields using HCatalog
> --
>
> Key: HIVE-8678
> URL: https://issues.apache.org/jira/browse/HIVE-8678
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Michael McLellan
>Assignee: Sushanth Sowmyan
>
> Using:
> Hadoop 2.5.0-cdh5.2.0 
> Pig 0.12.0-cdh5.2.0
> Hive 0.13.1-cdh5.2.0
> When using pig -useHCatalog to load a Hive table that has a DATE field, the 
> following error occurs when trying to DUMP the field:
> {code}
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
> java.sql.Date
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
> read value to tuple
> {code}
> It seems to be occurring here: 
> https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433
> and that it should be:
> {code}Date d = Date.valueOf(o);{code} 
> instead of 
> {code}Date d = (Date) o;{code}
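The reported cast failure and the suggested valueOf fix can be sketched in isolation (readColumn() below is a hypothetical stand-in for the value the loader hands back, not an actual HCatalog method):

```java
import java.sql.Date;

// Minimal sketch of the reported failure mode and the suggested fix.
// readColumn() is a hypothetical stand-in for the value the loader returns.
public class DateFieldSketch {
    static Object readColumn() { return "2015-02-28"; } // a String, not a java.sql.Date

    public static void main(String[] args) {
        Object o = readColumn();
        try {
            Date bad = (Date) o; // the buggy pattern: blind cast of a String
            System.out.println(bad);
        } catch (ClassCastException e) {
            System.out.println("blind cast fails: String is not a java.sql.Date");
        }
        // The suggested fix: parse the string form instead of casting.
        Date good = Date.valueOf(o.toString());
        System.out.println(good); // prints 2015-02-28
    }
}
```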





[jira] [Updated] (HIVE-11349) Update HBase metastore hbase version to 1.1.1

2015-07-22 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-11349:
--
Attachment: HIVE-11349.patch

Updated version of HBase to 1.1.1.  This breaks Tephra, but we aren't testing 
with it at the moment anyway.

> Update HBase metastore hbase version to 1.1.1
> -
>
> Key: HIVE-11349
> URL: https://issues.apache.org/jira/browse/HIVE-11349
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-11349.patch
>
>






[jira] [Commented] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x

2015-07-22 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637995#comment-14637995
 ] 

Thejas M Nair commented on HIVE-11304:
--

[~prasanth_j], [~hsubramaniyan] added changes in HIVE-10119 to allow changing 
the logging level between queries. TestOperationLoggingAPI has tests for it.
We would need to detect the current logging level for the current operation in 
doAppend() (or its equivalent in Log4j2).




> Migrate to Log4j2 from Log4j 1.x
> 
>
> Key: HIVE-11304
> URL: https://issues.apache.org/jira/browse/HIVE-11304
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11304.2.patch, HIVE-11304.patch
>
>
> Log4j2 has some great features that can benefit Hive significantly. Notable 
> features include:
> 1) Performance (parameterized logging, low overhead when logging is disabled, 
> etc.). More details can be found here: 
> https://logging.apache.org/log4j/2.x/performance.html
> 2) RoutingAppender - Route logs to different log files based on MDC context 
> (useful for HS2, LLAP etc.)
> 3) Asynchronous logging
> This is an umbrella jira to track changes related to Log4j2 migration.
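The performance point in (1) rests on deferring message construction until the level check passes. A minimal, library-free sketch of that principle (this is an illustration of the idea, not Log4j2's actual code; Log4j2 exposes it via `{}` placeholders and `Supplier`-based overloads):

```java
import java.util.function.Supplier;

// Demonstrates lazy logging: the expensive message is only built when the
// level is enabled, so disabled log statements cost almost nothing.
public class LazyLogDemo {
    static boolean debugEnabled = false;
    static int built = 0; // counts how many messages were actually constructed

    static void debug(Supplier<String> msg) {
        if (debugEnabled) System.out.println(msg.get());
    }

    public static void main(String[] args) {
        // With debug disabled, the supplier body never runs.
        debug(() -> { built++; return "expensive " + "message"; });
        System.out.println(built); // prints 0: message never constructed
    }
}
```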





[jira] [Commented] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords

2015-07-22 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637991#comment-14637991
 ] 

Eugene Koifman commented on HIVE-11348:
---

+1 pending tests

> Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved 
> keywords
> -
>
> Key: HIVE-11348
> URL: https://issues.apache.org/jira/browse/HIVE-11348
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11348.01.patch, HIVE-11348.02.patch
>
>






[jira] [Commented] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords

2015-07-22 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637974#comment-14637974
 ] 

Pengcheng Xiong commented on HIVE-11348:


[~ekoifman], Sure. Done.

> Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved 
> keywords
> -
>
> Key: HIVE-11348
> URL: https://issues.apache.org/jira/browse/HIVE-11348
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11348.01.patch, HIVE-11348.02.patch
>
>






[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1

2015-07-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637975#comment-14637975
 ] 

Hive QA commented on HIVE-11259:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746632/HIVE-11259.01.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4695/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4695/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4695/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4695/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at e57c360 HIVE-11077 Add support in parser and wire up to txn 
manager (Eugene Koifman, reviewed by Alan Gates)
+ git clean -f -d
Removing ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java.orig
Removing 
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java.orig
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at e57c360 HIVE-11077 Add support in parser and wire up to txn 
manager (Eugene Koifman, reviewed by Alan Gates)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}


ATTACHMENT ID: 12746632 - PreCommit-HIVE-TRUNK-Build

> LLAP: clean up ORC dependencies part 1
> --
>
> Key: HIVE-11259
> URL: https://issues.apache.org/jira/browse/HIVE-11259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11259.01.patch, HIVE-11259.patch
>
>
> Before there's a storage handler module, we can clean some things up.





[jira] [Updated] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords

2015-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11348:
---
Attachment: HIVE-11348.02.patch

> Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved 
> keywords
> -
>
> Key: HIVE-11348
> URL: https://issues.apache.org/jira/browse/HIVE-11348
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11348.01.patch, HIVE-11348.02.patch
>
>






[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-07-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637970#comment-14637970
 ] 

Hive QA commented on HIVE-11306:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746635/HIVE-11306.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9257 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_leftsemi_mapjoin
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4694/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4694/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4694/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}


ATTACHMENT ID: 12746635 - PreCommit-HIVE-TRUNK-Build

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch
>
>
> HIVE-9277 implemented spillable joins for Tez, which suffer from a 
> corner-case performance issue when joining a wide small table against a 
> narrow big table (like a user-info table joined against an event stream).
> Spilling the wide table causes extra IO, even though the nDV of the join key 
> might only be in the thousands.
> A cheap bloom-1 filter would yield a large performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.
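A bloom-1 filter in this sense is a single-hash Bloom filter: one bit per hashed key, so a clear bit proves the key is absent and the probe row can skip the spilled partition entirely. A minimal sketch (illustrative, not Hive's implementation; names are made up):

```java
import java.util.BitSet;

// Minimal one-hash Bloom ("bloom-1") filter: a clear bit means the key is
// definitely absent, so the big-table row never touches the spilled partition.
// A set bit means "maybe present" (false positives are possible).
public class Bloom1Filter {
    private final BitSet bits;
    private final int size;

    public Bloom1Filter(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    private int index(long keyHash) {
        // Fold the 64-bit hash to a non-negative bit index.
        return (int) (((keyHash ^ (keyHash >>> 32)) & 0x7fffffffL) % size);
    }

    public void add(long keyHash) { bits.set(index(keyHash)); }

    public boolean mightContain(long keyHash) { return bits.get(index(keyHash)); }

    public static void main(String[] args) {
        Bloom1Filter f = new Bloom1Filter(1 << 20);
        f.add(42L);
        System.out.println(f.mightContain(42L)); // prints true
        System.out.println(f.mightContain(43L)); // prints false (no collision here)
    }
}
```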





[jira] [Updated] (HIVE-11341) Avoid expensive resizing of ASTNode tree

2015-07-22 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11341:
-
Attachment: HIVE-11341.1.patch

> Avoid expensive resizing of ASTNode tree 
> -
>
> Key: HIVE-11341
> URL: https://issues.apache.org/jira/browse/HIVE-11341
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Physical Optimizer
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11341.1.patch
>
>
> {code}
> Stack Trace                                                          Sample Count  Percentage(%)
> parse.BaseSemanticAnalyzer.analyze(ASTNode, Context)                        1,605  90
>  parse.CalcitePlanner.analyzeInternal(ASTNode)                              1,605  90
>   parse.SemanticAnalyzer.analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext)  1,605  90
>    parse.CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext)  1,604  90
>     parse.SemanticAnalyzer.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext)  1,604  90
>      parse.SemanticAnalyzer.genPlan(QB)                                     1,604  90
>       parse.SemanticAnalyzer.genPlan(QB, boolean)                           1,604  90
>        parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map)                1,604  90
>         parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, Operator, Map, boolean)  1,603  90
>          parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, Operator, boolean)  1,603  90
>           parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, boolean)  1,603  90
>            parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)  1,603  90
>             parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)  1,603  90
>              parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx)  1,603  90
>               parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, TypeCheckProcFactory)  1,603  90
>                lib.DefaultGraphWalker.startWalking(Collection, HashMap)     1,579  89
>                 lib.DefaultGraphWalker.walk(Node)                           1,571  89
>                  java.util.ArrayList.removeAll(Collection)                  1,433  81
>                   java.util.ArrayList.batchRemove(Collection, boolean)      1,433  81
>                    java.util.ArrayList.contains(Object)                     1,228  69
>                     java.util.ArrayList.indexOf(Object)                     1,228  69
> {code}
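The hot frames show ArrayList.removeAll degrading to O(n*m): batchRemove calls contains(), which scans the list, once per element. A sketch of the usual remedy, a HashSet-backed lookup (illustrative of the hotspot only, not the actual HIVE-11341 patch):

```java
import java.util.*;

// Demonstrates the removeAll hotspot and the constant-time-lookup fix.
public class RemoveAllCost {
    public static void main(String[] args) {
        List<Integer> pending = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) pending.add(i);
        List<Integer> done = new ArrayList<>(pending.subList(0, 50_000));

        // Slow form: pending.removeAll(done) scans 'done' linearly for every
        // element of 'pending' -- the O(n*m) pattern in the profile above.
        // Faster form: back the membership test with a HashSet, O(1) per lookup.
        Set<Integer> doneSet = new HashSet<>(done);
        pending.removeIf(doneSet::contains);

        System.out.println(pending.size()); // prints 50000
    }
}
```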





[jira] [Commented] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords

2015-07-22 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637936#comment-14637936
 ] 

Eugene Koifman commented on HIVE-11348:
---

[~pxiong], could you also add some comments in IdentifiersParser.g to explain 
why we have 2 lists and what rules should be followed when adding/not adding 
new ones?

> Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved 
> keywords
> -
>
> Key: HIVE-11348
> URL: https://issues.apache.org/jira/browse/HIVE-11348
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11348.01.patch
>
>






[jira] [Commented] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords

2015-07-22 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637932#comment-14637932
 ] 

Pengcheng Xiong commented on HIVE-11348:


[~sershe], that is why it is a subtask. Sorry to disappoint you. :)

> Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved 
> keywords
> -
>
> Key: HIVE-11348
> URL: https://issues.apache.org/jira/browse/HIVE-11348
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11348.01.patch
>
>






[jira] [Commented] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords

2015-07-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637921#comment-14637921
 ] 

Sergey Shelukhin commented on HIVE-11348:
-

This JIRA title sounds way too exciting for what the patch does :)

> Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved 
> keywords
> -
>
> Key: HIVE-11348
> URL: https://issues.apache.org/jira/browse/HIVE-11348
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11348.01.patch
>
>






[jira] [Commented] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2015-07-22 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637888#comment-14637888
 ] 

Sushanth Sowmyan commented on HIVE-8678:


Also, unit tests exercising date interop between Hive and Pig through HCatalog 
have existed since DATE support was introduced, and they still pass for me when 
I run them on Hive 0.13.1.

Could you please show me what hive commands and pig commands you're running to 
recreate this issue?

> Pig fails to correctly load DATE fields using HCatalog
> --
>
> Key: HIVE-8678
> URL: https://issues.apache.org/jira/browse/HIVE-8678
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Michael McLellan
>Assignee: Sushanth Sowmyan
>
> Using:
> Hadoop 2.5.0-cdh5.2.0 
> Pig 0.12.0-cdh5.2.0
> Hive 0.13.1-cdh5.2.0
> When using pig -useHCatalog to load a Hive table that has a DATE field, the 
> following error occurs when trying to DUMP the field:
> {code}
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
> java.sql.Date
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
> read value to tuple
> {code}
> It seems to be occurring here: 
> https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433
> and that it should be:
> {code}Date d = Date.valueOf(o);{code} 
> instead of 
> {code}Date d = (Date) o;{code}





[jira] [Commented] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2015-07-22 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637886#comment-14637886
 ] 

Sushanth Sowmyan commented on HIVE-8678:


I'm currently unable to reproduce this issue on hive-1.2 and pig-0.14.0, where 
I get the following:
In hive:
{noformat}

hive> create table tdate(a string, b date) stored as orc;
OK
Time taken: 0.151 seconds
hive> create table tsource(a string, b string) stored as orc;
OK
Time taken: 0.057 seconds
hive> insert into table tsource values ("abc", "2015-02-28");
...
OK
Time taken: 19.875 seconds
hive> select * from tsource;
OK
abc 2015-02-28
Time taken: 0.143 seconds, Fetched: 1 row(s)
hive> select a, cast(b as date) from tsource;
OK
abc 2015-02-28
Time taken: 0.092 seconds, Fetched: 1 row(s)
hive> insert into table tdate select a, cast(b as date) from tsource;
...
OK
Time taken: 20.672 seconds
hive> select * from tdate;
OK
abc 2015-02-28
Time taken: 0.051 seconds, Fetched: 1 row(s)
hive> describe tdate;
OK
a   string  
b   date
Time taken: 0.293 seconds, Fetched: 2 row(s)
{noformat}

In pig:
{noformat}
grunt> A = load 'tdate' using org.apache.hive.hcatalog.pig.HCatLoader(); 
grunt> describe A;   
2015-07-22 15:42:26,367 [main] INFO  
org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is 
deprecated. Instead, use fs.defaultFS
A: {a: chararray,b: datetime}
grunt> dump A;
...
(abc,2015-02-28T00:00:00.000-08:00)
grunt>
{noformat}



> Pig fails to correctly load DATE fields using HCatalog
> --
>
> Key: HIVE-8678
> URL: https://issues.apache.org/jira/browse/HIVE-8678
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Michael McLellan
>Assignee: Sushanth Sowmyan
>
> Using:
> Hadoop 2.5.0-cdh5.2.0 
> Pig 0.12.0-cdh5.2.0
> Hive 0.13.1-cdh5.2.0
> When using pig -useHCatalog to load a Hive table that has a DATE field, the 
> following error occurs when trying to DUMP the field:
> {code}
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
> java.sql.Date
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
> read value to tuple
> {code}
> It seems to be occurring here: 
> https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433
> and that it should be:
> {code}Date d = Date.valueOf(o);{code} 
> instead of 
> {code}Date d = (Date) o;{code}
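A minimal sketch of the conversion the report suggests (names here are illustrative; the actual fix would live in PigHCatUtil.extractPigObject): handle the case where the record holds the value as a String instead of blindly casting it to java.sql.Date.

```java
import java.sql.Date;

public class DateFieldSketch {
    // Hypothetical helper mirroring the suggested fix: a blind (Date) cast
    // throws ClassCastException when the underlying object is a String,
    // while Date.valueOf parses the "yyyy-MM-dd" form seen in the repro.
    static Date toSqlDate(Object o) {
        if (o instanceof Date) {
            return (Date) o;                  // already a java.sql.Date
        }
        return Date.valueOf(o.toString());    // parse the String form
    }

    public static void main(String[] args) {
        System.out.println(toSqlDate("2015-02-28"));  // prints 2015-02-28
    }
}
```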



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords

2015-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11348:
---
Attachment: HIVE-11348.01.patch

[~ekoifman], could you please review the patch? Thanks. :)

> Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved 
> keywords
> -
>
> Key: HIVE-11348
> URL: https://issues.apache.org/jira/browse/HIVE-11348
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11348.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10950) Unit test against HBase Metastore

2015-07-22 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637882#comment-14637882
 ] 

Vaibhav Gumashta commented on HIVE-10950:
-

Assigning it to myself since [~daijy] is OOO. Will continue from where he left off.

> Unit test against HBase Metastore
> -
>
> Key: HIVE-10950
> URL: https://issues.apache.org/jira/browse/HIVE-10950
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Daniel Dai
>Assignee: Vaibhav Gumashta
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-10950-1.patch, HIVE-10950-2.patch
>
>
> We need to run the entire Hive UT against HBase Metastore and make sure they 
> pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11331) Doc Notes

2015-07-22 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11331:
--
Description: 
This ticket is to track various doc-related issues for HIVE-9675 since the 
work is spread out over time.

1. calling set autocommit = true while a transaction is open will commit the 
transaction
2. document multi-statement transactions support
3. only Queries are allowed inside an open transaction (and commit/rollback)

  was:
This ticket is to track various doc-related issues for HIVE-9675 since the 
work is spread out over time.

1. calling set autocommit = true while a transaction is open will commit the 
transaction
2. document multi-statement transactions support


> Doc Notes
> -
>
> Key: HIVE-11331
> URL: https://issues.apache.org/jira/browse/HIVE-11331
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>
> This ticket is to track various doc-related issues for HIVE-9675 since the 
> work is spread out over time.
> 1. calling set autocommit = true while a transaction is open will commit the 
> transaction
> 2. document multi-statement transactions support
> 3. only Queries are allowed inside an open transaction (and commit/rollback)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10950) Unit test against HBase Metastore

2015-07-22 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-10950:
---

Assignee: Vaibhav Gumashta  (was: Daniel Dai)

> Unit test against HBase Metastore
> -
>
> Key: HIVE-10950
> URL: https://issues.apache.org/jira/browse/HIVE-10950
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Daniel Dai
>Assignee: Vaibhav Gumashta
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-10950-1.patch, HIVE-10950-2.patch
>
>
> We need to run the entire Hive UT against HBase Metastore and make sure they 
> pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11316) Use datastructure that doesnt duplicate any part of string for ASTNode::toStringTree()

2015-07-22 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637877#comment-14637877
 ] 

Jesus Camacho Rodriguez commented on HIVE-11316:


[~ekoifman], [~hsubramaniyan], that sounds good to me.

> Use datastructure that doesnt duplicate any part of string for 
> ASTNode::toStringTree()
> --
>
> Key: HIVE-11316
> URL: https://issues.apache.org/jira/browse/HIVE-11316
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11316-branch-1.0.patch, 
> HIVE-11316-branch-1.2.patch, HIVE-11316.1.patch, HIVE-11316.2.patch, 
> HIVE-11316.3.patch
>
>
> HIVE-11281 uses an approach to memoize toStringTree() for ASTNode. This jira 
> is supposed to alter the string memoization to use a different data structure 
> that doesn't duplicate any part of the string so that we do not run into OOM.
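As an illustration of the kind of data structure this describes (purely hypothetical, not the actual HIVE-11316 patch): serialize the tree once into a single shared buffer and memoize only (start, end) offsets per node, so no substring text is ever copied.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch: build the tree's string representation in one shared
// StringBuilder and record per-node offsets instead of per-node substrings,
// avoiding the duplicated text that can cause OOM.
public class SharedBufferTree {
    final String label;
    final List<SharedBufferTree> children = new ArrayList<>();
    int start = -1, end = -1;   // memoized offsets into the shared buffer

    SharedBufferTree(String label) { this.label = label; }

    void build(StringBuilder sb) {
        start = sb.length();
        if (children.isEmpty()) {
            sb.append(label);
        } else {
            sb.append('(').append(label);
            for (SharedBufferTree c : children) {
                sb.append(' ');
                c.build(sb);
            }
            sb.append(')');
        }
        end = sb.length();
    }

    public static void main(String[] args) {
        SharedBufferTree root = new SharedBufferTree("select");
        root.children.add(new SharedBufferTree("a"));
        root.children.add(new SharedBufferTree("b"));
        StringBuilder sb = new StringBuilder();
        root.build(sb);
        // Each node's string is a view into the one buffer, not a copy.
        System.out.println(sb.substring(root.start, root.end));  // prints (select a b)
    }
}
```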



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x

2015-07-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637875#comment-14637875
 ] 

Hive QA commented on HIVE-11304:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746633/HIVE-11304.2.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9256 tests executed
*Failed tests:*
{noformat}
TestPigHBaseStorageHandler - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testNegativeCliDriver_case_with_row_sequence
org.apache.hadoop.hive.ql.log.TestLog4j2Appenders.testHiveEventCounterAppender
org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithMr.testFetchResultsOfLogWithVerboseMode
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4693/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4693/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4693/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746633 - PreCommit-HIVE-TRUNK-Build

> Migrate to Log4j2 from Log4j 1.x
> 
>
> Key: HIVE-11304
> URL: https://issues.apache.org/jira/browse/HIVE-11304
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11304.2.patch, HIVE-11304.patch
>
>
> Log4j2 offers several features that can benefit Hive significantly. Some 
> notable ones include:
> 1) Performance (parametrized logging, performance when logging is disabled 
> etc.) More details can be found here 
> https://logging.apache.org/log4j/2.x/performance.html
> 2) RoutingAppender - Route logs to different log files based on MDC context 
> (useful for HS2, LLAP etc.)
> 3) Asynchronous logging
> This is an umbrella jira to track changes related to Log4j2 migration.
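The performance point about parametrized logging can be illustrated without Log4j2 itself; this toy (plain JDK, hypothetical names) only computes the expensive message when the level is actually enabled:

```java
import java.util.function.Supplier;

// Toy illustration of lazy logging: the expensive argument is wrapped in a
// Supplier and never evaluated while the level is disabled, which is the
// core of the "performance when logging is disabled" benefit.
public class LazyLogSketch {
    static boolean debugEnabled = false;
    static int evaluations = 0;

    static void debug(Supplier<String> msg) {
        if (debugEnabled) {
            System.out.println(msg.get());   // only pay the cost here
        }
    }

    static String expensiveDump() {
        evaluations++;                        // count how often we render
        return "state=" + System.nanoTime();  // stands in for a costly render
    }

    public static void main(String[] args) {
        debug(() -> expensiveDump());     // level off: expensiveDump not called
        System.out.println(evaluations);  // prints 0
    }
}
```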



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11347) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix CTAS

2015-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11347:
---
Attachment: HIVE-11347.01.patch

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix CTAS
> --
>
> Key: HIVE-11347
> URL: https://issues.apache.org/jira/browse/HIVE-11347
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11347.01.patch
>
>
> need to add a project on the final project.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager

2015-07-22 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637872#comment-14637872
 ] 

Pengcheng Xiong commented on HIVE-11077:


[~ekoifman], i will submit a tiny patch soon. Thanks.

> Add support in parser and wire up to txn manager
> 
>
> Key: HIVE-11077
> URL: https://issues.apache.org/jira/browse/HIVE-11077
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.3.0
>
> Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, 
> HIVE-11077.6.patch, HIVE-11077.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager

2015-07-22 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637793#comment-14637793
 ] 

Pengcheng Xiong commented on HIVE-11077:


SQL:2011

> Add support in parser and wire up to txn manager
> 
>
> Key: HIVE-11077
> URL: https://issues.apache.org/jira/browse/HIVE-11077
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.3.0
>
> Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, 
> HIVE-11077.6.patch, HIVE-11077.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11346) Fix Unit test failures when HBase Metastore is used

2015-07-22 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-11346:

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-9452

> Fix Unit test failures when HBase Metastore is used
> ---
>
> Key: HIVE-11346
> URL: https://issues.apache.org/jira/browse/HIVE-11346
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: hbase-metastore-branch
>Reporter: Vaibhav Gumashta
>
> Umbrella jira to track  HBase metastore UT failures



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager

2015-07-22 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637785#comment-14637785
 ] 

Eugene Koifman commented on HIVE-11077:
---

in particular, which column of the table you referred to should be followed 
(i.e. which version of the standard)

> Add support in parser and wire up to txn manager
> 
>
> Key: HIVE-11077
> URL: https://issues.apache.org/jira/browse/HIVE-11077
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.3.0
>
> Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, 
> HIVE-11077.6.patch, HIVE-11077.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager

2015-07-22 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637782#comment-14637782
 ] 

Eugene Koifman commented on HIVE-11077:
---

[~pxiong], good catch.  If you don't mind doing this, please go ahead.  
It may also be useful to add a more detailed comment in IdentifiersParser.g 
about how to add KW_ tokens and to which list.

> Add support in parser and wire up to txn manager
> 
>
> Key: HIVE-11077
> URL: https://issues.apache.org/jira/browse/HIVE-11077
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.3.0
>
> Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, 
> HIVE-11077.6.patch, HIVE-11077.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11344) HIVE-9845 makes HCatSplit.write modify the split so that PartInfo objects are unusable after it

2015-07-22 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11344:

Summary: HIVE-9845 makes HCatSplit.write modify the split so that PartInfo 
objects are unusable after it  (was: HIVE-9845 makes HCatSplit.write modify the 
split so that PartitionInfo objects are unusable after it)

> HIVE-9845 makes HCatSplit.write modify the split so that PartInfo objects are 
> unusable after it
> ---
>
> Key: HIVE-11344
> URL: https://issues.apache.org/jira/browse/HIVE-11344
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-11344.patch
>
>
> HIVE-9845 introduced a notion of compression for HCatSplits so that when 
> serializing, it finds commonalities between PartInfo and TableInfo objects, 
> and if the two are identical, it nulls out that field in PartInfo, thus 
> making sure that when PartInfo is then serialized, info is not repeated.
> This, however, has the side effect of making the PartInfo object unusable if 
> HCatSplit.write has been called.
> While this does not affect M/R directly (M/R does not know about the 
> PartInfo objects, and once serialized, the HCatSplit object is recreated by 
> deserialization on the backend, which restores the split and its PartInfo 
> objects), it does affect framework users of HCat that try to mimic M/R and 
> then use the PartInfo objects to instantiate distinct readers.
> Thus, we need to make it so that PartInfo is still usable after 
> HCatSplit.write is called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1

2015-07-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637759#comment-14637759
 ] 

Sergey Shelukhin commented on HIVE-11259:
-

We were discussing putting it inside orc-encoded or some other module inside 
ORC. Would you want to clone Reader for that?

> LLAP: clean up ORC dependencies part 1
> --
>
> Key: HIVE-11259
> URL: https://issues.apache.org/jira/browse/HIVE-11259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11259.01.patch, HIVE-11259.patch
>
>
> Before there's a storage handler module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1

2015-07-22 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637749#comment-14637749
 ] 

Owen O'Malley commented on HIVE-11259:
--

I can't see any application for EncodedReader other than LLAP, and it 
clearly shouldn't be part of the core ORC API.

Exposing the TreeReader API isn't great, but at least there are other 
applications for it. Certainly for a while, we'll need to mark the API as 
evolving instead of stable. From my point of view, enabling dependency 
injection isn't a bad thing for ORC. :)

> LLAP: clean up ORC dependencies part 1
> --
>
> Key: HIVE-11259
> URL: https://issues.apache.org/jira/browse/HIVE-11259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11259.01.patch, HIVE-11259.patch
>
>
> Before there's a storage handler module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1

2015-07-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637722#comment-14637722
 ] 

Sergey Shelukhin commented on HIVE-11259:
-

It cannot be moved out of ORC unless ORC exposes a LOT of things publicly, 
so that anyone can create custom readers outside of the main project, and 
then ensures backward compatibility for all of these things.

> LLAP: clean up ORC dependencies part 1
> --
>
> Key: HIVE-11259
> URL: https://issues.apache.org/jira/browse/HIVE-11259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11259.01.patch, HIVE-11259.patch
>
>
> Before there's a storage handler module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1

2015-07-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637718#comment-14637718
 ] 

Sergey Shelukhin commented on HIVE-11259:
-

EncodedReader is part of ORC; it's not specific to LLAP, only influenced by 
LLAP in its API design.
Similarly, the fact that the Reader/RecordReader/etc. APIs were dictated by 
Hive doesn't make them part of Hive; they are still part of ORC.
It's a different interface for reading ORC files. How would a factory be 
passed in to create it?
Also, if the dependency is removed, then to avoid creating two different 
Readers with potentially two FS objects and files, I'd need to clone 
Reader/ReaderImpl and duplicate half the functionality, only changing the 
record reader type. Why can a reader create a RecordReader but not an 
EncodedReader?



> LLAP: clean up ORC dependencies part 1
> --
>
> Key: HIVE-11259
> URL: https://issues.apache.org/jira/browse/HIVE-11259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11259.01.patch, HIVE-11259.patch
>
>
> Before there's a storage handler module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11344) HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo objects are unusable after it

2015-07-22 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11344:

Attachment: HIVE-11344.patch

Patch implementing (a) attached.

> HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo 
> objects are unusable after it
> 
>
> Key: HIVE-11344
> URL: https://issues.apache.org/jira/browse/HIVE-11344
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-11344.patch
>
>
> HIVE-9845 introduced a notion of compression for HCatSplits so that when 
> serializing, it finds commonalities between PartInfo and TableInfo objects, 
> and if the two are identical, it nulls out that field in PartInfo, thus 
> making sure that when PartInfo is then serialized, info is not repeated.
> This, however, has the side effect of making the PartInfo object unusable if 
> HCatSplit.write has been called.
> While this does not affect M/R directly (M/R does not know about the 
> PartInfo objects, and once serialized, the HCatSplit object is recreated by 
> deserialization on the backend, which restores the split and its PartInfo 
> objects), it does affect framework users of HCat that try to mimic M/R and 
> then use the PartInfo objects to instantiate distinct readers.
> Thus, we need to make it so that PartInfo is still usable after 
> HCatSplit.write is called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11344) HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo objects are unusable after it

2015-07-22 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637699#comment-14637699
 ] 

Sushanth Sowmyan commented on HIVE-11344:
-

[~mithun], could you please review?

> HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo 
> objects are unusable after it
> 
>
> Key: HIVE-11344
> URL: https://issues.apache.org/jira/browse/HIVE-11344
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-11344.patch
>
>
> HIVE-9845 introduced a notion of compression for HCatSplits so that when 
> serializing, it finds commonalities between PartInfo and TableInfo objects, 
> and if the two are identical, it nulls out that field in PartInfo, thus 
> making sure that when PartInfo is then serialized, info is not repeated.
> This, however, has the side effect of making the PartInfo object unusable if 
> HCatSplit.write has been called.
> While this does not affect M/R directly (M/R does not know about the 
> PartInfo objects, and once serialized, the HCatSplit object is recreated by 
> deserialization on the backend, which restores the split and its PartInfo 
> objects), it does affect framework users of HCat that try to mimic M/R and 
> then use the PartInfo objects to instantiate distinct readers.
> Thus, we need to make it so that PartInfo is still usable after 
> HCatSplit.write is called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1

2015-07-22 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637686#comment-14637686
 ] 

Owen O'Malley commented on HIVE-11259:
--

That will remove a lot of the entanglement with Allocator and DataCache.

> LLAP: clean up ORC dependencies part 1
> --
>
> Key: HIVE-11259
> URL: https://issues.apache.org/jira/browse/HIVE-11259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11259.01.patch, HIVE-11259.patch
>
>
> Before there's a storage handler module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11344) HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo objects are unusable after it

2015-07-22 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637678#comment-14637678
 ] 

Sushanth Sowmyan commented on HIVE-11344:
-

There are three routes I see available here:

a) There is decompress logic in PartInfo.setTableInfo, and compress logic in 
PartInfo.writeObject. We could make it so that PartInfo.writeObject does the 
"compression", writes itself, and then decompresses itself back.
b) We could decompress on demand - wherein if a user calls 
getInputFormatClassName(), we fetch that info if it's not available, and 
always return values consistently.
c) We could add a new conf parameter that controls whether or not we do 
compression - users with 100k splits would prefer compression, and be okay 
with the fact that PartInfo objects are not usable, and users that want to 
use the PartInfo objects would be okay with the fact that they take up a 
little more serialized space.

(c) is a bad solution all-round. [~ashutoshc] would be mad at me for adding 
another conf parameter, and it is entirely possible that those trying to 
implement other streaming interfaces/etc. while mimicking M/R will run into 
a large number of partitions as well.
(b) is nifty, and I rather like the idea, but I'm not entirely certain 
whether it will run afoul of other serialization methods in the future that 
call getters to read fields (some JSON serializers), which might result in a 
bloated serialized PartInfo object anyway. Also, it spreads the 
decompression logic across multiple getters, and pushes the assert statement 
into multiple places as well.
(a) is probably the cleanest solution, although it makes a code reader 
wonder why we're going through these gymnastics. Some code comments would 
help with that.
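Option (a) can be sketched with a toy Serializable class (field names are illustrative, not the actual HCatalog ones): compress inside writeObject, then restore in a finally block so the in-memory object stays usable after serialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Toy model of option (a): null out a field duplicated by the table-level
// info for the duration of serialization, then restore it, so the live
// object remains usable after write.
public class ToyPartInfo implements Serializable {
    String inputFormatClassName = "OrcInputFormat";
    transient String tableLevelDefault = "OrcInputFormat"; // stand-in for TableInfo

    private void writeObject(ObjectOutputStream oos) throws IOException {
        String saved = inputFormatClassName;
        if (saved != null && saved.equals(tableLevelDefault)) {
            inputFormatClassName = null;      // "compression": drop the duplicate
        }
        try {
            oos.defaultWriteObject();         // serialize the slimmed object
        } finally {
            inputFormatClassName = saved;     // "decompression back": restore it
        }
    }

    public static void main(String[] args) throws IOException {
        ToyPartInfo p = new ToyPartInfo();
        new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(p);
        System.out.println(p.inputFormatClassName);  // prints OrcInputFormat
    }
}
```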


> HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo 
> objects are unusable after it
> 
>
> Key: HIVE-11344
> URL: https://issues.apache.org/jira/browse/HIVE-11344
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>
> HIVE-9845 introduced a notion of compression for HCatSplits so that when 
> serializing, it finds commonalities between PartInfo and TableInfo objects, 
> and if the two are identical, it nulls out that field in PartInfo, thus 
> making sure that when PartInfo is then serialized, info is not repeated.
> This, however, has the side effect of making the PartInfo object unusable if 
> HCatSplit.write has been called.
> While this does not affect M/R directly (M/R does not know about the 
> PartInfo objects, and once serialized, the HCatSplit object is recreated by 
> deserialization on the backend, which restores the split and its PartInfo 
> objects), it does affect framework users of HCat that try to mimic M/R and 
> then use the PartInfo objects to instantiate distinct readers.
> Thus, we need to make it so that PartInfo is still usable after 
> HCatSplit.write is called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10799) Refactor the SearchArgumentFactory to remove the dependence on ExprNodeGenericFuncDesc

2015-07-22 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637657#comment-14637657
 ] 

Owen O'Malley commented on HIVE-10799:
--

Given that ORC files currently store char column values space-padded, we 
need to make the sarg code pad the literals to the right width. I don't 
think we need a CHAR type in the sarg API.
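A minimal sketch of the padding this implies (hypothetical helper, not the actual sarg code): expand the literal with trailing spaces to the declared CHAR width before comparison.

```java
// Hypothetical helper: pad a CHAR literal with trailing spaces to the
// declared column width, matching ORC's space-padded storage of char
// values, so string comparison in the sarg lines up without a CHAR type.
public class CharPadSketch {
    static String padCharLiteral(String literal, int charLength) {
        StringBuilder sb = new StringBuilder(literal);
        while (sb.length() < charLength) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A CHAR(5) column stores "abc" as "abc  ".
        System.out.println('[' + padCharLiteral("abc", 5) + ']');  // prints [abc  ]
    }
}
```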

> Refactor the SearchArgumentFactory to remove the dependence on 
> ExprNodeGenericFuncDesc
> --
>
> Key: HIVE-10799
> URL: https://issues.apache.org/jira/browse/HIVE-10799
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-10799.patch, HIVE-10799.patch, HIVE-10799.patch, 
> HIVE-10799.patch, HIVE-10799.patch
>
>
> SearchArgumentFactory and SearchArgumentImpl are high level and shouldn't 
> depend on the internals of Hive's AST model.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1

2015-07-22 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637649#comment-14637649
 ] 

Owen O'Malley commented on HIVE-11259:
--

You need to remove the dependence between Reader/ReaderImpl and 
EncodedReader/EncodedReaderImpl. I'd suggest passing a factory object into 
the OrcFile.ReaderOptions that can control the implementation of the 
TreeReaders and RecordReader.

Basically, the goal is to make it so that LLAP can pass in a factory object 
that lets it control the behavior of the RecordReader and TreeReaders without 
making the ORC reader depend on LLAP.
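The factory idea can be sketched with toy types (all names illustrative; the real hook would live on OrcFile.ReaderOptions): the caller injects a factory and the reader builds its record reader through it, so ORC never references the caller's classes.

```java
// Toy sketch of factory injection: "reader options" carry a factory, and
// the reader consults it when constructing its record reader, so an
// alternate implementation (e.g. an LLAP one) can be plugged in from
// outside without a reverse dependency.
public class FactoryInjectionSketch {
    interface RecordReaderFactory {
        String createRecordReader();   // stands in for building a real reader
    }

    static class ToyReaderOptions {
        RecordReaderFactory factory = () -> "plain-record-reader";

        ToyReaderOptions recordReaderFactory(RecordReaderFactory f) {
            this.factory = f;
            return this;
        }
    }

    static String openReader(ToyReaderOptions opts) {
        return opts.factory.createRecordReader();  // implementation chosen by caller
    }

    public static void main(String[] args) {
        ToyReaderOptions opts = new ToyReaderOptions()
            .recordReaderFactory(() -> "llap-encoded-record-reader");
        System.out.println(openReader(opts));  // prints llap-encoded-record-reader
    }
}
```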


> LLAP: clean up ORC dependencies part 1
> --
>
> Key: HIVE-11259
> URL: https://issues.apache.org/jira/browse/HIVE-11259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11259.01.patch, HIVE-11259.patch
>
>
> Before there's storage handler module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11317) ACID: Improve transaction Abort logic due to timeout

2015-07-22 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11317:
--
Description: 
the logic to Abort transactions that have stopped heartbeating is in
TxnHandler.timeOutTxns()
This is only called when DbTxnManager.getValidTxns() is called.
So if there are a lot of txns that need to be timed out and there are no 
SQL clients talking to the system, nothing aborts the dead transactions, and 
thus compaction can't clean them up, so garbage accumulates in the system.

Also, streaming api doesn't call DbTxnManager at all.

Need to move this logic into Initiator (or some other metastore side thread).
Also, make sure it is broken up into multiple small(er) transactions against 
metastore DB.

Also move the timeOutLocks() logic there as well.


see about adding TXNS.COMMENT field which can be used for "Auto aborted due to 
timeout" for example.

  was:
the logic to Abort transactions that have stopped heartbeating is in
TxnHandler.timeOutTxns()
This is only called when DbTxnManager.getValidTxns() is called.
So if there are a lot of txns that need to be timed out and there are no 
SQL clients talking to the system, nothing aborts the dead transactions, and 
thus compaction can't clean them up, so garbage accumulates in the system.

Also, streaming api doesn't call DbTxnManager at all.

Need to move this logic into Initiator (or some other metastore side thread).
Also, make sure it is broken up into multiple small(er) transactions against 
metastore DB.

Also more timeOutLocks() locks there as well.



> ACID: Improve transaction Abort logic due to timeout
> 
>
> Key: HIVE-11317
> URL: https://issues.apache.org/jira/browse/HIVE-11317
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>  Labels: triage
>
> The logic to abort transactions that have stopped heartbeating is in
> TxnHandler.timeOutTxns().
> This is only called when DbTxnManager.getValidTxns() is called.
> So if there are a lot of txns that need to be timed out and there are no
> SQL clients talking to the system, nothing aborts the dead transactions, and
> thus compaction can't clean them up, so garbage accumulates in the system.
> Also, streaming api doesn't call DbTxnManager at all.
> Need to move this logic into Initiator (or some other metastore side thread).
> Also, make sure it is broken up into multiple small(er) transactions against 
> metastore DB.
> Also move the timeOutLocks() logic there as well.
> see about adding TXNS.COMMENT field which can be used for "Auto aborted due 
> to timeout" for example.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x

2015-07-22 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637597#comment-14637597
 ] 

Prasanth Jayachandran commented on HIVE-11304:
--

[~gopalv] Can you please review the patch?

> Migrate to Log4j2 from Log4j 1.x
> 
>
> Key: HIVE-11304
> URL: https://issues.apache.org/jira/browse/HIVE-11304
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11304.2.patch, HIVE-11304.patch
>
>
> Log4J2 has some great benefits and can benefit hive significantly. Some 
> notable features include
> 1) Performance (parametrized logging, performance when logging is disabled 
> etc.) More details can be found here 
> https://logging.apache.org/log4j/2.x/performance.html
> 2) RoutingAppender - Route logs to different log files based on MDC context 
> (useful for HS2, LLAP etc.)
> 3) Asynchronous logging
> This is an umbrella jira to track changes related to Log4j2 migration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x

2015-07-22 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637596#comment-14637596
 ] 

Prasanth Jayachandran commented on HIVE-11304:
--

[~thejas] Can you take a look at the changes to LogDivertAppender? I am not
sure of the purpose of the doAppend() method that was there in the initial
implementation. In my understanding, it seems to be checking for any change in
verbosity before writing every log line; if the verbosity changes, it switches
to a different layout. If that's the case, under what circumstances can the
verbosity change? Is there a test case to verify changing verbosity?

> Migrate to Log4j2 from Log4j 1.x
> 
>
> Key: HIVE-11304
> URL: https://issues.apache.org/jira/browse/HIVE-11304
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11304.2.patch, HIVE-11304.patch
>
>
> Log4J2 has some great benefits and can benefit hive significantly. Some 
> notable features include
> 1) Performance (parametrized logging, performance when logging is disabled 
> etc.) More details can be found here 
> https://logging.apache.org/log4j/2.x/performance.html
> 2) RoutingAppender - Route logs to different log files based on MDC context 
> (useful for HS2, LLAP etc.)
> 3) Asynchronous logging
> This is an umbrella jira to track changes related to Log4j2 migration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager

2015-07-22 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637598#comment-14637598
 ] 

Pengcheng Xiong commented on HIVE-11077:


Hi [~ekoifman], I came across your patch when I was looking at Hive master. I
saw that the following keywords were added to the non-reserved list. However,
some of them are actually reserved (marked with R) under SQL:2011, according to
http://www.postgresql.org/docs/9.2/static/sql-keywords-appendix.html. Adding
them to the non-reserved list will introduce ambiguity into the grammar. Would
you mind removing them? If you agree, I can do it for you too. Thanks. :)
{code}
KW_WORK
KW_START (R)
KW_TRANSACTION
KW_COMMIT (R)
KW_ROLLBACK (R)
KW_ONLY (R)
KW_WRITE
KW_ISOLATION
KW_LEVEL
KW_SNAPSHOT
KW_AUTOCOMMIT
{code} 

> Add support in parser and wire up to txn manager
> 
>
> Key: HIVE-11077
> URL: https://issues.apache.org/jira/browse/HIVE-11077
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.3.0
>
> Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, 
> HIVE-11077.6.patch, HIVE-11077.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11343) Merge trunk to hbase-metastore branch

2015-07-22 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates resolved HIVE-11343.
---
   Resolution: Fixed
Fix Version/s: hbase-metastore-branch

Done.

> Merge trunk to hbase-metastore branch
> -
>
> Key: HIVE-11343
> URL: https://issues.apache.org/jira/browse/HIVE-11343
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: hbase-metastore-branch
>
>
> Periodic merge



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11333) CBO: Calcite Operator To Hive Operator (Calcite Return Path): ColumnPruner prunes columns of UnionOperator that should be kept

2015-07-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11333:
---
Attachment: HIVE-11333.02.patch

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): ColumnPruner 
> prunes columns of UnionOperator that should be kept
> --
>
> Key: HIVE-11333
> URL: https://issues.apache.org/jira/browse/HIVE-11333
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11333.01.patch, HIVE-11333.02.patch
>
>
> UnionOperator takes its schema from the operator in the first branch.
> Because ColumnPruner prunes columns based on the internal name, columns in
> other branches may be pruned because their internal names differ from those
> in the first branch. To repro, run rcfile_union.q with the return path
> turned on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11331) Doc Notes

2015-07-22 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11331:
--
Description: 
This ticket is to track various doc-related issues for HIVE-9675, since the
work is spread out over time.

1. calling set autocommit = true while a transaction is open will commit the 
transaction
2. document multi-statement transactions support

  was:
This ticket is to track various doc related issues for HIVE-9675 since the 
works is spread out over time.

1. calling set autocommit = true while a transaction is open will commit the 
transaction
2. 


> Doc Notes
> -
>
> Key: HIVE-11331
> URL: https://issues.apache.org/jira/browse/HIVE-11331
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>
> This ticket is to track various doc-related issues for HIVE-9675, since the
> work is spread out over time.
> 1. calling set autocommit = true while a transaction is open will commit the 
> transaction
> 2. document multi-statement transactions support



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-07-22 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11306:
---
Attachment: HIVE-11306.2.patch

Fix assertion

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x

2015-07-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11304:
-
Attachment: HIVE-11304.2.patch

Should fix test failures.

> Migrate to Log4j2 from Log4j 1.x
> 
>
> Key: HIVE-11304
> URL: https://issues.apache.org/jira/browse/HIVE-11304
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11304.2.patch, HIVE-11304.patch
>
>
> Log4J2 has some great benefits and can benefit hive significantly. Some 
> notable features include
> 1) Performance (parametrized logging, performance when logging is disabled 
> etc.) More details can be found here 
> https://logging.apache.org/log4j/2.x/performance.html
> 2) RoutingAppender - Route logs to different log files based on MDC context 
> (useful for HS2, LLAP etc.)
> 3) Asynchronous logging
> This is an umbrella jira to track changes related to Log4j2 migration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-07-22 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11306:
---
Attachment: (was: HIVE-11306.2.patch)

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11306.1.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1

2015-07-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637575#comment-14637575
 ] 

Sergey Shelukhin commented on HIVE-11259:
-

Got rid of TrackedCacheChunk, renamed the confusingly named StreamBuffer, and
added better comments to it and to ProcCacheChunk.

> LLAP: clean up ORC dependencies part 1
> --
>
> Key: HIVE-11259
> URL: https://issues.apache.org/jira/browse/HIVE-11259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11259.01.patch, HIVE-11259.patch
>
>
> Before there's a storage handler module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11259) LLAP: clean up ORC dependencies part 1

2015-07-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11259:

Attachment: HIVE-11259.01.patch

> LLAP: clean up ORC dependencies part 1
> --
>
> Key: HIVE-11259
> URL: https://issues.apache.org/jira/browse/HIVE-11259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11259.01.patch, HIVE-11259.patch
>
>
> Before there's a storage handler module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-07-22 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11306:
---
Attachment: (was: HIVE-11306.2.patch)

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-07-22 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11306:
---
Attachment: HIVE-11306.2.patch

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-07-22 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11306:
---
Attachment: HIVE-11306.2.patch

[~wzheng]: Updated the patch to make sure that the hashMapResult has 0 rows
for the NOMATCH case.

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11316) Use datastructure that doesnt duplicate any part of string for ASTNode::toStringTree()

2015-07-22 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637541#comment-14637541
 ] 

Eugene Koifman commented on HIVE-11316:
---

My concern was with modifying the existing toStringTree() in a dangerous way.
Patch 3 keeps toStringTree() as is and adds a new "optimized" method.

[~hsubramaniyan] and I just discussed this, and there is a best-of-both-worlds
approach. ASTNode only has 5 or 6 (inherited) methods that allow tree
modification. We could override each one to set a flag indicating that the
cached "toString" needs to be recomputed. That way no one even has to know
about the caching.
[~jcamachorodriguez], does this seem reasonable?
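
The dirty-flag scheme discussed above could look roughly like this; CachedNode
is a hypothetical stand-in for ASTNode (a full version would also propagate
the flag up to ancestors on child mutation):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of caching toStringTree() with invalidation on mutation.
 * Every tree-mutating method clears the cached string, so callers never
 * observe a stale result. Illustrative only, not Hive's ASTNode.
 */
public class CachedNode {
  private final String token;
  private final List<CachedNode> children = new ArrayList<>();
  private String cached;   // null means "needs recompute"

  public CachedNode(String token) { this.token = token; }

  public void addChild(CachedNode child) {
    children.add(child);
    cached = null;         // mutation invalidates the cache
  }

  public String toStringTree() {
    if (cached == null) {
      StringBuilder sb = new StringBuilder();
      build(sb);
      cached = sb.toString();
    }
    return cached;
  }

  private void build(StringBuilder sb) {
    if (children.isEmpty()) { sb.append(token); return; }
    sb.append('(').append(token);
    for (CachedNode c : children) {
      sb.append(' ');
      c.build(sb);
    }
    sb.append(')');
  }
}
```

Repeated toStringTree() calls then reuse the cached string until the next
mutating call resets it.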



> Use datastructure that doesnt duplicate any part of string for 
> ASTNode::toStringTree()
> --
>
> Key: HIVE-11316
> URL: https://issues.apache.org/jira/browse/HIVE-11316
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11316-branch-1.0.patch, 
> HIVE-11316-branch-1.2.patch, HIVE-11316.1.patch, HIVE-11316.2.patch, 
> HIVE-11316.3.patch
>
>
> HIVE-11281 uses an approach to memoize toStringTree() for ASTNode. This jira 
> is suppose to alter the string memoization to use a different data structure 
> that doesn't duplicate any part of the string so that we do not run into OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-07-22 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637538#comment-14637538
 ] 

Wei Zheng commented on HIVE-11306:
--

Only in the left outer join case do we set joinNeeded to true when the return
isn't SPILL.

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11306.1.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-07-22 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637524#comment-14637524
 ] 

Gopal V commented on HIVE-11306:


Can't be, because the test-case that's broken is an inner join.

All further checks are actually for {{if (joinNeeded) {}}

So, the core issue is that there's no check for MATCH, so the inner join thinks 
it got a MATCH if the return isn't SPILL.

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11306.1.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-07-22 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637516#comment-14637516
 ] 

Wei Zheng commented on HIVE-11306:
--

Looks like that's the problem. The bloom test early-determines nomatch, which 
is good, but broke the left outer join assumption.

So maybe the right logic should be:
{code}
if (!bloom1.testLong(keyHash) && !isOnDisk(partitionId)) {
  ...
  return JoinUtil.JoinResult.NOMATCH;
}
// otherwise just pass along to the next round (join for the spill partition)
// to decide what to do
{code}
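
For reference, the bloom1.testLong() gate above can be thought of as a
one-hash-probe filter. Here is a minimal, illustrative sketch (not the
implementation from the patch): one hash selects a 64-bit word and sets a
couple of bits inside it, so each probe costs a single memory access, and a
NOMATCH answer is always exact (no false negatives).

```java
/**
 * Minimal "bloom-1" filter sketch: one hash picks a word, a few bits are set
 * within that word, so membership tests touch one cache line. Illustrative
 * only; class and method names are assumptions, not Hive's API.
 */
public class Bloom1 {
  private final long[] words;
  private final int mask;

  public Bloom1(int numWords) {     // numWords must be a power of two
    this.words = new long[numWords];
    this.mask = numWords - 1;
  }

  private static long mix(long h) { // cheap 64-bit hash finalizer
    h ^= h >>> 33;
    h *= 0xff51afd7ed558ccdL;
    h ^= h >>> 33;
    return h;
  }

  public void addLong(long key) {
    long h = mix(key);
    int w = (int) (h & mask);
    // derive two in-word bit positions from the upper hash bits
    words[w] |= (1L << ((h >>> 32) & 63)) | (1L << ((h >>> 38) & 63));
  }

  public boolean testLong(long key) {
    long h = mix(key);
    long bits = (1L << ((h >>> 32) & 63)) | (1L << ((h >>> 38) & 63));
    return (words[(int) (h & mask)] & bits) == bits;  // may false-positive
  }
}
```

A true result may be a false positive (the join still verifies the key), but
a false result is definitive, which is what makes the early NOMATCH safe for
inner joins.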

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11306.1.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-07-22 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637461#comment-14637461
 ] 

Gopal V commented on HIVE-11306:


Looks like this might be due to the {{&& joinResult != 
JoinUtil.JoinResult.SPILL)}} in the MapJoinOperator::process().

{code}
if (!noOuterJoin) {
  // For Hybrid Grace Hash Join, during the 1st round processing,
  // we only keep the LEFT side if the row is not spilled
  if (!conf.isHybridHashJoin() || hybridMapJoinLeftover
      || (!hybridMapJoinLeftover && joinResult != JoinUtil.JoinResult.SPILL)) {
    joinNeeded = true;
    storage[pos] = dummyObjVectors[pos];
  }
} else {
  storage[pos] = emptyList;
}
{code}

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11306.1.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11305) LLAP: Hybrid Map-join cache returns invalid data

2015-07-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637457#comment-14637457
 ] 

Hive QA commented on HIVE-11305:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746436/HIVE-11305.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4690/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4690/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4690/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4690/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   2f0ae24..afab133  branch-1.1 -> origin/branch-1.1
   1a1c0d8..a310524  hbase-metastore -> origin/hbase-metastore
   72f97fc..2240dbd  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 72f97fc HIVE-11303: Getting Tez LimitExceededException after dag 
execution on large query (Jason Dere, reviewed by Gopal V)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 3 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at 2240dbd HIVE-11254 Process result sets returned by a stored 
procedure (Dmitry Tolpeko via gates)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746436 - PreCommit-HIVE-TRUNK-Build

> LLAP: Hybrid Map-join cache returns invalid data 
> -
>
> Key: HIVE-11305
> URL: https://issues.apache.org/jira/browse/HIVE-11305
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
> Environment: TPC-DS 200 scale data
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Critical
> Fix For: llap
>
> Attachments: HIVE-11305.patch, q55-test.sql
>
>
> Start a 1-node LLAP cluster with 16 executors and run attached test-case on 
> the single node instance.
> {code}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer cannot be 
> cast to 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.hashtable.VectorMapJoinTableContainer
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.loadHashTable(VectorMapJoinCommonOperator.java:648)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:314)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1104)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileCha

[jira] [Updated] (HIVE-11300) HBase metastore: Support token and master key methods

2015-07-22 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-11300:
--
Attachment: HIVE-11300.2.patch

A new version of the patch rebased after HIVE-11294 was checked in.

> HBase metastore: Support token and master key methods
> -
>
> Key: HIVE-11300
> URL: https://issues.apache.org/jira/browse/HIVE-11300
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-11300.2.patch, HIVE-11300.patch
>
>
> The methods addToken, removeToken, getToken, getAllTokenIdentifiers, 
> addMasterKey, updateMasterKey, removeMasterKey, and getMasterKeys() need to 
> be implemented.  They are all used in security.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11305) LLAP: Hybrid Map-join cache returns invalid data

2015-07-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637408#comment-14637408
 ] 

Sergey Shelukhin commented on HIVE-11305:
-

ping? [~gopalv]

> LLAP: Hybrid Map-join cache returns invalid data 
> -
>
> Key: HIVE-11305
> URL: https://issues.apache.org/jira/browse/HIVE-11305
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
> Environment: TPC-DS 200 scale data
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Critical
> Fix For: llap
>
> Attachments: HIVE-11305.patch, q55-test.sql
>
>
> Start a 1-node LLAP cluster with 16 executors and run attached test-case on 
> the single node instance.
> {code}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer cannot be 
> cast to 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.hashtable.VectorMapJoinTableContainer
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.loadHashTable(VectorMapJoinCommonOperator.java:648)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:314)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1104)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:37)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
> ... 17 more
> {code}
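The ClassCastException above suggests the join cache hands back a container of a different type than the vectorized operator expects when it calls loadHashTable. As a hedged illustration only (the class names below are invented stand-ins, not Hive's actual cache API), a type-checked lookup turns that crash into an explicit cache miss the caller can recover from:

```java
// Hypothetical sketch: a shared cache can return a HybridHashTableContainer
// where the vectorized path expects a VectorMapJoinTableContainer; checking
// the type at lookup reports a miss instead of throwing ClassCastException.
import java.util.HashMap;
import java.util.Map;

class JoinCacheSketch {
    interface TableContainer {}
    static class HybridHashTableContainer implements TableContainer {}
    static class VectorMapJoinTableContainer implements TableContainer {}

    private final Map<String, TableContainer> cache = new HashMap<>();

    void put(String key, TableContainer container) {
        cache.put(key, container);
    }

    // Return the cached container only if it has the expected type;
    // otherwise report a miss so the caller can rebuild the right kind.
    <T extends TableContainer> T getIfType(String key, Class<T> expected) {
        TableContainer c = cache.get(key);
        return expected.isInstance(c) ? expected.cast(c) : null;
    }
}
```

The actual fix in Hive may differ; this only shows why an unchecked cast on a heterogeneous cache is fragile.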





[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1

2015-07-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637407#comment-14637407
 ] 

Sergey Shelukhin commented on HIVE-11259:
-

Actually, nm, 3 is also impossible: a Boolean cannot be set, so it cannot be
used as an out parameter.

> LLAP: clean up ORC dependencies part 1
> --
>
> Key: HIVE-11259
> URL: https://issues.apache.org/jira/browse/HIVE-11259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11259.patch
>
>
> Before there's a storage handler module, we can clean some things up.





[jira] [Commented] (HIVE-11254) Process result sets returned by a stored procedure

2015-07-22 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637394#comment-14637394
 ] 

Dmitry Tolpeko commented on HIVE-11254:
---

Thanks, Alan.

> Process result sets returned by a stored procedure
> --
>
> Key: HIVE-11254
> URL: https://issues.apache.org/jira/browse/HIVE-11254
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Fix For: 2.0.0
>
> Attachments: HIVE-11254.1.patch, HIVE-11254.2.patch, 
> HIVE-11254.3.patch, HIVE-11254.4.patch
>
>
> A stored procedure can return one or more result sets. A caller should be
> able to process them.
>  





[jira] [Commented] (HIVE-11341) Avoid expensive resizing of ASTNode tree

2015-07-22 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637282#comment-14637282
 ] 

Mostafa Mokhtar commented on HIVE-11341:


[~hagleitn] [~jcamachorodriguez] [~hsubramaniyan]
FYI 

> Avoid expensive resizing of ASTNode tree 
> -
>
> Key: HIVE-11341
> URL: https://issues.apache.org/jira/browse/HIVE-11341
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Physical Optimizer
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Hari Sankar Sivarama Subramaniyan
>
> {code}
> Stack TraceSample CountPercentage(%) 
> parse.BaseSemanticAnalyzer.analyze(ASTNode, Context)   1,605   90 
>parse.CalcitePlanner.analyzeInternal(ASTNode)   1,605   90 
>   parse.SemanticAnalyzer.analyzeInternal(ASTNode, 
> SemanticAnalyzer$PlannerContext) 1,605   90 
>  parse.CalcitePlanner.genOPTree(ASTNode, 
> SemanticAnalyzer$PlannerContext)  1,604   90 
> parse.SemanticAnalyzer.genOPTree(ASTNode, 
> SemanticAnalyzer$PlannerContext) 1,604   90 
>parse.SemanticAnalyzer.genPlan(QB)  1,604   90 
>   parse.SemanticAnalyzer.genPlan(QB, boolean)  1,604   90 
>  parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map)
>  1,604   90 
> parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, 
> Operator, Map, boolean)  1,603   90 
>parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, 
> Operator, boolean)1,603   90 
>   parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, 
> RowResolver, boolean)1,603   90 
>  
> parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)
> 1,603   90 
> 
> parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx) 
>  1,603   90 
>
> parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx)   1,603   90 
>   
> parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, 
> TypeCheckProcFactory)  1,603   90 
>  
> lib.DefaultGraphWalker.startWalking(Collection, HashMap)  1,579   89 
> 
> lib.DefaultGraphWalker.walk(Node)  1,571   89 
>
> java.util.ArrayList.removeAll(Collection)   1,433   81 
>   
> java.util.ArrayList.batchRemove(Collection, boolean) 1,433   81 
>  
> java.util.ArrayList.contains(Object)  1,228   69 
> 
> java.util.ArrayList.indexOf(Object)1,228   69 
> {code}
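The hot frames in the profile above are ArrayList.removeAll and ArrayList.contains, i.e. a linear membership scan per removed element, which is quadratic on large pending-node lists. A minimal sketch of the usual remedy (illustrative only, not the actual DefaultGraphWalker code) keeps the processed nodes in a HashSet so each membership test is O(1):

```java
// Illustrative sketch: prune processed nodes from a pending list without the
// quadratic ArrayList.removeAll pattern, using a HashSet for membership tests.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class WalkSketch {
    static List<Integer> prune(List<Integer> pending, List<Integer> done) {
        Set<Integer> doneSet = new HashSet<>(done);   // O(d) to build
        List<Integer> remaining = new ArrayList<>();
        for (Integer node : pending) {                // O(p) scan
            if (!doneSet.contains(node)) {            // O(1) per lookup
                remaining.add(node);
            }
        }
        return remaining;
    }
}
```

This replaces the O(p·d) removeAll cost visible in the samples with O(p + d).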





[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager

2015-07-22 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637257#comment-14637257
 ] 

Alan Gates commented on HIVE-11077:
---

+1

> Add support in parser and wire up to txn manager
> 
>
> Key: HIVE-11077
> URL: https://issues.apache.org/jira/browse/HIVE-11077
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, 
> HIVE-11077.6.patch, HIVE-11077.patch
>
>






[jira] [Resolved] (HIVE-11335) Multi-Join Inner Query producing incorrect results

2015-07-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez resolved HIVE-11335.

Resolution: Duplicate

> Multi-Join Inner Query producing incorrect results
> --
>
> Key: HIVE-11335
> URL: https://issues.apache.org/jira/browse/HIVE-11335
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Jesus Camacho Rodriguez
> Attachments: query1.txt, query2.txt
>
>
> Steps to reproduce:
> {code}
> create table log (uid string, uid2 string);
> insert into log values ('1', '1');
> create table user (uid string, name string);
> insert into user values ('1', "test1");
> {code}
> (Query1)
> {code}
> select b.name, c.name from log a
>  left outer join (select uid, name from user) b on (a.uid=b.uid)
>  left outer join user c on (a.uid2=c.uid);
> {code}
> returns the wrong result:
> 1 test1
> Both columns should return test1.
> (Query2) While trying to track down the error, I found that this query
> returns the right result (the join key is different):
> {code}
> select b.name, c.name from log a
>  left outer join (select uid, name from user) b on (a.uid=b.uid)
>  left outer join user c on (a.uid=c.uid);
> {code}
> The explain output is different: Query1's subquery selects only one column,
> but it should select both uid and name.
> {code}
> b:user 
>   TableScan
> alias: user
> Statistics: Num rows: 1 Data size: 7 Basic stats: COMPLETE Column 
> stats: NONE
> Select Operator
>   expressions: uid (type: string)
>   outputColumnNames: _col0
> {code}
> This may be related to HIVE-10996.
> =UPDATE1===
> (Query3) This query returns the correct result:
> {code}
> select b.name, c.name from log a
>  left outer join (select user.uid, user.name from user) b on (a.uid=b.uid)
>  left outer join user c on (a.uid2=c.uid);
> {code}
> The operator tree:
> TS[0]-SEL[1]-RS[5]-JOIN[6]-RS[7]-JOIN[9]-SEL[10]-FS[11]
> TS[2]-RS[4]-JOIN[6]
> TS[3]-RS[8]-JOIN[9]
> In Query1 the SEL[1] rowSchema is wrong: it cannot detect the tabAlias.





[jira] [Updated] (HIVE-9613) Left join query plan outputs wrong column when using subquery

2015-07-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9613:
--
Fix Version/s: 1.1.1

> Left join query plan outputs  wrong column when using subquery
> --
>
> Key: HIVE-9613
> URL: https://issues.apache.org/jira/browse/HIVE-9613
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, Query Planning
>Affects Versions: 0.14.0, 1.0.0
> Environment: apache hadoop 2.5.1 
>Reporter: Li Xin
>Assignee: Gunther Hagleitner
> Fix For: 1.2.0, 1.1.1
>
> Attachments: HIVE-9613.1.patch, test.sql
>
>
> I have a query that outputs a column with the wrong contents when using a
> subquery: the contents of that column equal another column's contents, not
> its own.
> I have three tables,as follows:
> table 1: _hivetemp.category_city_rank_:
> ||category||city||rank||
> |jinrongfuwu|shanghai|1|
> |ktvjiuba|shanghai|2|
> table 2:_hivetemp.category_match_:
> ||src_category_en||src_category_cn||dst_category_en||dst_category_cn||
> |danbaobaoxiantouzi|投资担保|担保/贷款|jinrongfuwu|
> |zpwentiyingshi|娱乐/休闲|KTV/酒吧|ktvjiuba|
> table 3:_hivetemp.city_match_:
> ||src_city_name_en||dst_city_name_en||city_name_cn||
> |sh|shanghai|上海|
> And the query is :
> {code}
> select
> a.category,
> a.city,
> a.rank,
> b.src_category_en,
> c.src_city_name_en
> from
> hivetemp.category_city_rank a
> left outer join
> (select
> src_category_en,
> dst_category_en
> from
> hivetemp.category_match) b
> on  a.category = b.dst_category_en
> left outer join
> (select
> src_city_name_en,
> dst_city_name_en
> from
> hivetemp.city_match) c
> on  a.city = c.dst_city_name_en
> {code}
> which should output the following results (verified on Hive 0.13):
> ||category||city||rank||src_category_en||src_city_name_en||
> |jinrongfuwu|shanghai|1|danbaobaoxiantouzi|sh|
> |ktvjiuba|shanghai|2|zpwentiyingshi|sh|
> but in Hive 0.14 the results in the column *src_category_en* are wrong and
> simply repeat the *city* contents:
> ||category||city||rank||src_category_en||src_city_name_en||
> |jinrongfuwu|shanghai|1|shanghai|sh|
> |ktvjiuba|shanghai|2|shanghai|sh|
> Examining the execution plan with explain, I can see that the first subquery
> outputs only the *dst_category_en* column; *src_category_en* is missing.
> {quote}
>b:category_match
>   TableScan
> alias: category_match
> Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE 
> Column stats: NONE
> Select Operator
>   expressions: dst_category_en (type: string)
>   outputColumnNames: _col1
>   Statistics: Num rows: 131 Data size: 13149 Basic stats: 
> COMPLETE Column stats: NONE
> {quote}





[jira] [Commented] (HIVE-11335) Multi-Join Inner Query producing incorrect results

2015-07-22 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637238#comment-14637238
 ] 

Jesus Camacho Rodriguez commented on HIVE-11335:


[~fatkun], this is a duplicate of HIVE-9613, which had not been committed to
1.1 (in fact, the issue cannot be reproduced in other versions). That patch
solves the problem. I have just backported it, so I am marking this issue as a
duplicate.

> Multi-Join Inner Query producing incorrect results
> --
>
> Key: HIVE-11335
> URL: https://issues.apache.org/jira/browse/HIVE-11335
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Jesus Camacho Rodriguez
> Attachments: query1.txt, query2.txt
>
>
> Steps to reproduce:
> {code}
> create table log (uid string, uid2 string);
> insert into log values ('1', '1');
> create table user (uid string, name string);
> insert into user values ('1', "test1");
> {code}
> (Query1)
> {code}
> select b.name, c.name from log a
>  left outer join (select uid, name from user) b on (a.uid=b.uid)
>  left outer join user c on (a.uid2=c.uid);
> {code}
> returns the wrong result:
> 1 test1
> Both columns should return test1.
> (Query2) While trying to track down the error, I found that this query
> returns the right result (the join key is different):
> {code}
> select b.name, c.name from log a
>  left outer join (select uid, name from user) b on (a.uid=b.uid)
>  left outer join user c on (a.uid=c.uid);
> {code}
> The explain output is different: Query1's subquery selects only one column,
> but it should select both uid and name.
> {code}
> b:user 
>   TableScan
> alias: user
> Statistics: Num rows: 1 Data size: 7 Basic stats: COMPLETE Column 
> stats: NONE
> Select Operator
>   expressions: uid (type: string)
>   outputColumnNames: _col0
> {code}
> This may be related to HIVE-10996.
> =UPDATE1===
> (Query3) This query returns the correct result:
> {code}
> select b.name, c.name from log a
>  left outer join (select user.uid, user.name from user) b on (a.uid=b.uid)
>  left outer join user c on (a.uid2=c.uid);
> {code}
> The operator tree:
> TS[0]-SEL[1]-RS[5]-JOIN[6]-RS[7]-JOIN[9]-SEL[10]-FS[11]
> TS[2]-RS[4]-JOIN[6]
> TS[3]-RS[8]-JOIN[9]
> In Query1 the SEL[1] rowSchema is wrong: it cannot detect the tabAlias.





[jira] [Commented] (HIVE-11271) java.lang.IndexOutOfBoundsException when union all with if function

2015-07-22 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637229#comment-14637229
 ] 

Yongzhi Chen commented on HIVE-11271:
-

[~pxiong], thanks for your advice. Your suggestion should work; I just do not
understand why changing the runtime code is a bad idea. Your plan adds an extra
SEL that does the same thing as the runtime map in this patch, and that extra
select should not perform any better than the Filter with an input-to-output
map. Another advantage of my change is that it requires zero q file changes. : )
I would be happy to make the changes you suggest if you or [~ashutoshc] can
explain why the fix has to happen at compile time. Thanks

> java.lang.IndexOutOfBoundsException when union all with if function
> ---
>
> Key: HIVE-11271
> URL: https://issues.apache.org/jira/browse/HIVE-11271
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11271.1.patch
>
>
> Some queries with UNION ALL as a subquery fail in the MapReduce task with
> this stack trace:
> {noformat}
> 15/07/15 14:19:30 [pool-13-thread-1]: INFO exec.UnionOperator: Initializing 
> operator UNION[104]
> 15/07/15 14:19:30 [Thread-72]: INFO mapred.LocalJobRunner: Map task executor 
> complete.
> 15/07/15 14:19:30 [Thread-72]: WARN mapred.LocalJobRunner: 
> job_local826862759_0005
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>   ... 10 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>   ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>   ... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140)
>   ... 21 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java

[jira] [Commented] (HIVE-11271) java.lang.IndexOutOfBoundsException when union all with if function

2015-07-22 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637177#comment-14637177
 ] 

Pengcheng Xiong commented on HIVE-11271:


[~ashutoshc], thanks a lot for your attention. I applied the patch from
HIVE-11333 and it does not seem to solve the problem here. The problem here is
that we have FIL-UNION, and the FIL has to have 2 columns (1 for the union and
1 for the predicate). The problem in HIVE-11333 is that we have SEL-UNION,
where the column in the SEL got wrongly pruned because of the return path.

However, I still agree with you that a better fix should be at compile time,
not run time. [~ychena], this problem looks similar to the issue mentioned in
https://issues.apache.org/jira/browse/HIVE-10996, although that one deals with
JOIN. A similar solution based on adding a SEL could work like this: (1) In
ColumnPruner, when handling a FIL, collect the columns needed from its child
and the columns used in the predicate. (2) If the former contains the latter,
continue (no problem); otherwise insert a SEL that selects just the needed
columns between the FIL and its child. (3) This fix happens at compile time,
not run time, but it may require updating many q files. Thanks.
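The check described in steps (1)-(2) above boils down to a set-containment test. A tiny sketch (hypothetical names, modeling column names as plain strings rather than Hive's internal descriptors):

```java
// Sketch of the proposed ColumnPruner check (illustrative, not Hive code):
// the child's output columns must cover every column the FIL predicate reads;
// otherwise a SEL projecting just the needed columns has to be inserted
// between the FIL and its child.
import java.util.Set;

class PrunerSketch {
    static boolean coversPredicate(Set<String> neededFromChild,
                                   Set<String> usedInPredicate) {
        return neededFromChild.containsAll(usedInPredicate);
    }
}
```

When coversPredicate returns false, the compile-time fix would insert the extra SEL at that point in the plan.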

> java.lang.IndexOutOfBoundsException when union all with if function
> ---
>
> Key: HIVE-11271
> URL: https://issues.apache.org/jira/browse/HIVE-11271
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11271.1.patch
>
>
> Some queries with UNION ALL as a subquery fail in the MapReduce task with
> this stack trace:
> {noformat}
> 15/07/15 14:19:30 [pool-13-thread-1]: INFO exec.UnionOperator: Initializing 
> operator UNION[104]
> 15/07/15 14:19:30 [Thread-72]: INFO mapred.LocalJobRunner: Map task executor 
> complete.
> 15/07/15 14:19:30 [Thread-72]: WARN mapred.LocalJobRunner: 
> job_local826862759_0005
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>   ... 10 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>   ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>   ... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140)
>   ... 21 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.Unio

[jira] [Updated] (HIVE-11328) Avoid String representation of expression nodes in ConstantPropagateProcFactory unless necessary

2015-07-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11328:
---
Attachment: (was: HIVE-11310.4.branch-1.2.patch)

> Avoid String representation of expression nodes in 
> ConstantPropagateProcFactory unless necessary
> 
>
> Key: HIVE-11328
> URL: https://issues.apache.org/jira/browse/HIVE-11328
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-11328.branch-1.0.patch, 
> HIVE-11328.branch-1.2.patch, HIVE-11328.patch
>
>






[jira] [Updated] (HIVE-11328) Avoid String representation of expression nodes in ConstantPropagateProcFactory unless necessary

2015-07-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11328:
---
Attachment: HIVE-11328.branch-1.2.patch

> Avoid String representation of expression nodes in 
> ConstantPropagateProcFactory unless necessary
> 
>
> Key: HIVE-11328
> URL: https://issues.apache.org/jira/browse/HIVE-11328
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-11328.branch-1.0.patch, 
> HIVE-11328.branch-1.2.patch, HIVE-11328.patch
>
>






[jira] [Updated] (HIVE-11328) Avoid String representation of expression nodes in ConstantPropagateProcFactory unless necessary

2015-07-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11328:
---
Attachment: HIVE-11310.4.branch-1.2.patch

> Avoid String representation of expression nodes in 
> ConstantPropagateProcFactory unless necessary
> 
>
> Key: HIVE-11328
> URL: https://issues.apache.org/jira/browse/HIVE-11328
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-11310.4.branch-1.2.patch, 
> HIVE-11328.branch-1.0.patch, HIVE-11328.patch
>
>






[jira] [Updated] (HIVE-11317) ACID: Improve transaction Abort logic due to timeout

2015-07-22 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated HIVE-11317:
--
Labels: triage  (was: )

> ACID: Improve transaction Abort logic due to timeout
> 
>
> Key: HIVE-11317
> URL: https://issues.apache.org/jira/browse/HIVE-11317
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>  Labels: triage
>
> The logic to abort transactions that have stopped heartbeating is in
> TxnHandler.timeOutTxns(), which is only called when
> DbTxnManager.getValidTxns() is called.
> So if there are a lot of txns that need to be timed out and no SQL clients
> are talking to the system, nothing aborts the dead transactions; compaction
> then can't clean them up, so garbage accumulates in the system.
> Also, the streaming API doesn't call DbTxnManager at all.
> We need to move this logic into the Initiator (or some other metastore-side
> thread), and make sure it is broken up into multiple small(er) transactions
> against the metastore DB.
> timeOutLocks() should be moved there as well.
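The proposal above (a metastore-side thread that aborts dead transactions in small batches) can be sketched roughly as follows; the type and method names are hypothetical, not the actual TxnHandler API:

```java
// Hypothetical reaper sketch: find transactions whose last heartbeat is older
// than the timeout and group them into small batches, so each batch can be
// aborted in its own short transaction against the metastore DB.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class TxnReaperSketch {
    // heartbeats: txnId -> last heartbeat time in milliseconds
    static List<List<Long>> findTimedOutBatches(Map<Long, Long> heartbeats,
                                                long nowMillis,
                                                long timeoutMillis,
                                                int batchSize) {
        List<Long> dead = new ArrayList<>();
        for (Map.Entry<Long, Long> e : heartbeats.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis) {
                dead.add(e.getKey());
            }
        }
        Collections.sort(dead);
        List<List<Long>> batches = new ArrayList<>();
        for (int i = 0; i < dead.size(); i += batchSize) {
            batches.add(new ArrayList<>(
                dead.subList(i, Math.min(i + batchSize, dead.size()))));
        }
        return batches;
    }
}
```

Running this periodically from a metastore-side thread would decouple timeout handling from getValidTxns(), which is the point of the issue.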





[jira] [Updated] (HIVE-11328) Avoid String representation of expression nodes in ConstantPropagateProcFactory unless necessary

2015-07-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11328:
---
Attachment: HIVE-11328.branch-1.0.patch

> Avoid String representation of expression nodes in 
> ConstantPropagateProcFactory unless necessary
> 
>
> Key: HIVE-11328
> URL: https://issues.apache.org/jira/browse/HIVE-11328
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-11328.branch-1.0.patch, HIVE-11328.patch
>
>






[jira] [Commented] (HIVE-11209) Clean up dependencies in HiveDecimalWritable

2015-07-22 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637099#comment-14637099
 ] 

Owen O'Malley commented on HIVE-11209:
--

For most types, it would be a problem. With decimal, we are doing so many
allocations in the inner loop that this won't be noticeable. We really need to
fix, or better yet reimplement, Decimal128 to get good performance on decimal
columns.

If you really want me to fix this instance, how about a thread local? Hive's
overuse of static caches has been a huge source of problems.
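The thread-local alternative mentioned above could look roughly like this (a hypothetical scratch buffer, not the actual HiveDecimalWritable internals): each thread reuses its own buffer, avoiding both per-call allocation and shared static state:

```java
// Hypothetical sketch: a per-thread scratch buffer instead of a static cache.
class ScratchSketch {
    private static final ThreadLocal<byte[]> SCRATCH =
        ThreadLocal.withInitial(() -> new byte[64]);

    static byte[] scratch() {
        return SCRATCH.get();  // same array on repeated calls from one thread
    }
}
```

Unlike a static cache, no synchronization is needed and threads can never observe each other's in-progress buffer contents.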

> Clean up dependencies in HiveDecimalWritable
> 
>
> Key: HIVE-11209
> URL: https://issues.apache.org/jira/browse/HIVE-11209
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.0.0
>
> Attachments: HIVE-11209.patch, HIVE-11209.patch, HIVE-11209.patch, 
> HIVE-11209.patch
>
>
> Currently HiveDecimalWritable depends on:
> * org.apache.hadoop.hive.serde2.ByteStream
> * org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils
> * org.apache.hadoop.hive.serde2.typeinfo.HiveDecimalUtils
> since we need HiveDecimalWritable for the decimal VectorizedColumnBatch, 
> breaking these dependencies will improve things.





[jira] [Updated] (HIVE-11310) Avoid expensive AST tree conversion to String for expressions in WHERE clause

2015-07-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11310:
---
Attachment: HIVE-11310.4.branch-1.0.patch

> Avoid expensive AST tree conversion to String for expressions in WHERE clause
> -
>
> Key: HIVE-11310
> URL: https://issues.apache.org/jira/browse/HIVE-11310
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-11310.1.patch, HIVE-11310.2.patch, 
> HIVE-11310.3.patch, HIVE-11310.4.branch-1.0.patch, 
> HIVE-11310.4.branch-1.2.patch, HIVE-11310.4.patch, HIVE-11310.patch
>
>
> We use the AST tree String representation of a condition in the WHERE clause 
> to identify its column in the RowResolver. This can lead to OOM Exceptions 
> when the condition is very large.
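One way to avoid materializing the String form of each AST subtree (a sketch of the general technique, not necessarily the fix in the attached patches) is to key the resolver lookup by node identity rather than by the node's textual representation:

```java
// Illustrative sketch: an IdentityHashMap keys entries by object identity, so
// no String rendering of the (possibly huge) AST subtree is needed per lookup.
import java.util.IdentityHashMap;
import java.util.Map;

class ResolverSketch {
    static class Node {}  // stand-in for an AST node

    private final Map<Node, String> columnForNode = new IdentityHashMap<>();

    void bind(Node node, String column) {
        columnForNode.put(node, column);
    }

    String resolve(Node node) {
        return columnForNode.get(node);  // O(1), no String built
    }
}
```

The memory cost is then proportional to the number of nodes, not to the rendered length of each condition.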





[jira] [Updated] (HIVE-11341) Avoid expensive resizing of ASTNode tree

2015-07-22 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-11341:
---
Summary: Avoid expensive resizing of ASTNode tree   (was: Avoid resizing of 
ASTNode tree )

> Avoid expensive resizing of ASTNode tree 
> -
>
> Key: HIVE-11341
> URL: https://issues.apache.org/jira/browse/HIVE-11341
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Physical Optimizer
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Hari Sankar Sivarama Subramaniyan
>
> {code}
> Stack TraceSample CountPercentage(%) 
> parse.BaseSemanticAnalyzer.analyze(ASTNode, Context)   1,605   90 
>parse.CalcitePlanner.analyzeInternal(ASTNode)   1,605   90 
>   parse.SemanticAnalyzer.analyzeInternal(ASTNode, 
> SemanticAnalyzer$PlannerContext) 1,605   90 
>  parse.CalcitePlanner.genOPTree(ASTNode, 
> SemanticAnalyzer$PlannerContext)  1,604   90 
> parse.SemanticAnalyzer.genOPTree(ASTNode, 
> SemanticAnalyzer$PlannerContext) 1,604   90 
>parse.SemanticAnalyzer.genPlan(QB)  1,604   90 
>   parse.SemanticAnalyzer.genPlan(QB, boolean)  1,604   90 
>  parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map)
>  1,604   90 
> parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, 
> Operator, Map, boolean)  1,603   90 
>parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, 
> Operator, boolean)1,603   90 
>   parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, 
> RowResolver, boolean)1,603   90 
>  
> parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)
> 1,603   90 
> 
> parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx) 
>  1,603   90 
>
> parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx)   1,603   90 
>   
> parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, 
> TypeCheckProcFactory)  1,603   90 
>  
> lib.DefaultGraphWalker.startWalking(Collection, HashMap)  1,579   89 
> 
> lib.DefaultGraphWalker.walk(Node)  1,571   89 
>
> java.util.ArrayList.removeAll(Collection)   1,433   81 
>   
> java.util.ArrayList.batchRemove(Collection, boolean) 1,433   81 
>  
> java.util.ArrayList.contains(Object)  1,228   69 
> 
> java.util.ArrayList.indexOf(Object)1,228   69 
> {code}




