[jira] [Commented] (HIVE-4243) Fix column names in FileSinkOperator
[ https://issues.apache.org/jira/browse/HIVE-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939676#comment-14939676 ] Owen O'Malley commented on HIVE-4243: - The first two tests pass locally with the patch rebased to master. The last two tests are unrelated and fail on master without the patch. > Fix column names in FileSinkOperator > > > Key: HIVE-4243 > URL: https://issues.apache.org/jira/browse/HIVE-4243 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 1.3.0, 2.0.0 >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.patch, > HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.tmp.patch > > > All of the ObjectInspectors given to SerDe's by FileSinkOperator have virtual > column names. Since the files are part of tables, Hive knows the column > names. For self-describing file formats like ORC, having the real column > names will improve the understandability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
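The improvement HIVE-4243 describes, replacing positional virtual names with the table's real column names, can be sketched outside Hive. This is an illustrative Python sketch, not Hive's implementation; the function name and the `_col` prefix convention for virtual columns are the only assumptions, and the prefix does match what Hive's planner generates.

```python
def resolve_column_names(virtual_names, table_schema):
    """Map positional virtual names (_col0, _col1, ...) back to the
    table's real column names when the table schema is known -- the
    kind of substitution HIVE-4243 wants FileSinkOperator to do for
    self-describing formats like ORC. Illustrative only."""
    resolved = []
    for i, name in enumerate(virtual_names):
        if name.startswith("_col") and i < len(table_schema):
            resolved.append(table_schema[i])  # real name from the table
        else:
            resolved.append(name)             # keep non-virtual names
    return resolved
```

With a two-column table, `resolve_column_names(["_col0", "_col1"], ["id", "name"])` yields the real names `["id", "name"]`, which is what a self-describing file format would then record.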
[jira] [Updated] (HIVE-11973) IN operator fails when the column type is DATE
[ https://issues.apache.org/jira/browse/HIVE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11973: Attachment: (was: HIVE-11973.1.patch) > IN operator fails when the column type is DATE > --- > > Key: HIVE-11973 > URL: https://issues.apache.org/jira/browse/HIVE-11973 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: sanjiv singh >Assignee: Yongzhi Chen > > Test DDL: > {code} > CREATE TABLE `date_dim`( > `d_date_sk` int, > `d_date_id` string, > `d_date` date, > `d_current_week` string, > `d_current_month` string, > `d_current_quarter` string, > `d_current_year` string) ; > {code} > Hive query: > {code} > SELECT * > FROM date_dim > WHERE d_date IN ('2000-03-22','2001-03-22') ; > {code} > In 1.0.0, the above query fails with: > {code} > FAILED: SemanticException [Error 10014]: Line 1:180 Wrong arguments > ''2001-03-22'': The arguments for IN should be the same type! Types are: > {date IN (string, string)} > {code} > I changed the query as follows to get past the error: > {code} > SELECT * > FROM date_dim > WHERE d_date IN (CAST('2000-03-22' AS DATE) , CAST('2001-03-22' AS DATE) > ) ; > {code} > But equality works without casting: > {code} > SELECT * > FROM date_dim > WHERE d_date = '2000-03-22' ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11973) IN operator fails when the column type is DATE
[ https://issues.apache.org/jira/browse/HIVE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11973: Attachment: HIVE-11973.1.patch Not sure why this patch is not in the waiting list of the pre-commit build. Re-attaching it. > IN operator fails when the column type is DATE > --- > > Key: HIVE-11973 > URL: https://issues.apache.org/jira/browse/HIVE-11973 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: sanjiv singh >Assignee: Yongzhi Chen > Attachments: HIVE-11973.1.patch > > > Test DDL: > {code} > CREATE TABLE `date_dim`( > `d_date_sk` int, > `d_date_id` string, > `d_date` date, > `d_current_week` string, > `d_current_month` string, > `d_current_quarter` string, > `d_current_year` string) ; > {code} > Hive query: > {code} > SELECT * > FROM date_dim > WHERE d_date IN ('2000-03-22','2001-03-22') ; > {code} > In 1.0.0, the above query fails with: > {code} > FAILED: SemanticException [Error 10014]: Line 1:180 Wrong arguments > ''2001-03-22'': The arguments for IN should be the same type! Types are: > {date IN (string, string)} > {code} > I changed the query as follows to get past the error: > {code} > SELECT * > FROM date_dim > WHERE d_date IN (CAST('2000-03-22' AS DATE) , CAST('2001-03-22' AS DATE) > ) ; > {code} > But equality works without casting: > {code} > SELECT * > FROM date_dim > WHERE d_date = '2000-03-22' ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
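The behavior the report asks for, letting `IN` coerce string literals to DATE the way `=` already does, can be sketched in Python. This is not Hive code; the function names are invented for illustration, and the coercion rule (parse `YYYY-MM-DD` literals when the column side is a date) is only what the bug report implies the fix should do.

```python
from datetime import date, datetime

def coerce_to_date(value):
    """Convert a 'YYYY-MM-DD' string literal to a date, mimicking the
    implicit string-to-date coercion Hive applies for `=` comparisons."""
    if isinstance(value, date):
        return value
    return datetime.strptime(value, "%Y-%m-%d").date()

def date_in(column_value, literals):
    """Evaluate `column IN (...)` with string literals coerced to DATE
    first -- the behavior HIVE-11973 asks the IN operator to share."""
    return coerce_to_date(column_value) in {coerce_to_date(v) for v in literals}

# The failing query's predicate, evaluated with coercion instead of
# rejecting the mixed {date IN (string, string)} argument types:
date_in(date(2000, 3, 22), ["2000-03-22", "2001-03-22"])  # True
```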
[jira] [Updated] (HIVE-11785) Support escaping carriage return and new line for LazySimpleSerDe
[ https://issues.apache.org/jira/browse/HIVE-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11785: Attachment: HIVE-11785.2.patch Update all the affected thrift code and update the unit test baselines. > Support escaping carriage return and new line for LazySimpleSerDe > - > > Key: HIVE-11785 > URL: https://issues.apache.org/jira/browse/HIVE-11785 > Project: Hive > Issue Type: New Feature > Components: Query Processor >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.0.0 > > Attachments: HIVE-11785.2.patch, HIVE-11785.patch, test.parquet > > > Create the table and perform the queries as follows. You will see different > results when the setting changes. > The expected result should be: > {noformat} > 1 newline > here > 2 carriage return > 3 both > here > {noformat} > {noformat} > hive> create table repo (lvalue int, charstring string) stored as parquet; > OK > Time taken: 0.34 seconds > hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo; > Loading data to table default.repo > chgrp: changing ownership of > 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not > belong to hive > Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, > rawDataSize=0] > OK > Time taken: 0.732 seconds > hive> set hive.fetch.task.conversion=more; > hive> select * from repo; > OK > 1 newline > here > here carriage return > 3 both > here > Time taken: 0.253 seconds, Fetched: 3 row(s) > hive> set hive.fetch.task.conversion=none; > hive> select * from repo; > Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3 > Total jobs = 1 > Launching Job 1 out of 1 > Number of reduce tasks is set to 0 since there's no reduce operator > Starting Job = job_1441752031022_0006, Tracking URL = > http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/ > Kill Command = > 
/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job > -kill job_1441752031022_0006 > Hadoop job information for Stage-1: number of mappers: 1; number of reducers: > 0 > 2015-09-09 11:35:54,127 Stage-1 map = 0%, reduce = 0% > 2015-09-09 11:36:04,664 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.98 > sec > MapReduce Total cumulative CPU time: 2 seconds 980 msec > Ended Job = job_1441752031022_0006 > MapReduce Jobs Launched: > Stage-Stage-1: Map: 1 Cumulative CPU: 2.98 sec HDFS Read: 4251 HDFS > Write: 51 SUCCESS > Total MapReduce CPU Time Spent: 2 seconds 980 msec > OK > 1 newline > NULL NULL > 2 carriage return > NULL NULL > 3 both > NULL NULL > Time taken: 25.131 seconds, Fetched: 6 row(s) > hive> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
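The feature under discussion, escaping embedded carriage returns and newlines so a line-oriented text SerDe does not split records mid-field, can be sketched as a backslash escape scheme. This is an illustrative sketch, not LazySimpleSerDe's actual byte-level encoding; the function names and the choice of `\n`/`\r` escape sequences are assumptions.

```python
def escape_crlf(field, escape_char="\\"):
    """Escape CR and LF (and the escape char itself) so the field fits
    on one physical line of a delimited text file."""
    return (field.replace(escape_char, escape_char * 2)  # escape the escape first
                 .replace("\n", escape_char + "n")
                 .replace("\r", escape_char + "r"))

def unescape_crlf(field, escape_char="\\"):
    """Reverse escape_crlf when reading the row back."""
    out, i = [], 0
    while i < len(field):
        c = field[i]
        if c == escape_char and i + 1 < len(field):
            nxt = field[i + 1]
            out.append({"n": "\n", "r": "\r", escape_char: escape_char}.get(nxt, nxt))
            i += 2
        else:
            out.append(c)
            i += 1
    return "".join(out)
```

The design point is order: the escape character must be doubled before `\n`/`\r` are rewritten, otherwise the reader cannot tell a literal backslash-n from an escaped newline.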
[jira] [Commented] (HIVE-11980) Follow up on HIVE-11696, exception is thrown from CTAS from the table with table-level serde is Parquet while partition-level serde is JSON
[ https://issues.apache.org/jira/browse/HIVE-11980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940004#comment-14940004 ] Aihua Xu commented on HIVE-11980: - The test failures seem unrelated to the patch. [~szehon] can you help review the change? > Follow up on HIVE-11696, exception is thrown from CTAS from the table with > table-level serde is Parquet while partition-level serde is JSON > --- > > Key: HIVE-11980 > URL: https://issues.apache.org/jira/browse/HIVE-11980 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-11980.patch > > > When we create a new table via CTAS from a table whose table-level serde is > Parquet and whose partition-level serde is JSON, the following > exception is currently thrown if there are struct fields. > Apparently, getStructFieldsDataAsList() also needs to handle the case of List > in addition to ArrayWritable, similar to getStructFieldData. 
> {noformat} > Caused by: java.lang.UnsupportedOperationException: Cannot inspect > java.util.ArrayList > at > org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldsDataAsList(ArrayWritableObjectInspector.java:172) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:354) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:257) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:241) > at > org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:720) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162) > at > org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
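The shape of the fix described above (accept a plain `List` as well as an `ArrayWritable` instead of throwing) can be sketched in Python. The `ArrayWritable` class here is a stand-in for Hadoop's Java type, and the function name mirrors the method in the stack trace; everything else is illustrative.

```python
class ArrayWritable:
    """Stand-in for Hadoop's ArrayWritable (simplified for illustration)."""
    def __init__(self, values):
        self._values = list(values)

    def get(self):
        return self._values

def get_struct_fields_data_as_list(data):
    """Sketch of the fixed inspector method: handle List in addition to
    ArrayWritable, rather than rejecting it with 'Cannot inspect'."""
    if data is None:
        return None
    if isinstance(data, ArrayWritable):
        return data.get()
    if isinstance(data, list):  # the case HIVE-11980 adds support for
        return data
    raise TypeError("Cannot inspect " + type(data).__name__)
```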
[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking
[ https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6207: - Assignee: (was: Eugene Koifman) > Integrate HCatalog with locking > --- > > Key: HIVE-6207 > URL: https://issues.apache.org/jira/browse/HIVE-6207 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 0.13.0 >Reporter: Alan Gates > Attachments: ACIDHCatalogDesign.pdf, HIVE-6207.patch > > > HCatalog currently ignores any locks created by Hive users. It should > respect the locks Hive creates as well as create locks itself when locking is > configured. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11175) create function using jar does not work with sql std authorization
[ https://issues.apache.org/jira/browse/HIVE-11175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940066#comment-14940066 ] Thejas M Nair commented on HIVE-11175: -- Sorry about the delay in reviewing the patch! The changes look good. However, it looks like the new test is failing because of a difference in the jar paths seen in the q.out files. It also looks like some of the old UDF test cases need to be updated. We should check for the presence of an additional input entity and one less output entity. The test failure in TestHCatClient.testTableSchemaPropagation seems to be unrelated. > create function using jar does not work with sql std authorization > -- > > Key: HIVE-11175 > URL: https://issues.apache.org/jira/browse/HIVE-11175 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.2.0 >Reporter: Olaf Flebbe >Assignee: Olaf Flebbe > Fix For: 2.0.0 > > Attachments: HIVE-11175.1.patch > > > {code:sql}create function xxx as 'xxx' using jar 'file://foo.jar' {code} > fails with an error requiring ADMIN privileges to access the local foo.jar > resource. Same for HDFS (DFS_URI). > The problem is that the semantic analysis enforces the ADMIN privilege for > write, but the jar is clearly input, not output. > Patch and test case appended. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12005) Remove hbase based stats collection mechanism
[ https://issues.apache.org/jira/browse/HIVE-12005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12005: - Affects Version/s: 2.0.0 > Remove hbase based stats collection mechanism > - > > Key: HIVE-12005 > URL: https://issues.apache.org/jira/browse/HIVE-12005 > Project: Hive > Issue Type: Task > Components: Statistics >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12005.patch > > > Currently, HBase is one of the mechanisms to collect and store statistics. I > have never come across anyone using it. The FileSystem-based collection mechanism > has been the default for a few releases and is working well. We shall remove the > HBase stats collector. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12005) Remove hbase based stats collection mechanism
[ https://issues.apache.org/jira/browse/HIVE-12005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940147#comment-14940147 ] Prasanth Jayachandran commented on HIVE-12005: -- LGTM, +1. Pending tests. Are we getting rid of JDBC as well? JDBC is giving us more trouble, related to maxKeyPrefix limits across different DBs. We are patching it here and there to work around the limits imposed by different DBs. > Remove hbase based stats collection mechanism > - > > Key: HIVE-12005 > URL: https://issues.apache.org/jira/browse/HIVE-12005 > Project: Hive > Issue Type: Task > Components: Statistics >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12005.patch > > > Currently, HBase is one of the mechanisms to collect and store statistics. I > have never come across anyone using it. The FileSystem-based collection mechanism > has been the default for a few releases and is working well. We shall remove the > HBase stats collector. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10752) Revert HIVE-5193
[ https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940101#comment-14940101 ] Daniel Dai commented on HIVE-10752: --- This should be committed to branch-1 as well. Also created HIVE-12006 to redo it in the right way. > Revert HIVE-5193 > > > Key: HIVE-10752 > URL: https://issues.apache.org/jira/browse/HIVE-10752 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Affects Versions: 1.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-10752.patch > > > Revert HIVE-5193 since it breaks pig+hcatalog. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12006) Enable Columnar Pushdown for RC/ORC File for HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-12006. --- Resolution: Duplicate Sorry, I didn't realize there is one already. Sure, I will take a look. > Enable Columnar Pushdown for RC/ORC File for HCatLoader > --- > > Key: HIVE-12006 > URL: https://issues.apache.org/jira/browse/HIVE-12006 > Project: Hive > Issue Type: Improvement > Components: HCatalog >Affects Versions: 1.2.1 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.0.0 > > > This was initially enabled by HIVE-5193. However, HIVE-10752 reverted it since > there was an issue in the original implementation. > We shall fix the issue and re-enable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11720) Allow HiveServer2 to set custom http request/response header size
[ https://issues.apache.org/jira/browse/HIVE-11720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11720: Attachment: HIVE-11720.4.patch [~thejas] It is indeed needed. Added the change. > Allow HiveServer2 to set custom http request/response header size > - > > Key: HIVE-11720 > URL: https://issues.apache.org/jira/browse/HIVE-11720 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-11720.1.patch, HIVE-11720.2.patch, > HIVE-11720.3.patch, HIVE-11720.4.patch > > > In HTTP transport mode, authentication information is sent over as part of > HTTP headers. Sometimes (observed when Kerberos is used) the default buffer > size for the headers is not enough, resulting in an HTTP 413 FULL head error. > We can expose those as customizable params. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11972) [Refactor] Improve determination of dynamic partitioning columns in FileSink Operator
[ https://issues.apache.org/jira/browse/HIVE-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940137#comment-14940137 ] Prasanth Jayachandran commented on HIVE-11972: -- Looks much cleaner now. LGTM, +1 > [Refactor] Improve determination of dynamic partitioning columns in FileSink > Operator > - > > Key: HIVE-11972 > URL: https://issues.apache.org/jira/browse/HIVE-11972 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-11972.2.patch, HIVE-11972.3.patch, > HIVE-11972.4.patch, HIVE-11972.patch > > > Currently it uses column names to locate DP columns, which is brittle since > column names may change during planning and optimization phases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12005) Remove hbase based stats collection mechanism
[ https://issues.apache.org/jira/browse/HIVE-12005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940155#comment-14940155 ] Ashutosh Chauhan commented on HIVE-12005: - Yup, I indeed plan to remove JDBC as well. > Remove hbase based stats collection mechanism > - > > Key: HIVE-12005 > URL: https://issues.apache.org/jira/browse/HIVE-12005 > Project: Hive > Issue Type: Task > Components: Statistics >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12005.patch > > > Currently, HBase is one of the mechanisms to collect and store statistics. I > have never come across anyone using it. The FileSystem-based collection mechanism > has been the default for a few releases and is working well. We shall remove the > HBase stats collector. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12007) Hive LDAP Authenticator should allow just Domain without baseDN (for AD)
[ https://issues.apache.org/jira/browse/HIVE-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12007: - Attachment: HIVE-12007.patch > Hive LDAP Authenticator should allow just Domain without baseDN (for AD) > > > Key: HIVE-12007 > URL: https://issues.apache.org/jira/browse/HIVE-12007 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-12007.patch > > > When the baseDN is not configured but only the Domain has been set in > hive-site.xml, LDAP Atn provider cannot locate the user in the directory. > Authentication fails in such cases. This is a change from the prior > implementation where the auth request succeeds based on being able to bind to > the directory. This has been called out in the design doc in HIVE-7193. > But we should allow this for now for backward compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12007) Hive LDAP Authenticator should allow just Domain without baseDN (for AD)
[ https://issues.apache.org/jira/browse/HIVE-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940241#comment-14940241 ] Szehon Ho commented on HIVE-12007: -- Backward compatibility is important, +1 > Hive LDAP Authenticator should allow just Domain without baseDN (for AD) > > > Key: HIVE-12007 > URL: https://issues.apache.org/jira/browse/HIVE-12007 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-12007.patch > > > When the baseDN is not configured but only the Domain has been set in > hive-site.xml, LDAP Atn provider cannot locate the user in the directory. > Authentication fails in such cases. This is a change from the prior > implementation where the auth request succeeds based on being able to bind to > the directory. This has been called out in the design doc in HIVE-7193. > But we should allow this for now for backward compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12007) Hive LDAP Authenticator should allow just Domain without baseDN (for AD)
[ https://issues.apache.org/jira/browse/HIVE-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940267#comment-14940267 ] Naveen Gangam commented on HIVE-12007: -- code posted for review at https://reviews.apache.org/r/38936/ > Hive LDAP Authenticator should allow just Domain without baseDN (for AD) > > > Key: HIVE-12007 > URL: https://issues.apache.org/jira/browse/HIVE-12007 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-12007.patch > > > When the baseDN is not configured but only the Domain has been set in > hive-site.xml, LDAP Atn provider cannot locate the user in the directory. > Authentication fails in such cases. This is a change from the prior > implementation where the auth request succeeds based on being able to bind to > the directory. This has been called out in the design doc in HIVE-7193. > But we should allow this for now for backward compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
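One plausible way an LDAP authenticator can work from a bare domain without a configured baseDN is the common Active Directory convention of deriving the search base from the DNS domain. This is only a sketch of that convention, not necessarily what the HIVE-12007 patch does; the function name is invented for illustration.

```python
def domain_to_base_dn(domain):
    """Derive an LDAP search base from a DNS domain, e.g.
    'ad.example.com' -> 'dc=ad,dc=example,dc=com' (a common
    Active Directory convention)."""
    return ",".join("dc=" + part for part in domain.split(".") if part)
```

A provider using this rule could fall back to `domain_to_base_dn(domain)` whenever only the domain is set in hive-site.xml, preserving the prior bind-based behavior for existing deployments.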
[jira] [Commented] (HIVE-12006) Enable Columnar Pushdown for RC/ORC File for HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940108#comment-14940108 ] Aihua Xu commented on HIVE-12006: - Hi Daniel, I already created the task HIVE-10755 for it and have the patch available. Can you take a look at that? > Enable Columnar Pushdown for RC/ORC File for HCatLoader > --- > > Key: HIVE-12006 > URL: https://issues.apache.org/jira/browse/HIVE-12006 > Project: Hive > Issue Type: Improvement > Components: HCatalog >Affects Versions: 1.2.1 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.0.0 > > > This was initially enabled by HIVE-5193. However, HIVE-10752 reverted it since > there was an issue in the original implementation. > We shall fix the issue and re-enable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11896) CBO: Calcite Operator To Hive Operator (Calcite Return Path): deal with hive default partition when inserting data
[ https://issues.apache.org/jira/browse/HIVE-11896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-11896. - Resolution: Fixed Assignee: Ashutosh Chauhan (was: Pengcheng Xiong) This is fixed via HIVE-11972 > CBO: Calcite Operator To Hive Operator (Calcite Return Path): deal with hive > default partition when inserting data > -- > > Key: HIVE-11896 > URL: https://issues.apache.org/jira/browse/HIVE-11896 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Ashutosh Chauhan > > To repro, run dynpart_sort_opt_vectorization.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11896) CBO: Calcite Operator To Hive Operator (Calcite Return Path): deal with hive default partition when inserting data
[ https://issues.apache.org/jira/browse/HIVE-11896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-11896: Fix Version/s: 2.0.0 > CBO: Calcite Operator To Hive Operator (Calcite Return Path): deal with hive > default partition when inserting data > -- > > Key: HIVE-11896 > URL: https://issues.apache.org/jira/browse/HIVE-11896 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Ashutosh Chauhan > Fix For: 2.0.0 > > > To repro, run dynpart_sort_opt_vectorization.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10752) Revert HIVE-5193
[ https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940106#comment-14940106 ] Daniel Dai commented on HIVE-10752: --- Committed to branch-1. > Revert HIVE-5193 > > > Key: HIVE-10752 > URL: https://issues.apache.org/jira/browse/HIVE-10752 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Affects Versions: 1.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-10752.patch > > > Revert HIVE-5193 since it breaks pig+hcatalog. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12006) Enable Columnar Pushdown for RC/ORC File for HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940125#comment-14940125 ] Aihua Xu commented on HIVE-12006: - Thanks. I have had that patch for a while, but haven't gotten it committed. I'd appreciate it if you can review it. :) > Enable Columnar Pushdown for RC/ORC File for HCatLoader > --- > > Key: HIVE-12006 > URL: https://issues.apache.org/jira/browse/HIVE-12006 > Project: Hive > Issue Type: Improvement > Components: HCatalog >Affects Versions: 1.2.1 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.0.0 > > > This was initially enabled by HIVE-5193. However, HIVE-10752 reverted it since > there was an issue in the original implementation. > We shall fix the issue and re-enable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12004) SDPO doesnt set colExprMap correctly on new RS
[ https://issues.apache.org/jira/browse/HIVE-12004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940144#comment-14940144 ] Prasanth Jayachandran commented on HIVE-12004: -- LGTM, +1. Pending tests > SDPO doesnt set colExprMap correctly on new RS > -- > > Key: HIVE-12004 > URL: https://issues.apache.org/jira/browse/HIVE-12004 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 1.2.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12004.patch > > > As a result plan gets into a bad state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11980) Follow up on HIVE-11696, exception is thrown from CTAS from the table with table-level serde is Parquet while partition-level serde is JSON
[ https://issues.apache.org/jira/browse/HIVE-11980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940187#comment-14940187 ] Szehon Ho commented on HIVE-11980: -- Looks simple, +1 > Follow up on HIVE-11696, exception is thrown from CTAS from the table with > table-level serde is Parquet while partition-level serde is JSON > --- > > Key: HIVE-11980 > URL: https://issues.apache.org/jira/browse/HIVE-11980 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-11980.patch > > > When we create a new table via CTAS from a table whose table-level serde is > Parquet and whose partition-level serde is JSON, the following > exception is currently thrown if there are struct fields. > Apparently, getStructFieldsDataAsList() also needs to handle the case of List > in addition to ArrayWritable, similar to getStructFieldData. > {noformat} > Caused by: java.lang.UnsupportedOperationException: Cannot inspect > java.util.ArrayList > at > org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldsDataAsList(ArrayWritableObjectInspector.java:172) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:354) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:257) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:241) > at > org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:720) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) > at > 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162) > at > org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940183#comment-14940183 ] Ratandeep Ratti commented on HIVE-11878: Hi [~jdere] I got some time to look into this today. I incorporated your suggestion where I create a fresh classloader when a new session is created. I use, as parent, the thread context classloader for the freshly created session classloader (see RB: https://reviews.apache.org/r/38663/). I have some doubts about using the thread context classloader as the parent. This does not seem to provide clean isolation of jars/resources between different sessions. Case in point: a thread context classloader could be a previous session's classloader. This can happen when the same thread was used to work on a previous session, and is now being used to work on the current session. The thread context classloader could contain a different implementation of the same class also present in the session classloader. Do you see this as a problem? Another potential problem I'm thinking about -- which is present in the proposed approach (see RB) -- is that in HiveServer2 any worker thread can serve any request by mapping it to a persistent session. Couldn't this lead to a situation where, for a specific session, the session-specific classloader (conf.getClassLoader()) and the thread context classloader end up being different? Say we have two worker threads t1 and t2. The very first query is handled by t1, where a fresh session s1 is created along with a fresh classloader c1, which is set as the session-specific classloader and the thread context classloader. The next query for the same session is handled by t2. I guess since it is the same session s1, we do not create a fresh classloader. The session-specific classloader is c1, but since it is a different thread and no classloader has been set on it, the thread will have the system classloader as its context classloader. 
Couldn't this cause potential CNF exceptions? If I understood correctly, this problem also exists in the current implementation, doesn't it? > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassloader is created, Hive > sets that as the current ThreadContextClassloader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent, and *u2* becomes the new > ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). 
> Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch
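The delegation problem described in the qtest scenario can be avoided by not chaining loaders at all. A hypothetical sketch (not the committed patch): keep one session-level `URLClassLoader` and append each registered jar's URL to it, so every UDF class is defined by the same loader and can resolve classes from all registered jars.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch, not the committed patch: the CNF scenario above comes
// from creating a new URLClassLoader per registered jar, so a class defined
// by an earlier loader (u1) can never see jars registered later (j2). One
// alternative is a single session-level loader that jars are appended to.
public class SessionClassLoader extends URLClassLoader {
    public SessionClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    // URLClassLoader.addURL is protected; expose it so registering a jar can
    // append to the existing loader instead of wrapping it in a new one.
    public void addJar(URL jarUrl) {
        addURL(jarUrl);
    }
}
```

With this shape, a class like *c1* is defined by the same loader that also sees *j2*, so resolving *c2* succeeds regardless of registration order.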
[jira] [Updated] (HIVE-11914) When transactions gets a heartbeat, it doesn't update the lock heartbeat.
[ https://issues.apache.org/jira/browse/HIVE-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11914: -- Component/s: HCatalog > When transactions gets a heartbeat, it doesn't update the lock heartbeat. > - > > Key: HIVE-11914 > URL: https://issues.apache.org/jira/browse/HIVE-11914 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.0.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the > associated locks. This makes SHOW LOCKS confusing/misleading. > This is especially visible in Streaming API use cases which use > TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
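The fix direction described above can be sketched as issuing two updates per heartbeat instead of one. This is a hedged illustration, not the actual TxnHandler code; the table and column names (TXNS, TXN_LAST_HEARTBEAT, HIVE_LOCKS, HL_LAST_HEARTBEAT, HL_TXNID) follow the metastore transaction schema but should be treated as illustrative.

```java
import java.util.Arrays;
import java.util.List;

// Hedged sketch of the fix direction, not actual TxnHandler code: a txn
// heartbeat should refresh the heartbeat timestamp on the transaction row
// AND on its lock rows, so SHOW LOCKS stays accurate.
public class TxnHeartbeatSql {
    public static List<String> heartbeatStatements(long txnId, long nowMillis) {
        return Arrays.asList(
            "UPDATE TXNS SET TXN_LAST_HEARTBEAT = " + nowMillis
                + " WHERE TXN_ID = " + txnId,
            // The piece this issue reports as missing: the txn's locks.
            "UPDATE HIVE_LOCKS SET HL_LAST_HEARTBEAT = " + nowMillis
                + " WHERE HL_TXNID = " + txnId);
    }
}
```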
[jira] [Commented] (HIVE-11914) When transactions gets a heartbeat, it doesn't update the lock heartbeat.
[ https://issues.apache.org/jira/browse/HIVE-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940380#comment-14940380 ] Eugene Koifman commented on HIVE-11914: --- this should have a test in TestStreaming which is easier after HIVE-11983 is committed > When transactions gets a heartbeat, it doesn't update the lock heartbeat. > - > > Key: HIVE-11914 > URL: https://issues.apache.org/jira/browse/HIVE-11914 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.0.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the > associated locks. This makes SHOW LOCKS confusing/misleading. > This is especially visible in Streaming API use cases which use > TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12009) Add IMetastoreClient.heartbeat(long[] lockIds)
[ https://issues.apache.org/jira/browse/HIVE-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman resolved HIVE-12009. --- Resolution: Won't Fix > Add IMetastoreClient.heartbeat(long[] lockIds) > -- > > Key: HIVE-12009 > URL: https://issues.apache.org/jira/browse/HIVE-12009 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > The current API only allows heartbeating 1 lock ID (external ID) at a time. > For multi-statement txns we can have multiple locks per txn and should be > able to do it in 1 call. > Used from DbTxnManager.heartbeat() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12009) Add IMetastoreClient.heartbeat(long[] lockIds)
[ https://issues.apache.org/jira/browse/HIVE-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940386#comment-14940386 ] Eugene Koifman commented on HIVE-12009: --- Actually, this should not be necessary. The only time you can have > 1 'ext' lock is when there is a transaction open, so it's better to heartbeat the txn > Add IMetastoreClient.heartbeat(long[] lockIds) > -- > > Key: HIVE-12009 > URL: https://issues.apache.org/jira/browse/HIVE-12009 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > The current API only allows heartbeating 1 lock ID (external ID) at a time. > For multi-statement txns we can have multiple locks per txn and should be > able to do it in 1 call. > Used from DbTxnManager.heartbeat() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
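The Won't Fix reasoning above reduces to a simple decision, sketched below with a hypothetical helper (not DbTxnManager's actual code): if a transaction is open, heartbeat the transaction, which covers all of its locks in one call; otherwise there can be at most one external lock, so no batch API is needed.

```java
// Hypothetical helper illustrating the Won't Fix reasoning, not actual
// DbTxnManager code: more than one external lock implies an open txn, and
// heartbeating the txn covers all of its locks, so a heartbeat(long[])
// batch API buys nothing.
public class HeartbeatTarget {
    public static String choose(long txnId, long[] lockIds) {
        if (txnId > 0) {
            return "txn:" + txnId;        // one call covers every lock
        }
        if (lockIds.length != 1) {
            throw new IllegalStateException(
                "multiple locks without an open txn should not happen");
        }
        return "lock:" + lockIds[0];
    }
}
```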
[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled
[ https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9695: -- Issue Type: Improvement (was: Bug) > Redundant filter operator in reducer Vertex when CBO is disabled > > > Key: HIVE-9695 > URL: https://issues.apache.org/jira/browse/HIVE-9695 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.0.0 >Reporter: Mostafa Mokhtar >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-9695.patch > > > There is a redundant filter operator in reducer Vertex when CBO is disabled. > Query > {code} > select > ss_item_sk, ss_ticket_number, ss_store_sk > from > store_sales a, store_returns b, store > where > a.ss_item_sk = b.sr_item_sk > and a.ss_ticket_number = b.sr_ticket_number > and ss_sold_date_sk between 2450816 and 2451500 > and sr_returned_date_sk between 2450816 and 2451500 > and s_store_sk = ss_store_sk; > {code} > Plan snippet > {code} > Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE > Column stats: COMPLETE > Filter Operator > predicate: (_col1 = _col27) and (_col8 = _col34)) and > _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) > and (_col49 = _col6)) (type: boolean) > {code} > Full plan with CBO disabled > {code} > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 > (SIMPLE_EDGE) > DagName: mmokhtar_20150214182626_ad6820c7-b667-4652-ab25-cb60deed1a6d:13 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: b > filterExpr: ((sr_item_sk is not null and sr_ticket_number > is not null) and sr_returned_date_sk BETWEEN 2450816 AND 2451500) (type: > boolean) > Statistics: Num rows: 2370038095 Data size: 170506118656 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (sr_item_sk is not null and 
sr_ticket_number > is not null) (type: boolean) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: sr_item_sk (type: int), > sr_ticket_number (type: int) > sort order: ++ > Map-reduce partition columns: sr_item_sk (type: int), > sr_ticket_number (type: int) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > value expressions: sr_returned_date_sk (type: int) > Execution mode: vectorized > Map 3 > Map Operator Tree: > TableScan > alias: store > filterExpr: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 3256276 Basic stats: > COMPLETE Column stats: COMPLETE > Filter Operator > predicate: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: s_store_sk (type: int) > sort order: + > Map-reduce partition columns: s_store_sk (type: int) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Map 4 > Map Operator Tree: > TableScan > alias: a > filterExpr: (((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) and ss_sold_date_sk BETWEEN 2450816 > AND 2451500) (type: boolean) > Statistics: Num rows: 28878719387 Data size: 2405805439460 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) (type: boolean) > Statistics: Num rows: 8405840828 Data size: 110101408700 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: ss_item_sk (type: int), > ss_ticket_number (type:
[jira] [Commented] (HIVE-12003) Hive Streaming API : Add check to ensure table is transactional
[ https://issues.apache.org/jira/browse/HIVE-12003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940402#comment-14940402 ] Roshan Naik commented on HIVE-12003: I am revising this patch to exclude the -w option > Hive Streaming API : Add check to ensure table is transactional > --- > > Key: HIVE-12003 > URL: https://issues.apache.org/jira/browse/HIVE-12003 > Project: Hive > Issue Type: Bug > Components: HCatalog, Hive, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Attachments: HIVE-12003.patch > > > Check if TBLPROPERTIES ('transactional'='true') is set when opening a connection -- This message was sent by Atlassian JIRA (v6.3.4#6332)
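The check this issue asks for amounts to reading the table's parameters map before opening a streaming connection. A minimal sketch, assuming the endpoint can fetch the table's TBLPROPERTIES as a map (not the actual HCatalog streaming code):

```java
import java.util.Map;

// Minimal sketch of the requested check, assuming the streaming endpoint can
// read the table's parameters map; not the actual HCatalog streaming code.
public class TransactionalCheck {
    public static boolean isTransactional(Map<String, String> tblProps) {
        // TBLPROPERTIES ('transactional'='true') marks a table ACID-capable.
        return tblProps != null
            && "true".equalsIgnoreCase(tblProps.get("transactional"));
    }
}
```

A connection open would then fail fast with a clear error instead of surfacing a confusing failure later in the write path.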
[jira] [Updated] (HIVE-11969) start Tez session in background when starting CLI
[ https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11969: Attachment: HIVE-11969.02.patch Updated > start Tez session in background when starting CLI > - > > Key: HIVE-11969 > URL: https://issues.apache.org/jira/browse/HIVE-11969 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, > HIVE-11969.patch > > > Tez session spins up AM, which can cause delays, esp. if the cluster is very > busy. > This can be done in background, so the AM might get started while the user is > running local commands and doing other things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
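The background-start idea above can be sketched with a single-thread executor: kick off the expensive AM launch at CLI startup and only block on the future when the first query actually needs the session. Class and method names here are illustrative, not Hive's actual API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hedged sketch of the idea (names are illustrative, not Hive's classes):
// start the Tez session on a background thread at CLI startup; the first
// query waits on the Future only if the AM is not up yet.
public class BackgroundTezSession {
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private Future<String> session;   // stand-in for a real TezSession handle

    public void startAsync() {
        // Placeholder for the expensive AM launch.
        session = pool.submit(() -> "session-open");
    }

    public String awaitSession() throws Exception {
        try {
            return session.get();     // blocks only until the AM is ready
        } finally {
            pool.shutdown();
        }
    }
}
```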
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: HIVE-11642.16.patch Update patch to avoid conflicts > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.15.patch, HIVE-11642.16.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11914) When transactions gets a heartbeat, it doesn't update the lock heartbeat.
[ https://issues.apache.org/jira/browse/HIVE-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11914: -- Description: TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the associated locks. This makes SHOW LOCKS confusing/misleading. This is especially visible in Streaming API use cases which use TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) was:TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the associated locks. This makes SHOW LOCKS confusing/misleading. > When transactions gets a heartbeat, it doesn't update the lock heartbeat. > - > > Key: HIVE-11914 > URL: https://issues.apache.org/jira/browse/HIVE-11914 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the > associated locks. This makes SHOW LOCKS confusing/misleading. > This is especially visible in Streaming API use cases which use > TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled
[ https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9695: -- Affects Version/s: (was: 0.14.0) 2.0.0 > Redundant filter operator in reducer Vertex when CBO is disabled > > > Key: HIVE-9695 > URL: https://issues.apache.org/jira/browse/HIVE-9695 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0 >Reporter: Mostafa Mokhtar >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-9695.patch > > > There is a redundant filter operator in reducer Vertex when CBO is disabled. > Query > {code} > select > ss_item_sk, ss_ticket_number, ss_store_sk > from > store_sales a, store_returns b, store > where > a.ss_item_sk = b.sr_item_sk > and a.ss_ticket_number = b.sr_ticket_number > and ss_sold_date_sk between 2450816 and 2451500 > and sr_returned_date_sk between 2450816 and 2451500 > and s_store_sk = ss_store_sk; > {code} > Plan snippet > {code} > Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE > Column stats: COMPLETE > Filter Operator > predicate: (_col1 = _col27) and (_col8 = _col34)) and > _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) > and (_col49 = _col6)) (type: boolean) > {code} > Full plan with CBO disabled > {code} > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 > (SIMPLE_EDGE) > DagName: mmokhtar_20150214182626_ad6820c7-b667-4652-ab25-cb60deed1a6d:13 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: b > filterExpr: ((sr_item_sk is not null and sr_ticket_number > is not null) and sr_returned_date_sk BETWEEN 2450816 AND 2451500) (type: > boolean) > Statistics: Num rows: 2370038095 Data size: 170506118656 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (sr_item_sk is not null and 
sr_ticket_number > is not null) (type: boolean) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: sr_item_sk (type: int), > sr_ticket_number (type: int) > sort order: ++ > Map-reduce partition columns: sr_item_sk (type: int), > sr_ticket_number (type: int) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > value expressions: sr_returned_date_sk (type: int) > Execution mode: vectorized > Map 3 > Map Operator Tree: > TableScan > alias: store > filterExpr: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 3256276 Basic stats: > COMPLETE Column stats: COMPLETE > Filter Operator > predicate: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: s_store_sk (type: int) > sort order: + > Map-reduce partition columns: s_store_sk (type: int) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Map 4 > Map Operator Tree: > TableScan > alias: a > filterExpr: (((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) and ss_sold_date_sk BETWEEN 2450816 > AND 2451500) (type: boolean) > Statistics: Num rows: 28878719387 Data size: 2405805439460 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) (type: boolean) > Statistics: Num rows: 8405840828 Data size: 110101408700 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: ss_item_sk (type: int), >
[jira] [Updated] (HIVE-11720) Allow HiveServer2 to set custom http request/response header size
[ https://issues.apache.org/jira/browse/HIVE-11720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11720: Attachment: HIVE-11720.4.patch > Allow HiveServer2 to set custom http request/response header size > - > > Key: HIVE-11720 > URL: https://issues.apache.org/jira/browse/HIVE-11720 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-11720.1.patch, HIVE-11720.2.patch, > HIVE-11720.3.patch, HIVE-11720.4.patch, HIVE-11720.4.patch > > > In HTTP transport mode, authentication information is sent over as part of > HTTP headers. Sometimes (observed when Kerberos is used) the default buffer > size for the headers is not enough, resulting in an HTTP 413 FULL head error. > We can expose those as customizable params. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11898) support default partition in metastoredirectsql
[ https://issues.apache.org/jira/browse/HIVE-11898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11898: Description: Right now, direct SQL intentionally skips the processing for default partition for PPD case; the SQL query fails and we fall back to JDO. Add support for default partition based on the same rules as JDO (don't return it) > support default partition in metastoredirectsql > --- > > Key: HIVE-11898 > URL: https://issues.apache.org/jira/browse/HIVE-11898 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11898.01.patch, HIVE-11898.02.patch, > HIVE-11898.patch > > > Right now, direct SQL intentionally skips the processing for default > partition for PPD case; the SQL query fails and we fall back to JDO. Add > support for default partition based on the same rules as JDO (don't return it) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
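The "same rules as JDO (don't return it)" behavior can be sketched as a post-filter on partition names. This is a hedged illustration of the intended behavior, not MetaStoreDirectSql itself; "__HIVE_DEFAULT_PARTITION__" is Hive's default partition name (hive.exec.default.partition.name).

```java
import java.util.List;
import java.util.stream.Collectors;

// Hedged sketch of the intended behavior, not MetaStoreDirectSql itself:
// drop the default partition from direct-SQL results the way JDO does.
public class DefaultPartitionFilter {
    static final String DEFAULT_NAME = "__HIVE_DEFAULT_PARTITION__";

    public static List<String> dropDefault(List<String> partNames) {
        return partNames.stream()
            .filter(p -> !p.contains(DEFAULT_NAME))
            .collect(Collectors.toList());
    }
}
```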
[jira] [Updated] (HIVE-11983) Hive streaming API uses incorrect logic to assign buckets to incoming records
[ https://issues.apache.org/jira/browse/HIVE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated HIVE-11983: --- Attachment: HIVE-11983.4.patch Uploading v4 patch; found that the patch was not applying due to the above noted -w option. This patch is created without the -w option and applies cleanly using 'patch -p0' and 'git apply -p 0'. RB, however, still doesn't like it. This patch is on top of commit SHA 24988f7 (HIVE-11972) > Hive streaming API uses incorrect logic to assign buckets to incoming records > - > > Key: HIVE-11983 > URL: https://issues.apache.org/jira/browse/HIVE-11983 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Labels: streaming, streaming_api > Attachments: HIVE-11983.3.patch, HIVE-11983.4.patch, HIVE-11983.patch > > > The Streaming API tries to distribute records evenly into buckets. > All records in every Transaction that is part of a TransactionBatch go to the > same bucket, and a new bucket number is chosen for each TransactionBatch. > Fix: API needs to hash each record to determine which bucket it belongs to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
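The fix named in the description, hashing each record to pick its bucket, can be sketched as below. This is a hedged illustration, not the streaming API's actual code: the key's hash code stands in for Hive's bucketing hash, and the standard (hash & Integer.MAX_VALUE) % numBuckets rule maps it to a bucket, so records spread across buckets instead of a whole TransactionBatch landing in one.

```java
// Hedged sketch of per-record hash-based bucket assignment, not the
// streaming API's actual code. Masking the sign bit keeps negative hash
// codes mapping to a valid bucket index.
public class BucketAssigner {
    public static int bucketFor(Object bucketKey, int numBuckets) {
        int hash = (bucketKey == null) ? 0 : bucketKey.hashCode();
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }
}
```

The important property is determinism: the same bucketing key always lands in the same bucket, which is what table bucketing requires.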
[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled
[ https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9695: -- Attachment: HIVE-9695.patch > Redundant filter operator in reducer Vertex when CBO is disabled > > > Key: HIVE-9695 > URL: https://issues.apache.org/jira/browse/HIVE-9695 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 0.14.0 >Reporter: Mostafa Mokhtar >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-9695.patch > > > There is a redundant filter operator in reducer Vertex when CBO is disabled. > Query > {code} > select > ss_item_sk, ss_ticket_number, ss_store_sk > from > store_sales a, store_returns b, store > where > a.ss_item_sk = b.sr_item_sk > and a.ss_ticket_number = b.sr_ticket_number > and ss_sold_date_sk between 2450816 and 2451500 > and sr_returned_date_sk between 2450816 and 2451500 > and s_store_sk = ss_store_sk; > {code} > Plan snippet > {code} > Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE > Column stats: COMPLETE > Filter Operator > predicate: (_col1 = _col27) and (_col8 = _col34)) and > _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) > and (_col49 = _col6)) (type: boolean) > {code} > Full plan with CBO disabled > {code} > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 > (SIMPLE_EDGE) > DagName: mmokhtar_20150214182626_ad6820c7-b667-4652-ab25-cb60deed1a6d:13 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: b > filterExpr: ((sr_item_sk is not null and sr_ticket_number > is not null) and sr_returned_date_sk BETWEEN 2450816 AND 2451500) (type: > boolean) > Statistics: Num rows: 2370038095 Data size: 170506118656 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (sr_item_sk is not null and sr_ticket_number > 
is not null) (type: boolean) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: sr_item_sk (type: int), > sr_ticket_number (type: int) > sort order: ++ > Map-reduce partition columns: sr_item_sk (type: int), > sr_ticket_number (type: int) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > value expressions: sr_returned_date_sk (type: int) > Execution mode: vectorized > Map 3 > Map Operator Tree: > TableScan > alias: store > filterExpr: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 3256276 Basic stats: > COMPLETE Column stats: COMPLETE > Filter Operator > predicate: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: s_store_sk (type: int) > sort order: + > Map-reduce partition columns: s_store_sk (type: int) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Map 4 > Map Operator Tree: > TableScan > alias: a > filterExpr: (((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) and ss_sold_date_sk BETWEEN 2450816 > AND 2451500) (type: boolean) > Statistics: Num rows: 28878719387 Data size: 2405805439460 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) (type: boolean) > Statistics: Num rows: 8405840828 Data size: 110101408700 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: ss_item_sk (type: int), > ss_ticket_number (type: int) >
[jira] [Updated] (HIVE-11969) start Tez session in background when starting CLI
[ https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11969: Attachment: HIVE-11969.02.patch > start Tez session in background when starting CLI > - > > Key: HIVE-11969 > URL: https://issues.apache.org/jira/browse/HIVE-11969 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, > HIVE-11969.patch > > > Tez session spins up AM, which can cause delays, esp. if the cluster is very > busy. > This can be done in background, so the AM might get started while the user is > running local commands and doing other things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11969) start Tez session in background when starting CLI
[ https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11969: Attachment: (was: HIVE-11969.02.patch) > start Tez session in background when starting CLI > - > > Key: HIVE-11969 > URL: https://issues.apache.org/jira/browse/HIVE-11969 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, > HIVE-11969.patch > > > Tez session spins up AM, which can cause delays, esp. if the cluster is very > busy. > This can be done in background, so the AM might get started while the user is > running local commands and doing other things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11914) When transactions gets a heartbeat, it doesn't update the lock heartbeat.
[ https://issues.apache.org/jira/browse/HIVE-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11914: -- Attachment: HIVE-11914.patch prelim patch > When transactions gets a heartbeat, it doesn't update the lock heartbeat. > - > > Key: HIVE-11914 > URL: https://issues.apache.org/jira/browse/HIVE-11914 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.0.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-11914.patch > > > TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the > associated locks. This makes SHOW LOCKS confusing/misleading. > This is especially visible in Streaming API use cases which use > TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11517) Vectorized auto_smb_mapjoin_14.q produces different results
[ https://issues.apache.org/jira/browse/HIVE-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11517: Fix Version/s: 1.0.0 1.2.0 > Vectorized auto_smb_mapjoin_14.q produces different results > --- > > Key: HIVE-11517 > URL: https://issues.apache.org/jira/browse/HIVE-11517 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 1.0.0, 1.2.0, 1.3.0, 2.0.0 > > Attachments: HIVE-11517.01.patch, HIVE-11517.02.patch > > > Converted Q file to use ORC and turned on vectorization. > The query: > {code} > select count(*) from ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 > b on a.key = b.key > ) subq1 > {code} > produces 10 instead of 22. > The query: > {code} > select src1.key, src1.cnt1, src2.cnt1 from > ( > select key, count(*) as cnt1 from > ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join > tbl2 b on a.key = b.key > ) subq1 group by key > ) src1 > join > ( > select key, count(*) as cnt1 from > ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join > tbl2 b on a.key = b.key > ) subq2 group by key > ) src2 > {code} > produces: > {code} > 0 3 3 > 2 1 1 > 4 1 1 > 5 3 3 > 8 1 1 > 9 1 1 > {code} > instead of: > {code} > 0 9 9 > 2 1 1 > 4 1 1 > 5 9 9 > 8 1 1 > 9 1 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
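The expected counts in the description follow from join cardinality: a key occurring m times in tbl1 and n times in tbl2 yields m * n joined rows, so a key with 3 matches on each side should contribute 9 rows per group, not 3 (and the totals 9+1+1+9+1+1 = 22 versus the buggy 3+1+1+3+1+1 = 10). A trivial check of that arithmetic:

```java
// Trivial arithmetic behind the expected output: a join key occurring m
// times on the left and n times on the right produces m * n joined rows,
// which is why the correct per-key counts are 9 (3 x 3), not 3.
public class JoinCardinality {
    public static int joinedRows(int leftOccurrences, int rightOccurrences) {
        return leftOccurrences * rightOccurrences;
    }
}
```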
[jira] [Commented] (HIVE-11517) Vectorized auto_smb_mapjoin_14.q produces different results
[ https://issues.apache.org/jira/browse/HIVE-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940454#comment-14940454 ] Matt McCline commented on HIVE-11517: - Added branch-1.0 and branch-1.2 > Vectorized auto_smb_mapjoin_14.q produces different results > --- > > Key: HIVE-11517 > URL: https://issues.apache.org/jira/browse/HIVE-11517 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 1.0.0, 1.2.0, 1.3.0, 2.0.0 > > Attachments: HIVE-11517.01.patch, HIVE-11517.02.patch > > > Converted Q file to use ORC and turned on vectorization. > The query: > {code} > select count(*) from ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 > b on a.key = b.key > ) subq1 > {code} > produces 10 instead of 22. > The query: > {code} > select src1.key, src1.cnt1, src2.cnt1 from > ( > select key, count(*) as cnt1 from > ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join > tbl2 b on a.key = b.key > ) subq1 group by key > ) src1 > join > ( > select key, count(*) as cnt1 from > ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join > tbl2 b on a.key = b.key > ) subq2 group by key > ) src2 > {code} > produces: > {code} > 0 3 3 > 2 1 1 > 4 1 1 > 5 3 3 > 8 1 1 > 9 1 1 > {code} > instead of: > {code} > 0 9 9 > 2 1 1 > 4 1 1 > 5 9 9 > 8 1 1 > 9 1 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11928) ORC footer section can also exceed protobuf message limit
[ https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940485#comment-14940485 ] Prasanth Jayachandran commented on HIVE-11928: -- The test ran successfully but the results are not posted due to "403 Forbidden" error. Copy pasting the results here from http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5485/consoleFull {code} {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764451/HIVE-11928.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9640 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5485/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5485/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5485/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12764451 - PreCommit-HIVE-TRUNK-Build 2015-10-01 13:16:51,702 ERROR JIRAService.postComment:176 Encountered error attempting to post comment to HIVE-11928 java.lang.RuntimeException: 403 Forbidden at org.apache.hive.ptest.execution.JIRAService.postComment(JIRAService.java:171) at org.apache.hive.ptest.execution.PTest.publishJiraComment(PTest.java:242) at org.apache.hive.ptest.execution.PTest.run(PTest.java:216) at org.apache.hive.ptest.api.server.TestExecutor.run(TestExecutor.java:120) {code} > ORC footer section can also exceed protobuf message limit > - > > Key: HIVE-11928 > URL: https://issues.apache.org/jira/browse/HIVE-11928 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jagruti Varia >Assignee: Prasanth Jayachandran > Attachments: HIVE-11928-branch-1.patch, HIVE-11928.1.patch, > HIVE-11928.1.patch, HIVE-11928.2.patch, HIVE-11928.2.patch, HIVE-11928.3.patch > > > Similar to HIVE-11592 but for orc footer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11976) Extend CBO rules to being able to apply rules only once on a given operator
[ https://issues.apache.org/jira/browse/HIVE-11976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940530#comment-14940530 ] Szehon Ho commented on HIVE-11976: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764445/HIVE-11976.01.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9625 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-orc_merge6.q-vector_outer_join0.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5484/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5484/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5484/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} > Extend CBO rules to being able to apply rules only once on a given operator > --- > > Key: HIVE-11976 > URL: https://issues.apache.org/jira/browse/HIVE-11976 > Project: Hive > Issue Type: New Feature > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11976.01.patch, HIVE-11976.patch > > > Create a way to bail out quickly from HepPlanner if the rule has been already > applied on a certain operator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11928) ORC footer section can also exceed protobuf message limit
[ https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11928: - Attachment: HIVE-11928-branch-1.patch Uploading new patch for branch-1 > ORC footer section can also exceed protobuf message limit > - > > Key: HIVE-11928 > URL: https://issues.apache.org/jira/browse/HIVE-11928 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jagruti Varia >Assignee: Prasanth Jayachandran > Attachments: HIVE-11928-branch-1.patch, HIVE-11928-branch-1.patch, > HIVE-11928.1.patch, HIVE-11928.1.patch, HIVE-11928.2.patch, > HIVE-11928.2.patch, HIVE-11928.3.patch > > > Similar to HIVE-11592 but for orc footer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
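For context, the limit in question is protobuf's default 64 MB cap on parsed message size; as in HIVE-11592, the usual remedy is to raise the limit (via `CodedInputStream.setSizeLimit`) before parsing a section whose serialized length is already known. The fragment below is only an illustrative sketch of computing such a limit, not the patch itself; the helper name and the 1 KB slack are assumptions:

```java
public class ProtobufLimitSketch {
    // protobuf's default message size limit: 64 MB.
    static final int DEFAULT_PROTOBUF_LIMIT = 64 << 20;

    // Pick the size limit to install before parsing a section (e.g. the ORC
    // footer) whose serialized length is known from the file postscript.
    // The 1 KB slack is an arbitrary illustrative cushion.
    static int limitFor(long sectionLength) {
        return (int) Math.max(sectionLength + 1024L, DEFAULT_PROTOBUF_LIMIT);
    }

    public static void main(String[] args) {
        System.out.println(limitFor(1_000L));       // small footer: default limit is enough
        System.out.println(limitFor(200_000_000L)); // oversized footer: limit raised past it
    }
}
```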
[jira] [Commented] (HIVE-11973) IN operator fails when the column type is DATE
[ https://issues.apache.org/jira/browse/HIVE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940528#comment-14940528 ] Szehon Ho commented on HIVE-11973: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764251/HIVE-11973.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9642 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5483/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5483/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5483/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} > IN operator fails when the column type is DATE > --- > > Key: HIVE-11973 > URL: https://issues.apache.org/jira/browse/HIVE-11973 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: sanjiv singh >Assignee: Yongzhi Chen > Attachments: HIVE-11973.1.patch > > > Test DDL: > {code} > CREATE TABLE `date_dim`( > `d_date_sk` int, > `d_date_id` string, > `d_date` date, > `d_current_week` string, > `d_current_month` string, > `d_current_quarter` string, > `d_current_year` string) ; > {code} > Hive query : > {code} > SELECT * > FROM date_dim > WHERE d_date IN ('2000-03-22','2001-03-22') ; > {code} > In 1.0.0, the above query fails with: > {code} > FAILED: SemanticException [Error 10014]: Line 1:180 Wrong arguments > ''2001-03-22'': The 
arguments for IN should be the same type! Types are: > {date IN (string, string)} > {code} > I added explicit casts to work around the error: > {code} > SELECT * > FROM date_dim > WHERE d_date IN (CAST('2000-03-22' AS DATE) , CAST('2001-03-22' AS DATE) > ) ; > {code} > But the equality comparison works without casting: > {code} > SELECT * > FROM date_dim > WHERE d_date = '2000-03-22' ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
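The behavior the reporter expects from IN is what equality already does: coerce the string literal to the column's DATE type before comparing. The sketch below is not Hive's implementation; it is a minimal stand-in (class and method names are hypothetical) that shows the coerce-then-compare semantics the explicit CAST workaround performs by hand:

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;

public class DateInSketch {
    // Illustrative stand-in (not Hive's code): evaluate "column IN (...)"
    // by coercing each 'yyyy-MM-dd' string literal to a DATE first, exactly
    // what CAST('2000-03-22' AS DATE) does explicitly in the workaround.
    static boolean dateIn(LocalDate column, List<String> literals) {
        return literals.stream()
                .map(LocalDate::parse)      // string literal -> DATE
                .anyMatch(column::equals);  // then compare like "="
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.parse("2000-03-22");
        System.out.println(dateIn(d, Arrays.asList("2000-03-22", "2001-03-22"))); // true
        System.out.println(dateIn(d, Arrays.asList("1999-01-01")));               // false
    }
}
```

With this coercion in the semantic analyzer, `d_date IN ('2000-03-22','2001-03-22')` would type-check the same way `d_date = '2000-03-22'` already does.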
[jira] [Commented] (HIVE-11983) Hive streaming API uses incorrect logic to assign buckets to incoming records
[ https://issues.apache.org/jira/browse/HIVE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940549#comment-14940549 ] Eugene Koifman commented on HIVE-11983: --- DelimitedInputWriter:215 - why did you change this to StringBuffer? AbstractRecordWriter:80 - why change how the class is loaded? StrictJsonWriter: the two constructors seem identical except for HiveConf. Could the first one use this(endPoint, null)? getObjectInspectorsForBucketedCols() seems exactly the same as in DelimitedInputWriter getBucketFields() - same as above The write(long, byte[]) methods on the two writers: one calls reorderFields(), the other does not. Is that intentional? TestStreaming: this has driver.run("set ") - Driver doesn't support the "set" command, so all of these are guaranteed to fail. > Hive streaming API uses incorrect logic to assign buckets to incoming records > - > > Key: HIVE-11983 > URL: https://issues.apache.org/jira/browse/HIVE-11983 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Labels: streaming, streaming_api > Attachments: HIVE-11983.3.patch, HIVE-11983.4.patch, HIVE-11983.patch > > > The Streaming API tries to distribute records evenly into buckets. > All records in every Transaction that is part of a TransactionBatch go to the > same bucket, and a new bucket number is chosen for each TransactionBatch. > Fix: API needs to hash each record to determine which bucket it belongs to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
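The proposed fix — hashing each record to pick its bucket — can be sketched as below. This is an illustrative sketch, not the patch code; the class and method names are hypothetical, and it assumes the record exposes an object whose `hashCode()` serves as the bucketing key:

```java
public class BucketHashSketch {
    // Sketch of the proposed fix: derive the bucket from the record's
    // bucketing key instead of reusing one bucket per TransactionBatch.
    // Masking with Integer.MAX_VALUE keeps the result non-negative even
    // when hashCode() returns Integer.MIN_VALUE (where Math.abs would fail).
    static int bucketFor(Object bucketKey, int numBuckets) {
        return (bucketKey.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        int b1 = bucketFor("customer-42", 8);
        int b2 = bucketFor("customer-42", 8);
        System.out.println(b1 == b2);          // same key always lands in the same bucket
        System.out.println(b1 >= 0 && b1 < 8); // bucket id stays in range
    }
}
```

Deriving the bucket from the key, rather than the batch, also keeps streamed rows consistent with how bucketed reads expect data to be distributed.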
[jira] [Updated] (HIVE-12011) unable to create temporary table using CTAS if regular table with that name already exists
[ https://issues.apache.org/jira/browse/HIVE-12011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12011: --- Attachment: HIVE-12011.01.patch > unable to create temporary table using CTAS if regular table with that name > already exists > -- > > Key: HIVE-12011 > URL: https://issues.apache.org/jira/browse/HIVE-12011 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12011.01.patch > > > CTAS temporary table query fails if regular table with the same name already > exists. > Steps to reproduce the issue: > {noformat} > hive> use dbtemptable; > OK > Time taken: 0.273 seconds > hive> create table a(i int); > OK > Time taken: 0.297 seconds > hive> create temporary table a(i int); > OK > Time taken: 0.165 seconds > hive> create table b(i int); > OK > Time taken: 0.212 seconds > hive> create temporary table b as select * from a; > FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: > Table already exists: dbtemptable.b > hive> create table c(i int); > OK > Time taken: 0.264 seconds > hive> create temporary table b as select * from c; > FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: > Table already exists: dbtemptable.b > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12012) select query on json table with map type column fails
[ https://issues.apache.org/jira/browse/HIVE-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-12012: -- Attachment: HIVE-12012.1.patch JsonSerDe seems to only support map values of type string. Attaching patch and test cases. [~sushanth], can you take a look? There do not seem to be many tests for JsonSerDe and I want to make sure I'm not breaking any behavior here. > select query on json table with map type column fails > - > > Key: HIVE-12012 > URL: https://issues.apache.org/jira/browse/HIVE-12012 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Jagruti Varia >Assignee: Jason Dere > Attachments: HIVE-12012.1.patch > > > select query on json table throws this error if table contains map type > column: > {noformat} > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > {noformat} > steps to reproduce the issue: > {noformat} > hive> create table c_complex(a array<string>,b map<string,int>) row format > serde 'org.apache.hive.hcatalog.data.JsonSerDe'; > OK > Time taken: 0.319 seconds > hive> insert into table c_complex select array('aaa'),map('aaa',1) from > studenttab10k limit 2; > Query ID = hrt_qa_20150826183232_47deb33a-19c0-4d2b-a92f-726659eb9413 > Total jobs = 1 > Launching Job 1 out of 1 > Status: Running (Executing on YARN cluster with App id > application_1440603993714_0010) > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED > KILLED > > Map 1 .. SUCCEEDED 1 100 0 > 0 > Reducer 2 .. 
SUCCEEDED 1 100 0 > 0 > > VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 11.75 s > > > Loading data to table default.c_complex > Table default.c_complex stats: [numFiles=1, numRows=2, totalSize=56, > rawDataSize=0] > OK > Time taken: 13.706 seconds > hive> select * from c_complex; > OK > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > Time taken: 0.115 seconds > hive> select count(*) from c_complex; > OK > 2 > Time taken: 0.205 seconds, Fetched: 1 row(s) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11786) Deprecate the use of redundant column in column stats related tables
[ https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940534#comment-14940534 ] Szehon Ho commented on HIVE-11786: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764455/HIVE-11786.2.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9640 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5487/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5487/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5487/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} > Deprecate the use of redundant column in colunm stats related tables > > > Key: HIVE-11786 > URL: https://issues.apache.org/jira/browse/HIVE-11786 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-11786.1.patch, HIVE-11786.1.patch, > HIVE-11786.2.patch, HIVE-11786.patch > > > The stats tables such as TAB_COL_STATS, PART_COL_STATS have redundant columns > such as DB_NAME, TABLE_NAME, PARTITION_NAME since these tables already have > foreign key like TBL_ID, or PART_ID referencing to TBLS or PARTITIONS. > These redundant columns violate database normalization rules and cause a lot > of inconvenience (sometimes difficult) in column stats related feature > implementation. 
For example, when renaming a table, we have to update > the TABLE_NAME column in these tables as well, which is unnecessary. > This JIRA first deprecates the use of these columns at the HMS code level. A > follow-up JIRA is to be opened to focus on the DB schema change and upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11928) ORC footer and metadata section can also exceed protobuf message limit
[ https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11928: - Summary: ORC footer and metadata section can also exceed protobuf message limit (was: ORC footer section can also exceed protobuf message limit) > ORC footer and metadata section can also exceed protobuf message limit > -- > > Key: HIVE-11928 > URL: https://issues.apache.org/jira/browse/HIVE-11928 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jagruti Varia >Assignee: Prasanth Jayachandran > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11928-branch-1.patch, HIVE-11928-branch-1.patch, > HIVE-11928.1.patch, HIVE-11928.1.patch, HIVE-11928.2.patch, > HIVE-11928.2.patch, HIVE-11928.3.patch > > > Similar to HIVE-11592 but for orc footer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11928) ORC footer section can also exceed protobuf message limit
[ https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940532#comment-14940532 ] Szehon Ho commented on HIVE-11928: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764451/HIVE-11928.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9640 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5485/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5485/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5485/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} > ORC footer section can also exceed protobuf message limit > - > > Key: HIVE-11928 > URL: https://issues.apache.org/jira/browse/HIVE-11928 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jagruti Varia >Assignee: Prasanth Jayachandran > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11928-branch-1.patch, HIVE-11928-branch-1.patch, > HIVE-11928.1.patch, HIVE-11928.1.patch, HIVE-11928.2.patch, > HIVE-11928.2.patch, HIVE-11928.3.patch > > > Similar to HIVE-11592 but for orc footer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11684) Implement limit pushdown through outer join in CBO
[ https://issues.apache.org/jira/browse/HIVE-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940533#comment-14940533 ] Szehon Ho commented on HIVE-11684: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764453/HIVE-11684.12.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9642 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5486/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5486/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5486/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} > Implement limit pushdown through outer join in CBO > -- > > Key: HIVE-11684 > URL: https://issues.apache.org/jira/browse/HIVE-11684 > Project: Hive > Issue Type: New Feature > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11684.01.patch, HIVE-11684.02.patch, > HIVE-11684.03.patch, HIVE-11684.04.patch, HIVE-11684.05.patch, > HIVE-11684.07.patch, HIVE-11684.08.patch, HIVE-11684.09.patch, > HIVE-11684.10.patch, HIVE-11684.11.patch, HIVE-11684.12.patch, > HIVE-11684.12.patch, HIVE-11684.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12012) select query on json table with map containing numeric values fails
[ https://issues.apache.org/jira/browse/HIVE-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-12012: -- Summary: select query on json table with map containing numeric values fails (was: select query on json table with map type column fails) > select query on json table with map containing numeric values fails > --- > > Key: HIVE-12012 > URL: https://issues.apache.org/jira/browse/HIVE-12012 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Jagruti Varia >Assignee: Jason Dere > Attachments: HIVE-12012.1.patch > > > select query on json table throws this error if table contains map type > column: > {noformat} > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > {noformat} > steps to reproduce the issue: > {noformat} > hive> create table c_complex(a array<string>,b map<string,int>) row format > serde 'org.apache.hive.hcatalog.data.JsonSerDe'; > OK > Time taken: 0.319 seconds > hive> insert into table c_complex select array('aaa'),map('aaa',1) from > studenttab10k limit 2; > Query ID = hrt_qa_20150826183232_47deb33a-19c0-4d2b-a92f-726659eb9413 > Total jobs = 1 > Launching Job 1 out of 1 > Status: Running (Executing on YARN cluster with App id > application_1440603993714_0010) > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED > KILLED > > Map 1 .. SUCCEEDED 1 100 0 > 0 > Reducer 2 .. 
SUCCEEDED 1 100 0 > 0 > > VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 11.75 s > > > Loading data to table default.c_complex > Table default.c_complex stats: [numFiles=1, numRows=2, totalSize=56, > rawDataSize=0] > OK > Time taken: 13.706 seconds > hive> select * from c_complex; > OK > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > Time taken: 0.115 seconds > hive> select count(*) from c_complex; > OK > 2 > Time taken: 0.205 seconds, Fetched: 1 row(s) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12003) Hive Streaming API : Add check to ensure table is transactional
[ https://issues.apache.org/jira/browse/HIVE-12003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated HIVE-12003: --- Attachment: HIVE-12003.2.patch > Hive Streaming API : Add check to ensure table is transactional > --- > > Key: HIVE-12003 > URL: https://issues.apache.org/jira/browse/HIVE-12003 > Project: Hive > Issue Type: Bug > Components: HCatalog, Hive, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Attachments: HIVE-12003.2.patch, HIVE-12003.patch > > > Check if TBLPROPERTIES ('transactional'='true') is set when opening connection -- This message was sent by Atlassian JIRA (v6.3.4#6332)
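The guard this issue asks for amounts to inspecting the table's properties when the streaming connection is opened. A minimal sketch of that check follows; the class and method names are hypothetical, only the `'transactional'='true'` property key/value comes from the issue:

```java
import java.util.Map;

public class TransactionalCheckSketch {
    // Sketch of the proposed guard: a streaming connection should be
    // rejected unless the target table carries
    // TBLPROPERTIES ('transactional'='true').
    static boolean isTransactional(Map<String, String> tblProperties) {
        return Boolean.parseBoolean(tblProperties.getOrDefault("transactional", "false"));
    }

    public static void main(String[] args) {
        System.out.println(isTransactional(Map.of("transactional", "true"))); // true
        System.out.println(isTransactional(Map.of()));                        // false
    }
}
```

Failing fast here turns a confusing downstream write error into a clear configuration error at connection time.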
[jira] [Commented] (HIVE-11720) Allow HiveServer2 to set custom http request/response header size
[ https://issues.apache.org/jira/browse/HIVE-11720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940465#comment-14940465 ] Thejas M Nair commented on HIVE-11720: -- +1 > Allow HiveServer2 to set custom http request/response header size > - > > Key: HIVE-11720 > URL: https://issues.apache.org/jira/browse/HIVE-11720 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-11720.1.patch, HIVE-11720.2.patch, > HIVE-11720.3.patch, HIVE-11720.4.patch, HIVE-11720.4.patch > > > In HTTP transport mode, authentication information is sent over as part of > HTTP headers. Sometimes (observed when Kerberos is used) the default buffer > size for the headers is not enough, resulting in an HTTP 413 FULL head error. > We can expose those as customizable params. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6859) Test JIRA
[ https://issues.apache.org/jira/browse/HIVE-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940518#comment-14940518 ] Hive QA commented on HIVE-6859: --- Test comment. > Test JIRA > - > > Key: HIVE-6859 > URL: https://issues.apache.org/jira/browse/HIVE-6859 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-6859.1.patch, HIVE-6859.2.patch, HIVE-6859.patch, > HIVE-6891.4.patch, HIVE-6891.5.patch, HIVE-6891.6.patch, HIVE-6891.7.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11995) Remove repetitively setting permissions in insert/load overwrite partition
[ https://issues.apache.org/jira/browse/HIVE-11995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940525#comment-14940525 ] Szehon Ho commented on HIVE-11995: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764495/HIVE-11995.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9625 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-script_pipe.q-mapjoin_decimal.q-transform_ppr2.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_skewtable org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5482/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5482/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5482/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} > Remove repetitively setting permissions in insert/load overwrite partition > -- > > Key: HIVE-11995 > URL: https://issues.apache.org/jira/browse/HIVE-11995 > Project: Hive > Issue Type: Bug > Components: Security >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-11995.patch > > > When hive.warehouse.subdir.inherit.perms is set to true, insert/load > overwrite .. partition sets table and partition permissions repeatedly, which > is unnecessary and causes performance issues, especially in cases where > multiple levels of partitions are involved. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11923) allow qtests to run via a single client session for tez and llap
[ https://issues.apache.org/jira/browse/HIVE-11923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11923: - Attachment: HIVE-11923.2.patch Another try for precommit test. > allow qtests to run via a single client session for tez and llap > > > Key: HIVE-11923 > URL: https://issues.apache.org/jira/browse/HIVE-11923 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Affects Versions: 1.3.0, 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-11923.1.txt, HIVE-11923.2.branchllap.txt, > HIVE-11923.2.patch, HIVE-11923.2.txt, HIVE-11923.2.txt, > HIVE-11923.branch-1.txt > > > Launching a new session - AM and containers for each test adds unnecessary > overheads. Running via a single session should reduce the run time > significantly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12012) select query on json table with map type column fails
[ https://issues.apache.org/jira/browse/HIVE-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940586#comment-14940586 ] Jason Dere commented on HIVE-12012: --- Looks like the same problem reported in HCATALOG-630 > select query on json table with map type column fails > - > > Key: HIVE-12012 > URL: https://issues.apache.org/jira/browse/HIVE-12012 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Jagruti Varia >Assignee: Jason Dere > Attachments: HIVE-12012.1.patch > > > select query on json table throws this error if table contains map type > column: > {noformat} > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > {noformat} > steps to reproduce the issue: > {noformat} > hive> create table c_complex(a array<string>,b map<string,int>) row format > serde 'org.apache.hive.hcatalog.data.JsonSerDe'; > OK > Time taken: 0.319 seconds > hive> insert into table c_complex select array('aaa'),map('aaa',1) from > studenttab10k limit 2; > Query ID = hrt_qa_20150826183232_47deb33a-19c0-4d2b-a92f-726659eb9413 > Total jobs = 1 > Launching Job 1 out of 1 > Status: Running (Executing on YARN cluster with App id > application_1440603993714_0010) > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED > KILLED > > Map 1 .. SUCCEEDED 1 100 0 > 0 > Reducer 2 .. 
SUCCEEDED 1 100 0 > 0 > > VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 11.75 s > > > Loading data to table default.c_complex > Table default.c_complex stats: [numFiles=1, numRows=2, totalSize=56, > rawDataSize=0] > OK > Time taken: 13.706 seconds > hive> select * from c_complex; > OK > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > Time taken: 0.115 seconds > hive> select count(*) from c_complex; > OK > 2 > Time taken: 0.205 seconds, Fetched: 1 row(s) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11969) start Tez session in background when starting CLI
[ https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940455#comment-14940455 ] Sergey Shelukhin commented on HIVE-11969: - Forgot to add the config flag, will do in next iteration > start Tez session in background when starting CLI > - > > Key: HIVE-11969 > URL: https://issues.apache.org/jira/browse/HIVE-11969 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, > HIVE-11969.patch > > > Tez session spins up AM, which can cause delays, esp. if the cluster is very > busy. > This can be done in background, so the AM might get started while the user is > running local commands and doing other things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
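The idea in this issue — start the slow AM/session bring-up concurrently with CLI startup, then block only when a query first needs it — can be sketched with a plain `Future`. This is a hypothetical illustration, not Hive's actual code; the class, method names, and the sleep standing in for AM startup latency are assumptions:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BackgroundStartSketch {
    // Kick off the expensive session startup on a background thread as the
    // CLI starts; callers hold the Future instead of a ready session.
    static Future<String> startSessionAsync(ExecutorService pool) {
        return pool.submit(() -> {
            Thread.sleep(50); // stands in for AM startup latency on a busy cluster
            return "session-ready";
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> session = startSessionAsync(pool);
        // ... the user runs local commands here while the AM spins up ...
        System.out.println(session.get()); // the first query blocks only if startup is unfinished
        pool.shutdown();
    }
}
```

The win is overlap: if the user spends any time on local commands, the AM is often ready by the time `get()` is called.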
[jira] [Updated] (HIVE-12010) Tests should use FileSystem based stats collection mechanism
[ https://issues.apache.org/jira/browse/HIVE-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-12010: Attachment: HIVE-12010.patch > Tests should use FileSystem based stats collection mechanism > > > Key: HIVE-12010 > URL: https://issues.apache.org/jira/browse/HIVE-12010 > Project: Hive > Issue Type: Task > Components: Statistics >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12010.patch > > > Although fs based collection mechanism is default for last few releases, > tests still use jdbc for stats collection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11175) create function using jar does not work with sql std authorization
[ https://issues.apache.org/jira/browse/HIVE-11175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939399#comment-14939399 ] Hive QA commented on HIVE-11175: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12743388/HIVE-11175.1.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9640 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_create_func1 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_udf_using org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_nonexistent_resource org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_udf_local_resource org.apache.hadoop.hive.ql.TestCreateUdfEntities.testUdfWithDfsResource org.apache.hadoop.hive.ql.TestCreateUdfEntities.testUdfWithLocalResource org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5481/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5481/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5481/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12743388 - PreCommit-HIVE-TRUNK-Build > create function using jar does not work with sql std authorization > -- > > Key: HIVE-11175 > URL: https://issues.apache.org/jira/browse/HIVE-11175 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.2.0 >Reporter: Olaf Flebbe >Assignee: Olaf Flebbe > Fix For: 2.0.0 > > Attachments: HIVE-11175.1.patch > > > {code:sql}create function xxx as 'xxx' using jar 'file://foo.jar' {code} > fails with an error requiring ADMIN privileges to access the local foo.jar > resource. Same for HDFS (DFS_URI). > The problem is that the semantic analysis enforces the ADMIN privilege for write, > but the jar is clearly an input, not an output. > Patch and test case appended. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940613#comment-14940613 ] Hive QA commented on HIVE-11642: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764669/HIVE-11642.16.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9696 tests executed *Failed tests:* {noformat} TestMiniLlapCliDriver - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5488/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5488/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5488/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12764669 - PreCommit-HIVE-TRUNK-Build > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.15.patch, HIVE-11642.16.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11983) Hive streaming API uses incorrect logic to assign buckets to incoming records
[ https://issues.apache.org/jira/browse/HIVE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940619#comment-14940619 ] Eugene Koifman commented on HIVE-11983: --- also, in createRecordUpdater() {noformat} - .statementId(-1) - .finalDestination(partitionPath)); {noformat} is removed. This is wrong - these lines must be there. > Hive streaming API uses incorrect logic to assign buckets to incoming records > - > > Key: HIVE-11983 > URL: https://issues.apache.org/jira/browse/HIVE-11983 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Labels: streaming, streaming_api > Attachments: HIVE-11983.3.patch, HIVE-11983.4.patch, HIVE-11983.patch > > > The Streaming API tries to distribute records evenly into buckets. > All records in every Transaction that is part of a TransactionBatch go to the > same bucket, and a new bucket number is chosen for each TransactionBatch. > Fix: API needs to hash each record to determine which bucket it belongs to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
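The fix called for in the description (hash each record's bucketing-column value to pick its bucket, rather than assigning one bucket per TransactionBatch) can be sketched in a few lines of Java. This is an illustrative sketch, not Hive's actual bucketing code; `BucketAssigner` and `bucketFor` are made-up names:

```java
// Illustrative sketch of per-record bucket assignment (assumption: not the
// real Hive streaming code). Each record is hashed on its bucketing column,
// so records spread across buckets instead of one bucket per batch.
public class BucketAssigner {
    static int bucketFor(Object bucketColumnValue, int numBuckets) {
        // Mask the sign bit so the modulo result is always in [0, numBuckets).
        return (bucketColumnValue.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        // Records with different keys land in different buckets, instead of
        // all going to the single bucket chosen for the TransactionBatch.
        for (String key : new String[]{"a", "b", "c"}) {
            System.out.println(key + " -> bucket " + bucketFor(key, 8));
        }
    }
}
```

The sign-bit mask matters: `hashCode()` may be negative, and `%` in Java preserves the dividend's sign, so masking keeps the bucket index non-negative.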
[jira] [Commented] (HIVE-10755) Rework on HIVE-5193 to enhance the column oriented table access
[ https://issues.apache.org/jira/browse/HIVE-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940647#comment-14940647 ] Daniel Dai commented on HIVE-10755: --- The approach should be fine. We also need to avoid appending ids multiple times to conf to avoid unnecessary warning from ColumnProjectionUtils.getReadColumnIDs. We shall also add a test case which causes HIVE-10752. A simple join + foreach should reproduce HIVE-10752: {code} A = load '" + COMPLEX_TABLE + "' using org.apache.hive.hcatalog.pig.HCatLoader(); B = load '" + COMPLEX_TABLE + "' using org.apache.hive.hcatalog.pig.HCatLoader(); C = join A by name, B by name; D = foreach C generate B::studentid; {code} > Rework on HIVE-5193 to enhance the column oriented table access > -- > > Key: HIVE-10755 > URL: https://issues.apache.org/jira/browse/HIVE-10755 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Affects Versions: 1.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.0.0 > > Attachments: HIVE-10755.patch > > > Add the support of column pruning for column oriented table access which was > done in HIVE-5193 but was reverted due to the join issue in HIVE-10720. > In 1.3.0, the patch posted by Viray didn't work, probably due to some jar > reference. That seems to have been fixed and that patch works in 2.0.0 now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
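The de-duplication Daniel suggests (avoid appending the same read-column ids to the conf multiple times) could look like the following sketch. The `merge` helper is hypothetical, not part of `ColumnProjectionUtils` or HCatalog's real API:

```java
// Hedged sketch: merge newly requested read-column ids into the ones already
// configured without appending duplicates, so getReadColumnIDs has no
// repeated entries to warn about. Names here are illustrative.
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class ReadColumnIds {
    static List<Integer> merge(List<Integer> alreadySet, List<Integer> toAppend) {
        // LinkedHashSet keeps first-seen order while dropping duplicates.
        LinkedHashSet<Integer> ids = new LinkedHashSet<>(alreadySet);
        ids.addAll(toAppend);
        return new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        // A self-join reads the same column from both sides; its id should
        // still appear only once in the merged projection list.
        System.out.println(merge(List.of(0, 1), List.of(1, 3)));
    }
}
```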
[jira] [Commented] (HIVE-12002) correct implementation typo
[ https://issues.apache.org/jira/browse/HIVE-12002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940696#comment-14940696 ] Hive QA commented on HIVE-12002: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764468/HIVE-12002.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9641 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5489/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5489/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5489/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12764468 - PreCommit-HIVE-TRUNK-Build > correct implementation typo > --- > > Key: HIVE-12002 > URL: https://issues.apache.org/jira/browse/HIVE-12002 > Project: Hive > Issue Type: Improvement > Components: Beeline, HCatalog, Metastore >Affects Versions: 1.2.1 >Reporter: Alex Moundalexis >Assignee: Alex Moundalexis >Priority: Trivial > Labels: newbie, typo > Attachments: HIVE-12002.patch > > > The term "implemenation" is seen in HiveMetaStore INFO logs. Correcting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ratandeep Ratti updated HIVE-11878: --- Attachment: HIVE-11878_approach3_per_session_clasloader.patch > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_approach3_per_session_clasloader.patch, HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassloader is created Hive > sets that as the current ThreadContextClassloader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent and *u2* becomes the new > ThreadContextClassLoader. 
Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). > Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} Class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940705#comment-14940705 ] Ratandeep Ratti commented on HIVE-11878: s/above to/above two/ > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_approach3_per_session_clasloader.patch, HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassloader is created Hive > sets that as the current ThreadContextClassloader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent and *u2* becomes the new > ThreadContextClassLoader. 
Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). > Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} Class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12015) LLAP: merge master into branch
[ https://issues.apache.org/jira/browse/HIVE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-12015. - Resolution: Fixed > LLAP: merge master into branch > -- > > Key: HIVE-12015 > URL: https://issues.apache.org/jira/browse/HIVE-12015 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: llap > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11973) IN operator fails when the column type is DATE
[ https://issues.apache.org/jira/browse/HIVE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940690#comment-14940690 ] Yongzhi Chen commented on HIVE-11973: - The one test failure is not related. Its age is more than 300. > IN operator fails when the column type is DATE > --- > > Key: HIVE-11973 > URL: https://issues.apache.org/jira/browse/HIVE-11973 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: sanjiv singh >Assignee: Yongzhi Chen > Attachments: HIVE-11973.1.patch > > > Test DDL: > {code} > CREATE TABLE `date_dim`( > `d_date_sk` int, > `d_date_id` string, > `d_date` date, > `d_current_week` string, > `d_current_month` string, > `d_current_quarter` string, > `d_current_year` string) ; > {code} > Hive query: > {code} > SELECT * > FROM date_dim > WHERE d_date IN ('2000-03-22','2001-03-22') ; > {code} > In 1.0.0, the above query fails with: > {code} > FAILED: SemanticException [Error 10014]: Line 1:180 Wrong arguments > ''2001-03-22'': The arguments for IN should be the same type! Types are: > {date IN (string, string)} > {code} > I changed the query as given to get past the error: > {code} > SELECT * > FROM date_dim > WHERE d_date IN (CAST('2000-03-22' AS DATE) , CAST('2001-03-22' AS DATE) > ) ; > {code} > But it works without casting: > {code} > SELECT * > FROM date_dim > WHERE d_date = '2000-03-22' ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940701#comment-14940701 ] Ratandeep Ratti commented on HIVE-11878: Also note: The above to problems, I think, should also exist in Hive currently. Am I missing something here? > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassloader is created Hive > sets that as the current ThreadContextClassloader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent and *u2* becomes the new > ThreadContextClassLoader. 
Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). > Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} Class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
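The parent-first delegation that the HIVE-11878 description hinges on can be demonstrated with a small self-contained Java program. `TracingLoader` and `definerVia` are illustrative names, not Hive code; the point is only that a class ends up *defined* by the first ancestor loader that can find it, no matter which child loader you hand to `Class.forName`:

```java
// Hedged illustration of parent-first classloader delegation. We build the
// u1 <- u2 chain from the description; asking u2 for java.lang.String still
// resolves it via the bootstrap loader, because every parent gets first try.
public class DelegationDemo {
    static class TracingLoader extends ClassLoader {
        final String label;
        TracingLoader(String label, ClassLoader parent) {
            super(parent);
            this.label = label;
        }
        // The default loadClass already delegates to the parent first; we
        // only add a trace so the delegation is visible.
        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            Class<?> c = super.loadClass(name, resolve);
            System.out.println(label + " asked for " + name + ", defined by "
                    + (c.getClassLoader() == null ? "bootstrap" : c.getClassLoader()));
            return c;
        }
    }

    // Returns the loader that actually *defines* className when the lookup
    // starts at the child loader u2.
    static ClassLoader definerVia(String className) {
        try {
            ClassLoader u1 = new TracingLoader("u1", DelegationDemo.class.getClassLoader());
            ClassLoader u2 = new TracingLoader("u2", u1);
            return Class.forName(className, true, u2).getClassLoader();
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Prints null: the bootstrap loader defined it, u2 never got the chance.
        System.out.println(definerVia("java.lang.String"));
    }
}
```

In Hive's scenario, the same mechanism means *c1* is defined by *u1* (the first loader whose path contains *j1*), so lookups triggered from *c1* never see the jars known only to *u2*.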
[jira] [Assigned] (HIVE-12013) LLAP: disable most llap tests before merge
[ https://issues.apache.org/jira/browse/HIVE-12013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-12013: --- Assignee: Sergey Shelukhin > LLAP: disable most llap tests before merge > -- > > Key: HIVE-12013 > URL: https://issues.apache.org/jira/browse/HIVE-12013 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Tests cannot be parallelized before we merge, and tests cannot run fast > enough (they did once, so I guess cannot always run fast enough) when they > are not parallelized, and we need tests to pass before merging. > LLAP is off by default, we did see tests pass recently (with the exception of > a few out file diffs), and some tests will still be run, so it should be ok > to proceed as follows. > We will disable most of the LLAP q tests for now, merge, enable parallelism > and re-enable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12013) LLAP: disable most llap tests before merge
[ https://issues.apache.org/jira/browse/HIVE-12013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12013: Description: Tests cannot be parallelized before we merge, and tests cannot run fast enough (they did once, so I guess cannot always run fast enough) when they are not parallelized, and we need tests to pass before merging. LLAP is off by default, we did see tests pass recently (with the exception of a few out file diffs), and some tests will still be run, so it should be ok to proceed as follows. We will disable most of the LLAP q tests for now, merge, enable parallelism and re-enable. was: Tests cannot be parallelized before we merge, and tests cannot run fast enough (they did once, so I guess cannot always run fast enough) when they are not parallelized. LLAP is off by default, we did see tests pass recently (with the exception of a few out file diffs), and some tests will still be run, so it should be ok to proceed as follows. We will disable most of the LLAP q tests for now, merge, enable parallelism and re-enable. > LLAP: disable most llap tests before merge > -- > > Key: HIVE-12013 > URL: https://issues.apache.org/jira/browse/HIVE-12013 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > Tests cannot be parallelized before we merge, and tests cannot run fast > enough (they did once, so I guess cannot always run fast enough) when they > are not parallelized, and we need tests to pass before merging. > LLAP is off by default, we did see tests pass recently (with the exception of > a few out file diffs), and some tests will still be run, so it should be ok > to proceed as follows. > We will disable most of the LLAP q tests for now, merge, enable parallelism > and re-enable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12013) LLAP: disable most llap tests before merge
[ https://issues.apache.org/jira/browse/HIVE-12013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12013: Description: Tests cannot be parallelized before we merge, and tests cannot run fast enough (they did once, so I guess cannot always run fast enough) when they are not parallelized. LLAP is off by default, we did see tests pass recently (with the exception of a few out file diffs), and some tests will still be run, so it should be ok to proceed as follows. We will disable most of the LLAP q tests for now, merge, enable parallelism and re-enable. was: Tests cannot be parallelized before we merge, and tests cannot run fast enough (they did once, so I guess cannot always run fast enough) when they are not parallelized. LLAP is off by default, we did see tests pass recently (with the exception of a few out file diffs), and some tests will still be run. We will disable most of the LLAP q tests for now, merge, enable parallelism and re-enable. > LLAP: disable most llap tests before merge > -- > > Key: HIVE-12013 > URL: https://issues.apache.org/jira/browse/HIVE-12013 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > Tests cannot be parallelized before we merge, and tests cannot run fast > enough (they did once, so I guess cannot always run fast enough) when they > are not parallelized. > LLAP is off by default, we did see tests pass recently (with the exception of > a few out file diffs), and some tests will still be run, so it should be ok > to proceed as follows. > We will disable most of the LLAP q tests for now, merge, enable parallelism > and re-enable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10571) HiveMetaStoreClient should close existing thrift connection before its reconnect
[ https://issues.apache.org/jira/browse/HIVE-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940716#comment-14940716 ] Lefty Leverenz commented on HIVE-10571: --- Thanks [~ctang.ma]. My email message with that commit ID only shows master -- how strange. The backports for 1.0.2 and 1.2.2 should also be listed in Fix Version/s so that release notes can pick up this jira when those versions are released. > HiveMetaStoreClient should close existing thrift connection before its > reconnect > > > Key: HIVE-10571 > URL: https://issues.apache.org/jira/browse/HIVE-10571 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-10571.patch, HIVE-10571.patch, HIVE-10571.patch > > > HiveMetaStoreClient should first close its existing thrift connection, > whether it is already dead or still alive, before opening another > connection in its reconnect() method. Otherwise, it might lead to huge > resource accumulation or leaks on the HMS side when the client keeps retrying. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
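The close-before-reopen pattern the HIVE-10571 description asks for can be sketched as follows. `Client`, `reconnect`, and the counting fake transport are illustrative stand-ins, not HiveMetaStoreClient's actual internals:

```java
// Hedged sketch of reconnect(): close the old transport unconditionally
// (dead or alive) before opening a new one, so retries cannot pile up
// connections on the metastore side. Names are illustrative.
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

public class ReconnectDemo {
    static class Client {
        private Closeable transport;              // stands in for the Thrift TTransport
        private final Supplier<Closeable> opener;

        Client(Supplier<Closeable> opener) {
            this.opener = opener;
            this.transport = opener.get();        // initial connection
        }

        void reconnect() {
            if (transport != null) {
                // Close whether or not the old connection is still usable;
                // either way it must not be leaked.
                try { transport.close(); } catch (IOException ignored) { }
                transport = null;
            }
            transport = opener.get();
        }
    }

    // Tiny fake transport so the open/close balance is observable.
    static int opened = 0, closed = 0;

    public static void main(String[] args) {
        Client c = new Client(() -> { opened++; return () -> closed++; });
        c.reconnect();
        c.reconnect();
        // 3 opens (1 initial + 2 reconnects) balanced by 2 closes: nothing leaked.
        System.out.println(opened + " opened, " + closed + " closed");
    }
}
```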
[jira] [Commented] (HIVE-11765) SMB Join fails in Hive 1.2
[ https://issues.apache.org/jira/browse/HIVE-11765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940717#comment-14940717 ] Prasanth Jayachandran commented on HIVE-11765: -- I just tried in hive-1.2.1 release binary. Still unable to reproduce. Tried again with mr and tez. > SMB Join fails in Hive 1.2 > -- > > Key: HIVE-11765 > URL: https://issues.apache.org/jira/browse/HIVE-11765 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.0, 1.2.1 >Reporter: Na Yang >Assignee: Prasanth Jayachandran > Attachments: employee (1).csv > > > SMB join on Hive 1.2 fails with the following stack trace : > {code} > java.io.IOException: java.lang.reflect.InvocationTargetException > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333) > at > org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719) > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:173) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) > at 
org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:408) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252) > ... 11 more > Caused by: java.lang.IndexOutOfBoundsException: toIndex = 5 > at java.util.ArrayList.subListRangeCheck(ArrayList.java:1004) > at java.util.ArrayList.subList(ArrayList.java:996) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.getSchemaOnRead(RecordReaderFactory.java:161) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.createTreeReader(RecordReaderFactory.java:66) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:202) > at > org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:230) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.<init>(OrcInputFormat.java:163) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1104) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67) > {code} > This error happens after adding the patch of HIVE-10591. Reverting HIVE-10591 > fixes this exception. 
> Steps to reproduce: > {code} > SET hive.enforce.sorting=true; > SET hive.enforce.bucketing=true; > SET hive.exec.dynamic.partition=true; > SET mapreduce.reduce.import.limit=-1; > SET hive.optimize.bucketmapjoin=true; > SET hive.optimize.bucketmapjoin.sortedmerge=true; > SET hive.auto.convert.join=true; > SET hive.auto.convert.sortmerge.join=true; > create Table table1 (empID int, name varchar(64), email varchar(64), company > varchar(64), age int) clustered by (age) sorted by (age ASC) INTO 384 buckets > stored as ORC; > create Table table2 (empID int, name varchar(64), email varchar(64), company > varchar(64), age int) clustered by (age) sorted by (age ASC) into 384 buckets > stored as ORC; > create Table table_tmp (empID int, name varchar(64), email varchar(64), > company varchar(64), age int); > load data local inpath '/tmp/employee.csv' into table table_tmp; > INSERT OVERWRITE table table1 select * from table_tmp; > INSERT OVERWRITE table table2 select * from table_tmp; > SELECT table1.age, table2.age from table1 inner join table2 on >
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: (was: HIVE-11642.15.patch) > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.17.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: HIVE-11642.17.patch > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.17.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12013) LLAP: disable most llap tests before merge
[ https://issues.apache.org/jira/browse/HIVE-12013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-12013. - Resolution: Fixed Fix Version/s: llap > LLAP: disable most llap tests before merge > -- > > Key: HIVE-12013 > URL: https://issues.apache.org/jira/browse/HIVE-12013 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: llap > > > Tests cannot be parallelized before we merge, and tests cannot run fast > enough (they did once, so I guess cannot always run fast enough) when they > are not parallelized, and we need tests to pass before merging. > LLAP is off by default, we did see tests pass recently (with the exception of > a few out file diffs), and some tests will still be run, so it should be ok > to proceed as follows. > We will disable most of the LLAP q tests for now, merge, enable parallelism > and re-enable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: (was: HIVE-11642.15.patch) > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.17.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: (was: HIVE-11642.16.patch) > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.17.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940637#comment-14940637 ] Rui Li commented on HIVE-11473: --- Hi [~xuefuz], the latest test result is [here|http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/view/All/job/PreCommit-HIVE-SPARK-Build/lastBuild/]. {{parquet_join}} still fails. But it passes on my machine (using your updated tarball). Do we need to do some cleanup for the pre-commit test? Or would you mind trying that test on your side? Thanks. I also noticed snapshots of hive jars are uploaded [here|http://repository.apache.org/snapshots/org/apache/hive/]. We need to make sure to run {{mvn clean install -DskipTests -Phadoop-2}} under hive-home before the test, so that the test won't pick up a snapshot from an external repo. > Upgrade Spark dependency to 1.5 [Spark Branch] > -- > > Key: HIVE-11473 > URL: https://issues.apache.org/jira/browse/HIVE-11473 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Jimmy Xiang >Assignee: Rui Li > Attachments: HIVE-11473.1-spark.patch, HIVE-11473.2-spark.patch, > HIVE-11473.3-spark.patch, HIVE-11473.3-spark.patch > > > In Spark 1.5, the SparkListener interface is changed. So HoS may fail to create > the spark client if the un-implemented event callback method is invoked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11786) Deprecate the use of redundant column in column stats related tables
[ https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940698#comment-14940698 ] Chaoyu Tang commented on HIVE-11786: The test failure is not related to the patch. > Deprecate the use of redundant column in column stats related tables > > > Key: HIVE-11786 > URL: https://issues.apache.org/jira/browse/HIVE-11786 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-11786.1.patch, HIVE-11786.1.patch, > HIVE-11786.2.patch, HIVE-11786.patch > > > The stats tables such as TAB_COL_STATS, PART_COL_STATS have redundant columns > such as DB_NAME, TABLE_NAME, PARTITION_NAME since these tables already have > foreign keys like TBL_ID or PART_ID referencing TBLS or PARTITIONS. > These redundant columns violate database normalization rules and cause a lot > of inconvenience (and sometimes difficulty) in column stats related feature > implementation. For example, when renaming a table, we have to update > the TABLE_NAME column in these tables as well, which is unnecessary. > This JIRA is first to deprecate the use of these columns at the HMS code level. A > follow-up JIRA is to be opened to focus on DB schema change and upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths
[ https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11553: Attachment: HIVE-11553.07.patch HiveQA didn't pick it up... trying the same patch again > use basic file metadata cache in ETLSplitStrategy-related paths > --- > > Key: HIVE-11553 > URL: https://issues.apache.org/jira/browse/HIVE-11553 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11553.01.patch, HIVE-11553.02.patch, > HIVE-11553.03.patch, HIVE-11553.04.patch, HIVE-11553.06.patch, > HIVE-11553.06.patch, HIVE-11553.07.patch, HIVE-11553.patch > > > This is the first step; uses the simple footer-getting API, without PPD. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11927: --- Attachment: HIVE-11927.03.patch > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: HIVE-11642.15.patch > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.15.patch, HIVE-11642.15.patch, HIVE-11642.16.patch, > HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940719#comment-14940719 ] Sergey Shelukhin commented on HIVE-11675: - I have a patch long in the works but I keep getting distracted by other random crap. > make use of file footer PPD API in ETL strategy or separate strategy > > > Key: HIVE-11675 > URL: https://issues.apache.org/jira/browse/HIVE-11675 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Need to take a look at the best flow. It won't be much different if we do a > filtering metastore call for each partition. So perhaps we'd need the custom > sync point/batching after all. > Or we can make it opportunistic and not fetch any footers unless they can be > pushed down to the metastore or fetched from the local cache; that way the only slow > threaded op is directory listings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
[ https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939580#comment-14939580 ] Jesus Camacho Rodriguez commented on HIVE-11634: 1. That's what I thought, no problem. 2. OK, this case should be solved or studied as part of this JIRA. 3. That's fine; we can create a new JIRA case for that. But maybe I would then remove the changes in PcrExprProcFactory.java and follow up in the new JIRA, as that code is not working as expected. What do you think? 4. Maybe I didn't explain it properly. The idea is that you would only prepend non-partition columns, and without clustering them, but iff the NDV in the IN clause is reduced. In any case, we can create a new JIRA for this too, and maybe assign it to me? As I see it, the modification to the original optimization that you have just created should not be too complicated. > Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...) > -- > > Key: HIVE-11634 > URL: https://issues.apache.org/jira/browse/HIVE-11634 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, > HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, > HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, > HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, > HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, > HIVE-11634.96.patch, HIVE-11634.97.patch > > > Currently, we do not support partition pruning for the following scenario > {code} > create table pcr_t1 (key int, value string) partitioned by (ds string); > insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from 
src > where key < 20 order by key; > explain extended select ds from pcr_t1 where struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > If we run the above query, we see that all the partitions of table pcr_t1 are > present in the filter predicate, whereas we can prune partition > (ds='2000-04-10'). > The optimization is to rewrite the above query into the following. > {code} > explain extended select ds from pcr_t1 where (struct(ds)) IN > (struct('2000-04-08'), struct('2000-04-09')) and struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) > is used by the partition pruner to prune partitions which otherwise would not be > pruned. > This is an extension of the idea presented in HIVE-11573. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
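The effect of the rewrite above can be illustrated with a small Python sketch (illustrative only, not Hive code): projecting the partition-column values out of the struct IN list yields a partition-only predicate the pruner can apply up front.

```python
# Sketch of the HIVE-11634 rewrite idea: from an IN list over
# struct(ds, key), derive the set of partition (ds) values so that
# partitions outside that set can be pruned before scanning.

partitions = ["2000-04-08", "2000-04-09", "2000-04-10"]

# IN list from: struct(ds, key) in (struct('2000-04-08',1), struct('2000-04-09',2))
in_list = [("2000-04-08", 1), ("2000-04-09", 2)]

# Derived predicate: struct(ds) IN (struct('2000-04-08'), struct('2000-04-09'))
pruned_ds = {ds for ds, _key in in_list}

surviving = [p for p in partitions if p in pruned_ds]
print(surviving)  # ['2000-04-08', '2000-04-09'] -- ds='2000-04-10' is pruned
```

The derived predicate is a strict weakening of the original one, so ANDing it in never changes query results; it only gives the pruner something it can evaluate on partition columns alone.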
[jira] [Commented] (HIVE-11543) Provide log4j properties migration tool
[ https://issues.apache.org/jira/browse/HIVE-11543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940728#comment-14940728 ] Prasanth Jayachandran commented on HIVE-11543: -- https://issues.apache.org/jira/browse/LOG4J2-952 added properties file based configuration back to log4j2. This is available as part of the recent log4j2 2.4 release. https://logging.apache.org/log4j/2.0/changes-report.html#a2.4 Need to investigate further whether old properties files are still compatible. > Provide log4j properties migration tool > --- > > Key: HIVE-11543 > URL: https://issues.apache.org/jira/browse/HIVE-11543 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Blocker > > For log4j2 migration, if users are performing upgrades then we need to > provide a tool for converting the existing log4j.properties file to a > log4j2.xml file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12016) Update log4j2 version to 2.4
[ https://issues.apache.org/jira/browse/HIVE-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12016: - Attachment: HIVE-12016.1.patch [~gopalv]/[~sershe] can someone take a look at this patch? > Update log4j2 version to 2.4 > > > Key: HIVE-12016 > URL: https://issues.apache.org/jira/browse/HIVE-12016 > Project: Hive > Issue Type: Sub-task > Components: Logging >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.0.0 > > Attachments: HIVE-12016.1.patch > > > The latest 2.4 release of log4j2 brought back properties file based > configuration. https://logging.apache.org/log4j/2.0/changes-report.html#a2.4 > bump up the version number to 2.4. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
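For context on the properties-based configuration that log4j2 2.4 restored: a minimal properties-style config might look like the sketch below. Note the log4j2 properties syntax differs from old log4j 1.x properties files (which is why old files are not automatically compatible); all names here are illustrative, and the exact keys should be checked against the log4j2 configuration manual.

```properties
# Minimal illustrative log4j2 2.4 properties configuration (not Hive's actual config).
status = warn
name = HiveLog4j2Sketch

appenders = console
appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{ISO8601} %-5p [%t]: %c{2} - %m%n

rootLogger.level = info
rootLogger.appenderRefs = console
rootLogger.appenderRef.console.ref = console
```

Unlike log4j 1.x's `log4j.rootLogger=INFO, console` shorthand, log4j2 builds the component tree from dotted keys, so a mechanical migration tool would need to restructure keys rather than just rename them.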
[jira] [Commented] (HIVE-11997) Add ability to send Compaction Jobs to specific queue
[ https://issues.apache.org/jira/browse/HIVE-11997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940773#comment-14940773 ] Hive QA commented on HIVE-11997: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764492/HIVE-11997.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9641 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5492/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5492/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5492/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12764492 - PreCommit-HIVE-TRUNK-Build > Add ability to send Compaction Jobs to specific queue > - > > Key: HIVE-11997 > URL: https://issues.apache.org/jira/browse/HIVE-11997 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-11997.patch > > > need new HiveConf param to specify queue name -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12016) Update log4j2 version to 2.4
[ https://issues.apache.org/jira/browse/HIVE-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12016: - Attachment: HIVE-12016.2.patch > Update log4j2 version to 2.4 > > > Key: HIVE-12016 > URL: https://issues.apache.org/jira/browse/HIVE-12016 > Project: Hive > Issue Type: Sub-task > Components: Logging >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.0.0 > > Attachments: HIVE-12016.1.patch, HIVE-12016.2.patch > > > The latest 2.4 release of log4j2 brought back properties file based > configuration. https://logging.apache.org/log4j/2.0/changes-report.html#a2.4 > bump up the version number to 2.4. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11954) Extend logic to choose side table in MapJoin Conversion algorithm
[ https://issues.apache.org/jira/browse/HIVE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11954: --- Attachment: HIVE-11954.02.patch > Extend logic to choose side table in MapJoin Conversion algorithm > - > > Key: HIVE-11954 > URL: https://issues.apache.org/jira/browse/HIVE-11954 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11954.01.patch, HIVE-11954.02.patch, > HIVE-11954.patch, HIVE-11954.patch > > > Selection of the side table (in-memory/hash table) in the MapJoin Conversion > algorithm needs to be more sophisticated. > In an N-way Map Join, Hive should pick as the side table (in-memory table) the input stream > that has the least cost in producing its relation (like TS(FIL|Proj)*). > A cost-based choice needs an extended cost model; without the return path it's going to > be hard to do this. > For the time being we could employ a modified cost-based algorithm for side > table selection. > The new algorithm is described below: > 1. Identify the candidate set of inputs for the side table (in-memory/hash table) > from the inputs (based on conditional task size) > 2. For each of the inputs, identify its cost and memory requirement. Cost is 1 for > each heavyweight relation op (Join, GB, PTF/Windowing, TF, etc.). The cost for > an input is the total number of heavyweight ops in its branch. > 3. Order the set from #1 by cost & memory requirement (ascending order) > 4. Pick the first element from #3 as the side table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
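The four steps of the proposed selection algorithm can be sketched in a few lines of Python (a minimal sketch with made-up input shapes, not the actual Hive implementation):

```python
# Illustrative sketch of the HIVE-11954 side-table selection proposal.
# Each input branch is described by an estimated size, a memory
# requirement, and the operators in its branch; shapes are made up.

HEAVY_OPS = {"JOIN", "GB", "PTF", "WINDOWING", "TF"}

def pick_side_table(inputs, conditional_task_size):
    # Step 1: candidates are inputs small enough for the hash-table side.
    candidates = [i for i in inputs if i["size"] <= conditional_task_size]
    if not candidates:
        return None  # no input can be broadcast; fall back to shuffle join
    # Step 2: cost = number of heavyweight ops in the branch.
    for i in candidates:
        i["cost"] = sum(1 for op in i["ops"] if op in HEAVY_OPS)
    # Step 3: order by (cost, memory requirement), ascending.
    candidates.sort(key=lambda i: (i["cost"], i["memory"]))
    # Step 4: pick the first element as the side (in-memory) table.
    return candidates[0]["name"]

inputs = [
    {"name": "a", "size": 50, "memory": 40, "ops": ["TS", "FIL", "JOIN"]},
    {"name": "b", "size": 30, "memory": 20, "ops": ["TS", "FIL"]},
    {"name": "c", "size": 500, "memory": 400, "ops": ["TS"]},
]
print(pick_side_table(inputs, conditional_task_size=100))  # b
```

Here input "c" is excluded at step 1 for exceeding the task-size threshold, and "b" beats "a" because its branch contains no heavyweight ops: cheap-to-produce branches are preferred for the in-memory side since they are rebuilt per task.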
[jira] [Updated] (HIVE-11445) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work
[ https://issues.apache.org/jira/browse/HIVE-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11445: --- Affects Version/s: 2.0.0 > CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby > distinct does not work > - > > Key: HIVE-11445 > URL: https://issues.apache.org/jira/browse/HIVE-11445 > Project: Hive > Issue Type: Sub-task > Components: CBO >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11445.01.patch, HIVE-11445.02.patch, > HIVE-11445.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)