[jira] [Commented] (HIVE-13141) Hive on Spark over HBase should accept parameters starting with "zookeeper.znode"

2016-03-18 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202610#comment-15202610
 ] 

Szehon Ho commented on HIVE-13141:
--

[~jxiang] might know more about this

> Hive on Spark over HBase should accept parameters starting with 
> "zookeeper.znode"
> -
>
> Key: HIVE-13141
> URL: https://issues.apache.org/jira/browse/HIVE-13141
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Minor
> Attachments: HIVE-13141.patch
>
>
> HBase-related parameters were added by HIVE-12708.
> In the same way, parameters starting with "zookeeper.znode", which are also 
> HBase-related, should be added too.
> See http://blog.cloudera.com/blog/2013/10/what-are-hbase-znodes/
> I have seen a Hive on Spark over HBase failure caused by a customized 
> zookeeper.znode.parent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13217) Replication for HoS mapjoin small file needs to respect dfs.replication.max

2016-03-18 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202608#comment-15202608
 ] 

Szehon Ho commented on HIVE-13217:
--

Sorry, you are right, it is min(). The latest patch looks good to me, +1.  Thanks!

> Replication for HoS mapjoin small file needs to respect dfs.replication.max
> ---
>
> Key: HIVE-13217
> URL: https://issues.apache.org/jira/browse/HIVE-13217
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Szehon Ho
>Assignee: Chinna Rao Lalam
>Priority: Minor
> Attachments: HIVE-13217.1.patch, HIVE-13217.2.patch
>
>
> Currently Hive on Spark mapjoin replicates the small-table file with a 
> hard-coded replication factor of 10.  See SparkHashTableSinkOperator.MIN_REPLICATION. 
> When dfs.replication.max is less than 10, the HoS query fails.  This constant 
> should be capped at dfs.replication.max.
> Normally dfs.replication.max seems to be set at 512.
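The fix agreed in the comments amounts to taking the smaller of the hard-coded constant and the cluster's configured maximum. A minimal sketch (class and method names here are illustrative, not Hive's actual code; only the constant's value comes from the issue text):

```java
public class ReplicationCap {
    // Hard-coded lower bound from SparkHashTableSinkOperator, per the issue text
    static final int MIN_REPLICATION = 10;

    // Effective replication for the small-table file: never exceed dfs.replication.max
    static int effectiveReplication(int dfsReplicationMax) {
        return Math.min(MIN_REPLICATION, dfsReplicationMax);
    }
}
```

With dfs.replication.max = 3 this yields 3 instead of failing the query; with the typical 512 it stays at the constant 10.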





[jira] [Commented] (HIVE-13217) Replication for HoS mapjoin small file needs to respect dfs.replication.max

2016-03-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202605#comment-15202605
 ] 

Hive QA commented on HIVE-13217:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12794134/HIVE-13217.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9835 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7309/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7309/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7309/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12794134 - PreCommit-HIVE-TRUNK-Build

> Replication for HoS mapjoin small file needs to respect dfs.replication.max
> ---
>
> Key: HIVE-13217
> URL: https://issues.apache.org/jira/browse/HIVE-13217
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Szehon Ho
>Assignee: Chinna Rao Lalam
>Priority: Minor
> Attachments: HIVE-13217.1.patch, HIVE-13217.2.patch
>
>
> Currently Hive on Spark mapjoin replicates the small-table file with a 
> hard-coded replication factor of 10.  See SparkHashTableSinkOperator.MIN_REPLICATION. 
> When dfs.replication.max is less than 10, the HoS query fails.  This constant 
> should be capped at dfs.replication.max.
> Normally dfs.replication.max seems to be set at 512.





[jira] [Commented] (HIVE-13293) Query performance degrades after enabling parallel order by for Hive on Spark

2016-03-18 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198647#comment-15198647
 ] 

Xuefu Zhang commented on HIVE-13293:


[~lirui], that sounds like a viable optimization. Yes, please do some research 
around that. Thanks.

> Query performance degrades after enabling parallel order by for 
> Hive on Spark
> ---
>
> Key: HIVE-13293
> URL: https://issues.apache.org/jira/browse/HIVE-13293
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.0.0
>Reporter: Lifeng Wang
>Assignee: Rui Li
>
> I used TPCx-BB to do some performance testing on the Hive on Spark engine, and 
> found that query 10 suffers a performance degradation when parallel order by 
> is enabled.
> It seems that sampling costs a lot of time before the real query runs.





[jira] [Commented] (HIVE-13298) nested join support causes undecipherable errors in SemanticAnalyzer

2016-03-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202415#comment-15202415
 ] 

Ashutosh Chauhan commented on HIVE-13298:
-

+1

> nested join support causes undecipherable errors in SemanticAnalyzer
> 
>
> Key: HIVE-13298
> URL: https://issues.apache.org/jira/browse/HIVE-13298
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13298.01.patch, HIVE-13298.patch
>
>






[jira] [Commented] (HIVE-13309) Create view with nested struct -> array -> struct -> array -> struct fails, but create table succeeds.

2016-03-18 Thread yuzhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201028#comment-15201028
 ] 

yuzhou commented on HIVE-13309:
---

Thanks for your response. It's just a normal query. I generated the query from a 
protobuf file. I store my protobuf messages in Parquet format in Hive, and the 
protobuf has nested fields. There is no problem creating the table and the view 
when the nesting is struct -> array -> struct, but creating the view fails when 
it is struct -> array -> struct -> array -> struct.

> Create view with nested struct -> array -> struct -> array -> struct fails, 
> but create table succeeds.
> --
>
> Key: HIVE-13309
> URL: https://issues.apache.org/jira/browse/HIVE-13309
> Project: Hive
>  Issue Type: Bug
>  Components: Views
>Affects Versions: 1.1.0
>Reporter: yuzhou
>
> I can successfully create a table with a field whose type is a struct that 
> contains an array field, where the array's items are structs that themselves 
> contain an array. But I cannot create a view from this table; it fails with 
> "Operator is only supported on struct or list of struct types". I can create 
> the view when the field is a struct containing an array of structs, as long 
> as those structs do not have an array field.





[jira] [Commented] (HIVE-13286) Query ID is being reused across queries

2016-03-18 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200051#comment-15200051
 ] 

Aihua Xu commented on HIVE-13286:
-

Attached patch-2: fix the unit test.

> Query ID is being reused across queries
> ---
>
> Key: HIVE-13286
> URL: https://issues.apache.org/jira/browse/HIVE-13286
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Aihua Xu
>Priority: Critical
> Attachments: HIVE-13286.1.patch, HIVE-13286.2.patch
>
>
> [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is 
> being reused across queries. This defeats the purpose of a query id. I am not 
> sure what the purpose of the change in that jira is but it breaks the 
> assumption about a query id being unique for each query. Please take a look 
> into this at the earliest.





[jira] [Updated] (HIVE-13295) Improvement to LDAP search queries in HS2 LDAP Authenticator

2016-03-18 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-13295:
-
Attachment: HIVE-13295.1.patch

> Improvement to LDAP search queries in HS2 LDAP Authenticator
> 
>
> Key: HIVE-13295
> URL: https://issues.apache.org/jira/browse/HIVE-13295
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.3.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-13295.1.patch
>
>
> As more use cases, for various LDAP flavors and deployments, emerge, Hive's 
> LDAP authentication provider needs additional configuration properties to 
> make it flexible enough to work with different LDAP deployments.
> For example:
> 1) Not every LDAP server supports a "memberOf" property on user entries that 
> refers to the groups the user belongs to. This attribute is used for group 
> filter support. So instead of relying on this attribute being set, we can 
> reverse the search and find all the groups that have a member-referring 
> attribute, such as "member" or "memberUid", set.
> Since this attribute name differs from LDAP to LDAP, it's best to make it 
> configurable, with a default value of "member".
> 2) In HIVE-12885, a new property was introduced to make the attribute used as 
> the user/group search key user-configurable instead of assuming it is "uid" 
> (when baseDN is set) or "cn" (otherwise). This change was deferred from the 
> initial patch.
> 3) LDAP groups can have various objectClasses, for example objectClass=group, 
> objectClass=groupOfNames, objectClass=posixGroup, or 
> objectClass=groupOfUniqueNames. There could be others we don't know of.
> So we need a property to make this user-configurable with a sensible default. 
> 4) There is also a bug where the lists for groupFilter and userFilter are not 
> re-initialized each time init() is called.
> These lists are only re-initialized if the new HiveConf has userFilter or 
> groupFilter values set. Otherwise, the provider will use values from the 
> previous initialization.
> I found this bug while writing some new tests.
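The reversed group search in item 1 can be illustrated by building the group-side LDAP filter from the configurable attribute names. Everything below (class and method names, the chosen objectClass) is illustrative, not the actual patch:

```java
public class GroupFilterSketch {
    // Build an LDAP filter that finds groups listing the user's DN in a
    // configurable member attribute (default "member", per the proposal)
    static String reverseGroupFilter(String groupObjectClass, String memberAttr, String userDn) {
        return "(&(objectClass=" + groupObjectClass + ")(" + memberAttr + "=" + userDn + "))";
    }
}
```

For a posixGroup deployment the same method would be called with "posixGroup" and "memberUid" instead.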





[jira] [Updated] (HIVE-12977) Pass credentials in the current UGI while creating Tez session

2016-03-18 Thread Vinoth Sathappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Sathappan updated HIVE-12977:

Description: 
The credentials present in the current UGI i.e. 
UserGroupInformation.getCurrentUser().getCredentials() isn't passed to the Tez 
session. It is instantiated with null credentials. 

session = TezClient.create("HIVE-" + sessionId, tezConfig, true,
commonLocalResources, null);

In this case, tokens added using hive execution hooks, aren't available to Tez 
even if they are available in memory.


  was:
The credentials present in the current UGI i.e. 
UserGroupInformation.getCurrentUser().getCredentials() isn't passed to the Tez 
session. It is instantiated with null credentials 

session = TezClient.create("HIVE-" + sessionId, tezConfig, true,
commonLocalResources, null);

In this case, Tez fails to access resources even if the tokens are available in 
memory.


> Pass credentials in the current UGI while creating Tez session
> --
>
> Key: HIVE-12977
> URL: https://issues.apache.org/jira/browse/HIVE-12977
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vinoth Sathappan
>Assignee: Vinoth Sathappan
> Attachments: HIVE-12977.1.patch, HIVE-12977.1.patch
>
>
> The credentials present in the current UGI i.e. 
> UserGroupInformation.getCurrentUser().getCredentials() isn't passed to the 
> Tez session. It is instantiated with null credentials. 
> session = TezClient.create("HIVE-" + sessionId, tezConfig, true,
> commonLocalResources, null);
> In this case, tokens added using hive execution hooks, aren't available to 
> Tez even if they are available in memory.





[jira] [Commented] (HIVE-13310) Vectorized Projection Comparison Number Column to Scalar broken for !noNulls and selectedInUse

2016-03-18 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201223#comment-15201223
 ] 

Gopal V commented on HIVE-13310:


LGTM - +1.

> Vectorized Projection Comparison Number Column to Scalar broken for !noNulls 
> and selectedInUse
> --
>
> Key: HIVE-13310
> URL: https://issues.apache.org/jira/browse/HIVE-13310
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13310.01.patch
>
>
> LongColEqualLongScalar.java
> LongColGreaterEqualLongScalar.java
> LongColGreaterLongScalar.java
> LongColLessEqualLongScalar.java
> LongColLessLongScalar.java
> LongColNotEqualLongScalar.java
> LongScalarEqualLongColumn.java
> LongScalarGreaterEqualLongColumn.java
> LongScalarGreaterLongColumn.java
> LongScalarLessEqualLongColumn.java
> LongScalarLessLongColumn.java
> LongScalarNotEqualLongColumn.java
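For context, the !noNulls && selectedInUse case these generated classes must handle looks roughly like the following simplified sketch of one col == scalar projection (not the generated code itself): iterate only the selected row indices and propagate each row's null flag to the output.

```java
public class ColEqualScalarSketch {
    // Simplified col == scalar projection for the !noNulls && selectedInUse case
    static void evaluate(long[] vector, boolean[] isNull, int[] selected, int n,
                         long scalar, long[] out, boolean[] outIsNull) {
        for (int j = 0; j < n; j++) {
            int i = selected[j];       // visit only the selected rows
            outIsNull[i] = isNull[i];  // carry the row's null flag to the output
            if (!isNull[i]) {
                out[i] = (vector[i] == scalar) ? 1 : 0;
            }
        }
    }
}
```

The bug class here is typically an iteration that ignores either the selection vector or the per-row null flags in this combination.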





[jira] [Updated] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing

2016-03-18 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4570:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> More information to user on GetOperationStatus in Hive Server2 when query is 
> still executing
> 
>
> Key: HIVE-4570
> URL: https://issues.apache.org/jira/browse/HIVE-4570
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Amareshwari Sriramadasu
>Assignee: Rajat Khandelwal
> Fix For: 2.1.0
>
> Attachments: HIVE-4570.01.patch, HIVE-4570.01.patch, 
> HIVE-4570.02.patch, HIVE-4570.03.patch, HIVE-4570.03.patch, 
> HIVE-4570.04.patch, HIVE-4570.04.patch, HIVE-4570.06.patch, HIVE-4570.07.patch
>
>
> Currently in Hive Server2, when the query is still executing only the status 
> is set as STILL_EXECUTING. 
> This issue is to give more information to the user such as progress and 
> running job handles, if possible.





[jira] [Updated] (HIVE-13286) Query ID is being reused across queries

2016-03-18 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13286:

Attachment: HIVE-13286.4.patch

> Query ID is being reused across queries
> ---
>
> Key: HIVE-13286
> URL: https://issues.apache.org/jira/browse/HIVE-13286
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Aihua Xu
>Priority: Critical
> Attachments: HIVE-13286.1.patch, HIVE-13286.2.patch, 
> HIVE-13286.3.patch, HIVE-13286.4.patch
>
>
> [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is 
> being reused across queries. This defeats the purpose of a query id. I am not 
> sure what the purpose of the change in that jira is but it breaks the 
> assumption about a query id being unique for each query. Please take a look 
> into this at the earliest.





[jira] [Commented] (HIVE-13299) Column Names trimmed of leading and trailing spaces

2016-03-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200744#comment-15200744
 ] 

Ashutosh Chauhan commented on HIVE-13299:
-

+1 pending tests.
Can you also add {{describe formatted  space}} to the test case?

> Column Names trimmed of leading and trailing spaces
> ---
>
> Key: HIVE-13299
> URL: https://issues.apache.org/jira/browse/HIVE-13299
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13299.01.patch
>
>
> PROBLEM:
> As per the Hive Language DDL: 
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
> In Hive 0.12 and earlier, only alphanumeric and underscore characters are 
> allowed in table and column names.
> In Hive 0.13 and later, column names can contain any Unicode character (see 
> HIVE-6013). Any column name that is specified within backticks (`) is treated 
> literally.
> However, column names were trimmed:
> {code}
> ` left` resulted in `left`
> ` middle ` resulted in `middle`
> `right ` resulted in `right`
> `middle space` resulted in `middle space`
> ` middle space ` resulted in `middle space`
> {code}





[jira] [Commented] (HIVE-13311) MetaDataFormatUtils throws NPE when HiveDecimal.create is null

2016-03-18 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201737#comment-15201737
 ] 

Sergio Peña commented on HIVE-13311:


Thanks [~sircodesalot]
+1

> MetaDataFormatUtils throws NPE when HiveDecimal.create is null
> --
>
> Key: HIVE-13311
> URL: https://issues.apache.org/jira/browse/HIVE-13311
> Project: Hive
>  Issue Type: Bug
>Reporter: Reuben Kuhnert
>Assignee: Reuben Kuhnert
>Priority: Minor
> Attachments: HIVE-13311.01.patch
>
>
> The {{MetadataFormatUtils.convertToString}} functions have guards that 
> validate when {{val}} is null; however, {{HiveDecimal.create}} can also 
> return null, and an exception will be thrown when {{.toString()}} is called 
> on the result.
> {code}
>   private static String convertToString(Decimal val) {
> if (val == null) {
>   return "";
> }
> // HERE: Will throw NPE when HiveDecimal.create returns null.
> return HiveDecimal.create(new BigInteger(val.getUnscaled()), 
> val.getScale()).toString();
>   }
> {code}
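The fix implied by the snippet is to guard the return value before calling toString(). A self-contained sketch, using java.math.BigDecimal as a stand-in for HiveDecimal (the real HiveDecimal.create may return null for values it cannot represent):

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class DecimalStringSketch {
    // Stand-in for HiveDecimal.create, which may return null
    static BigDecimal create(BigInteger unscaled, int scale) {
        return unscaled == null ? null : new BigDecimal(unscaled, scale);
    }

    // Null-guarded convertToString: never calls toString() on a null result
    static String convertToString(BigInteger unscaled, int scale) {
        BigDecimal dec = create(unscaled, scale);
        return dec == null ? "" : dec.toString();
    }
}
```

Returning the same "" used for a null input keeps the method's existing contract for the new null case.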





[jira] [Commented] (HIVE-11424) Rule to transform OR clauses into IN clauses in CBO

2016-03-18 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197836#comment-15197836
 ] 

Jesus Camacho Rodriguez commented on HIVE-11424:


[~ashutoshc], [~gopalv], thanks for the feedback. I'm working on a new version 
of the patch to address those issues.

> Rule to transform OR clauses into IN clauses in CBO
> ---
>
> Key: HIVE-11424
> URL: https://issues.apache.org/jira/browse/HIVE-11424
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11424.01.patch, HIVE-11424.01.patch, 
> HIVE-11424.03.patch, HIVE-11424.03.patch, HIVE-11424.04.patch, 
> HIVE-11424.2.patch, HIVE-11424.patch
>
>
> We create a rule that will transform OR clauses into IN clauses (when 
> possible).





[jira] [Commented] (HIVE-13291) ORC BI Split strategy should consider block size instead of file size

2016-03-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200362#comment-15200362
 ] 

Hive QA commented on HIVE-13291:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793691/HIVE-13291.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9818 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7296/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7296/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7296/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793691 - PreCommit-HIVE-TRUNK-Build

> ORC BI Split strategy should consider block size instead of file size
> -
>
> Key: HIVE-13291
> URL: https://issues.apache.org/jira/browse/HIVE-13291
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13291.1.patch, HIVE-13291.2.patch, 
> HIVE-13291.3.patch
>
>
> When we force split strategy to use "BI" (using 
> hive.exec.orc.split.strategy), entire file is considered as single split. 
> This might be inefficient when the files are large. Instead, BI should 
> consider splitting at block boundary. 
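Splitting at block boundaries instead of per file can be sketched as follows (a simplified illustration of the idea, not the actual split-strategy code):

```java
public class BlockSplitSketch {
    // One split per HDFS block: split start offsets at multiples of the block size
    static long[] splitOffsets(long fileSize, long blockSize) {
        int n = (int) ((fileSize + blockSize - 1) / blockSize); // ceil division
        long[] offsets = new long[n];
        for (int i = 0; i < n; i++) {
            offsets[i] = i * blockSize;
        }
        return offsets;
    }
}
```

A 250-unit file with 100-unit blocks thus becomes three splits starting at 0, 100, and 200, rather than one split covering the whole file.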





[jira] [Updated] (HIVE-13125) Support masking and filtering of rows/columns

2016-03-18 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13125:
---
Status: Open  (was: Patch Available)

> Support masking and filtering of rows/columns
> -
>
> Key: HIVE-13125
> URL: https://issues.apache.org/jira/browse/HIVE-13125
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13125.01.patch, HIVE-13125.02.patch, 
> HIVE-13125.03.patch
>
>






[jira] [Comment Edited] (HIVE-13307) LLAP: Slider package should contain permanent functions

2016-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200930#comment-15200930
 ] 

Sergey Shelukhin edited comment on HIVE-13307 at 3/18/16 3:18 AM:
--

try {
+  hive.getMSC();
+} catch (MetaException e) {
+  throw new HiveException(e);
+}
should be unnecessary; that's the default. Also, can the download code be shared 
if it's not too much trouble?

Never mind, I see the static checker is used. Is it supposed to always be there, 
regardless of the new setting?


was (Author: sershe):
try {
+  hive.getMSC();
+} catch (MetaException e) {
+  throw new HiveException(e);
+}
should be unnecessary, that's the default. Also can the dl code be shared if 
it's not too much trouble? Is this patch incomplete? I don't see static checker 
used.

> LLAP: Slider package should contain permanent functions
> ---
>
> Key: HIVE-13307
> URL: https://issues.apache.org/jira/browse/HIVE-13307
> Project: Hive
>  Issue Type: New Feature
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: TODOC2.1
> Attachments: HIVE-13307.1.patch
>
>
> This renames a previous configuration option
> hive.llap.daemon.allow.permanent.fns -> 
> hive.llap.daemon.download.permanent.fns
> and adds a new parameter for LlapDecider
> hive.llap.allow.permanent.fns
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-13299) Column Names trimmed of leading and trailing spaces

2016-03-18 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13299:
---
Status: Patch Available  (was: Open)

> Column Names trimmed of leading and trailing spaces
> ---
>
> Key: HIVE-13299
> URL: https://issues.apache.org/jira/browse/HIVE-13299
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13299.01.patch
>
>
> PROBLEM:
> As per the Hive Language DDL: 
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
> In Hive 0.12 and earlier, only alphanumeric and underscore characters are 
> allowed in table and column names.
> In Hive 0.13 and later, column names can contain any Unicode character (see 
> HIVE-6013). Any column name that is specified within backticks (`) is treated 
> literally.
> However, column names were trimmed:
> {code}
> ` left` resulted in `left`
> ` middle ` resulted in `middle`
> `right ` resulted in `right`
> `middle space` resulted in `middle space`
> ` middle space ` resulted in `middle space`
> {code}





[jira] [Commented] (HIVE-13286) Query ID is being reused across queries

2016-03-18 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200317#comment-15200317
 ] 

Vikram Dixit K commented on HIVE-13286:
---

[~aihuaxu] I think the bug still exists here. I see that once I set a query id, 
it never changes. I think you need the following change in the Driver class as 
well:

{code}
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
b/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
index 7327a42..1fac526 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
@@ -403,13 +403,7 @@ public int compile(String command, boolean resetTaskIds) {
 }
 saveSession(queryState);

-// Generate new query id if it's not set for CLI case. If it's session 
based,
-// query id is passed in from the client or initialized when the session 
starts.
-String queryId = conf.getVar(HiveConf.ConfVars.HIVEQUERYID);
-if (queryId == null || queryId.isEmpty()) {
-  queryId = QueryPlan.makeQueryId();
-  conf.setVar(HiveConf.ConfVars.HIVEQUERYID, queryId);
-}
+conf.setVar(HiveConf.ConfVars.HIVEQUERYID, QueryPlan.makeQueryId());

 //save some info for webUI for use after plan is freed
 this.queryDisplay.setQueryStr(queryStr);

{code}

I ran a test with a lot more queries than earlier and it turned out that the 
query id did not change.
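The proposed change always regenerates the id on each compile. A minimal sketch of per-compile unique id generation (the names and id format below are hypothetical, not Hive's QueryPlan.makeQueryId implementation):

```java
import java.util.concurrent.atomic.AtomicLong;

public class QueryIdSketch {
    private static final AtomicLong SEQ = new AtomicLong();

    // Fresh id on every call: the counter makes even two compiles
    // within the same millisecond produce distinct ids
    static String makeQueryId(String user) {
        return user + "_" + System.currentTimeMillis() + "_" + SEQ.incrementAndGet();
    }
}
```

Setting the conf value unconditionally from such a generator, as the diff suggests, guarantees each compiled query gets its own id.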

> Query ID is being reused across queries
> ---
>
> Key: HIVE-13286
> URL: https://issues.apache.org/jira/browse/HIVE-13286
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Aihua Xu
>Priority: Critical
> Attachments: HIVE-13286.1.patch, HIVE-13286.2.patch
>
>
> [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is 
> being reused across queries. This defeats the purpose of a query id. I am not 
> sure what the purpose of the change in that jira is but it breaks the 
> assumption about a query id being unique for each query. Please take a look 
> into this at the earliest.





[jira] [Updated] (HIVE-13240) GroupByOperator: Drop the hash aggregates when closing operator

2016-03-18 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13240:
---
Status: Patch Available  (was: Open)

> GroupByOperator: Drop the hash aggregates when closing operator
> ---
>
> Key: HIVE-13240
> URL: https://issues.apache.org/jira/browse/HIVE-13240
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0, 1.2.1, 1.3.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13240.1.patch, HIVE-13240.2.patch
>
>
> GroupByOperator holds onto the Hash aggregates accumulated when the plan is 
> cached.
> Drop the hashAggregates in case of error during forwarding to the next 
> operator.
> Added for PTF, TopN and all GroupBy cases.





[jira] [Commented] (HIVE-13264) JDBC driver makes 2 Open Session Calls for every open session

2016-03-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199640#comment-15199640
 ] 

Hive QA commented on HIVE-13264:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793899/HIVE-13264.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7294/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7294/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7294/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7294/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at a17122f HIVE-13285: Orc concatenation may drop old files from 
moving to final path (Prasanth Jayachandran reviewed by Gopal V)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at a17122f HIVE-13285: Orc concatenation may drop old files from 
moving to final path (Prasanth Jayachandran reviewed by Gopal V)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793899 - PreCommit-HIVE-TRUNK-Build

> JDBC driver makes 2 Open Session Calls for every open session
> -
>
> Key: HIVE-13264
> URL: https://issues.apache.org/jira/browse/HIVE-13264
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1
>Reporter: NITHIN MAHESH
>Assignee: NITHIN MAHESH
>  Labels: jdbc
> Attachments: HIVE-13264.1.patch, HIVE-13264.patch
>
>
> When HTTP is used as the transport mode by the Hive JDBC driver, we noticed 
> that there is an additional open/close session just to validate the 
> connection. 
>  
> TCLIService.Iface client = new TCLIService.Client(new 
> TBinaryProtocol(transport));
>   TOpenSessionResp openResp = client.OpenSession(new TOpenSessionReq());
>   if (openResp != null) {
> client.CloseSession(new 
> TCloseSessionReq(openResp.getSessionHandle()));
>   }
>  
> The open session call is a costly one and should not be used to test 
> transport. 
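A rough sketch of the fix direction, in plain Java: the class and method names below are made up stand-ins for the real JDBC internals (the actual HiveConnection/TCLIService API differs). The idea is to let the first real OpenSession double as the connectivity check instead of issuing a throwaway open/close pair.

```java
public class LazySessionSketch {
    int openSessionCalls = 0;   // counts session round trips to the server
    boolean sessionOpen = false;

    // Hypothetical stand-in for TCLIService.Client.OpenSession().
    void openSession() {
        openSessionCalls++;
        sessionOpen = true;
    }

    // Instead of an open/close pair just to probe the transport, defer
    // session creation until a statement actually needs it.
    void ensureSession() {
        if (!sessionOpen) {
            openSession();
        }
    }

    void executeStatement(String sql) {
        ensureSession();
        // ... submit sql over the already-validated session ...
    }

    public static void main(String[] args) {
        LazySessionSketch c = new LazySessionSketch();
        c.executeStatement("SELECT 1");
        c.executeStatement("SELECT 2");
        System.out.println(c.openSessionCalls); // one session serves both statements
    }
}
```

With this shape, a failing transport still surfaces on the first statement, but no extra session is ever created just for validation.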





[jira] [Commented] (HIVE-13241) LLAP: Incremental Caching marks some small chunks as "incomplete CB"

2016-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202351#comment-15202351
 ] 

Sergey Shelukhin commented on HIVE-13241:
-

[~gopalv] [~prasanth_j] can you review? https://reviews.apache.org/r/45062/

> LLAP: Incremental Caching marks some small chunks as "incomplete CB"
> 
>
> Key: HIVE-13241
> URL: https://issues.apache.org/jira/browse/HIVE-13241
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13241.01.patch, HIVE-13241.patch
>
>
> Run #3 of a query with 1 node still has cache misses.
> {code}
> LLAP IO Summary
> --
>   VERTICES ROWGROUPS  META_HIT  META_MISS  DATA_HIT  DATA_MISS  ALLOCATION
>  USED  TOTAL_IO
> --
>  Map 111  1116  01.65GB93.61MB  0B
>0B32.72s
> --
> {code}
> {code}
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(695)) - Locking 
> 0x1c44401d(1) due to reuse
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(701)) - Adding an 
> already-uncompressed buffer 0x1c44401d(2)
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(695)) - Locking 
> 0x4e51b032(1) due to reuse
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(701)) - Adding an 
> already-uncompressed buffer 0x4e51b032(2)
> 2016-03-08T21:05:39,418 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:addOneCompressionBuffer(1161)) - Found CB at 1373931, 
> chunk length 86587, total 86590, compressed
> 2016-03-08T21:05:39,418 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:addIncompleteCompressionBuffer(1241)) - Replacing 
> data range [1373931, 1408408), size: 34474(!) type: direct (and 0 previous 
> chunks) with incomplete CB start: 1373931 end: 1408408 in the buffers
> 2016-03-08T21:05:39,418 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:createRgColumnStreamData(441)) - Getting data for 
> column 7 RG 14 stream DATA at 1460521, 319811 index position 0: compressed 
> [1626961, 1780332)
> {code}
> {code}
> 2016-03-08T21:05:38,925 INFO  
> [IO-Elevator-Thread-7[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.OrcEncodedDataReader (OrcEncodedDataReader.java:readFileData(878)) - 
> Disk ranges after disk read (file 5372745, base offset 3): [{start: 18986 
> end: 20660 cache buffer: 0x660faf7c(1)}, {start: 20660 end: 35775 cache 
> buffer: 0x1dcb1d97(1)}, {start: 318852 end: 422353 cache buffer: 
> 0x6c7f9a05(1)}, {start: 1148616 end: 1262468 cache buffer: 0x196e1d41(1)}, 
> {start: 1262468 end: 1376342 cache buffer: 0x201255f(1)}, {data range 
> [1376342, 1410766), size: 34424 type: direct}, {start: 1631359 end: 1714694 
> cache buffer: 0x47e3a72d(1)}, {start: 1714694 end: 1785770 cache buffer: 
> 0x57dca266(1)}, {start: 4975035 end: 5095215 cache buffer: 0x3e3139c9(1)}, 
> {start: 5095215 end: 5197863 cache buffer: 0x3511c88d(1)}, {start: 7448387 
> end: 7572268 cache buffer: 0x6f11dbcd(1)}, {start: 7572268 end: 7696182 cache 
> buffer: 0x5d6c9bdb(1)}, {data range [7696182, 7710537), size: 14355 type: 
> direct}, {start: 8235756 end: 8345367 cache buffer: 0x6a241ece(1)}, {start: 
> 8345367 end: 8455009 cache buffer: 0x51caf6a7(1)}, {data range [8455009, 
> 8497906), size: 42897 type: direct}, {start: 9035815 end: 9159708 cache 
> buffer: 0x306480e0(1)}, {start: 9159708 end: 9283629 cache buffer: 
> 0x9ef7774(1)}, {data range [9283629, 9297965), size: 14336 type: direct}, 
> {start: 9989884 end: 10113731 cache buffer: 0x43f7cae9(1)}, {start: 10113731 
> end: 10237589 cache buffer: 0x458e63fe(1)}, {data range [10237589, 10252034), 
> size: 14445 type: direct}, {start: 11897896 end: 12021787 cache buffer: 
> 0x51f9982f(1)}, {start: 12021787 end: 12145656 cache buffer: 0x23df01b3(1)}, 
> {da

[jira] [Commented] (HIVE-13295) Improvement to LDAP search queries in HS2 LDAP Authenticator

2016-03-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201603#comment-15201603
 ] 

Hive QA commented on HIVE-13295:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793841/HIVE-13295.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 9836 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7303/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7303/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7303/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793841 - PreCommit-HIVE-TRUNK-Build

> Improvement to LDAP search queries in HS2 LDAP Authenticator
> 
>
> Key: HIVE-13295
> URL: https://issues.apache.org/jira/browse/HIVE-13295
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.3.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-13295.1.patch
>
>
> As more use cases emerge for various LDAP flavors and deployments, Hive's 
> LDAP authentication provider needs additional configuration properties to 
> work flexibly with different LDAP deployments.
> For example:
> 1) Not every LDAP server supports a "memberOf" attribute on user entries 
> that refers to the groups the user belongs to. This attribute is used for 
> group filter support. So instead of relying on this attribute being set, we 
> can reverse the search and find all the groups that have a member attribute 
> (for example "member" or "memberUid") referring to the user.
> Since this attribute name differs from one LDAP server to another, it's best 
> we make it configurable, with a default value of "member".
> 2) In HIVE-12885, a new property was introduced to make the attribute used 
> as the user/group search key configurable instead of assuming it is "uid" 
> (when baseDN is set) or "cn" (otherwise). This change was deferred from the 
> initial patch.
> 3) LDAP groups can have various objectClass values, for example 
> objectClass=group, objectClass=groupOfNames, objectClass=posixGroup, or 
> objectClass=groupOfUniqueNames. There could be others we don't know of, so 
> we need a property to make this user-configurable with a sensible default.
> 4) There is also a bug where the lists for groupFilter and userFilter are 
> not re-initialized each time init() is called: they are only re-initialized 
> if the new HiveConf has userFilter or groupFilter values set. Otherwise, the 
> provider will reuse values from the previous initialization.
> I found this bug when writing some new tests.
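The reversed group search in (1) amounts to building a filter over group entries rather than reading the user entry. A minimal sketch of what such a filter could look like; the helper name and defaults are illustrative, not the provider's actual API:

```java
public class GroupFilterSketch {
    // Hypothetical helper: instead of reading "memberOf" from the user entry,
    // search for group entries whose member attribute (configurable, default
    // "member") contains the user's DN or id.
    static String groupSearchFilter(String groupObjectClass, String memberAttr,
                                    String userDn) {
        return "(&(objectClass=" + groupObjectClass + ")"
             + "(" + memberAttr + "=" + userDn + "))";
    }

    public static void main(String[] args) {
        // prints (&(objectClass=groupOfNames)(member=uid=jdoe,ou=People,dc=example,dc=com))
        System.out.println(groupSearchFilter("groupOfNames", "member",
                "uid=jdoe,ou=People,dc=example,dc=com"));
    }
}
```

Making both the objectClass and the member attribute parameters of this helper covers points (1) and (3) with one configurable search.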





[jira] [Updated] (HIVE-13307) LLAP: Slider package should contain permanent functions

2016-03-18 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13307:
---
Description: 
This renames a previous configuration option

hive.llap.daemon.allow.permanent.fns -> hive.llap.daemon.download.permanent.fns

and adds a new parameter for LlapDecider

hive.llap.allow.permanent.fns

NO PRECOMMIT TESTS


  was:
This renames a previous configuration option


NO PRECOMMIT TESTS



> LLAP: Slider package should contain permanent functions
> ---
>
> Key: HIVE-13307
> URL: https://issues.apache.org/jira/browse/HIVE-13307
> Project: Hive
>  Issue Type: New Feature
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: TODOC2.1
> Attachments: HIVE-13307.1.patch
>
>
> This renames a previous configuration option
> hive.llap.daemon.allow.permanent.fns -> 
> hive.llap.daemon.download.permanent.fns
> and adds a new parameter for LlapDecider
> hive.llap.allow.permanent.fns
> NO PRECOMMIT TESTS





[jira] [Resolved] (HIVE-13305) LlapInputFormat should get LLAP ports from the LLAP service registry

2016-03-18 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere resolved HIVE-13305.
---
   Resolution: Fixed
Fix Version/s: llap

committed to llap branch

> LlapInputFormat should get LLAP ports from the LLAP service registry
> 
>
> Key: HIVE-13305
> URL: https://issues.apache.org/jira/browse/HIVE-13305
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: llap
>
>
> LlapInputFormat currently gets the LLAP ports from the HiveConf. This should 
> really be querying the LLAP service registry to get the correct ports.





[jira] [Updated] (HIVE-13279) SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's file system

2016-03-18 Thread Aleksey Vovchenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Vovchenko updated HIVE-13279:
-
Status: Patch Available  (was: In Progress)

> SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's 
> file system
> --
>
> Key: HIVE-13279
> URL: https://issues.apache.org/jira/browse/HIVE-13279
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.0.0
>Reporter: Aleksey Vovchenko
>Assignee: Aleksey Vovchenko
> Fix For: 2.0.1
>
> Attachments: HIVE-13279.patch
>
>
> h2. STEP 1. Create test Tables
> Execute in command line:
> {noformat} 
> nano test.data
> {noformat} 
> Add to file:
> {noformat}
> 1,aa
> 2,aa
> 3,ff
> 4,sad
> 5,adsf
> 6,adsf
> 7,affss
> {noformat}
> {noformat}
> hadoop fs -put test.data /
> {noformat} 
> {noformat}
> hive> create table test (x int, y string, z string) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',';
> hive> create table ptest(x int, y string) partitioned by(z string); 
> hive> LOAD DATA  INPATH '/test.data' OVERWRITE INTO TABLE test;
> hive> insert overwrite table ptest partition(z=65) select * from test;
> hive> insert overwrite table ptest partition(z=67) select * from test;
> {noformat}
> h2. STEP 2. Compare lastUpdateTime
> Execute in Hive shell:
> {noformat}
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65');
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67');
> {noformat}
> lastUpdateTime should be different.
> h2. STEP 3. Put data into hdfs and compare lastUpdateTime
> Execute in command line:
> {noformat}
> hadoop fs -put test.data /user/hive/warehouse/ptest
> {noformat}
> Execute in Hive shell:
> {noformat}
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65');
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67');
> {noformat}
> lastUpdateTime values should be different, but they are the same.





[jira] [Updated] (HIVE-4662) first_value can't have more than one order by column

2016-03-18 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-4662:
--
   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Pushed to master, thanks for the review [~ashutoshc]!

> first_value can't have more than one order by column
> 
>
> Key: HIVE-4662
> URL: https://issues.apache.org/jira/browse/HIVE-4662
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 0.11.0
>Reporter: Frans Drijver
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.1.0
>
> Attachments: HIVE-4662.01.patch, HIVE-4662.01.patch, 
> HIVE-4662.01.patch, HIVE-4662.patch
>
>
> In the current implementation of the first_value function, it's not allowed 
> to have more than one (1) order by column, as so:
> {quote}
> select distinct 
> first_value(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by 
> kastr.DETRADT, kastr.DEVPDNR )
> from RTAVP_DRKASTR kastr
> ;
> {quote}
> Error given:
> {quote}
> FAILED: SemanticException Range based Window Frame can have only 1 Sort Key
> {quote}





[jira] [Commented] (HIVE-13293) Query suffers performance degradation after enabling parallel order by for Hive on Spark

2016-03-18 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198571#comment-15198571
 ] 

Rui Li commented on HIVE-13293:
---

My understanding is that to do the sampling, we need to compute the RDD, which 
can be a big overhead for complicated queries. Therefore the optimization only 
helps queries where the order by is dominant; the best use case is simply 
ordering a big table. For complicated queries, the re-computation of the RDD 
may end up hurting performance.
MR doesn't have this problem because it launches a separate job to do the 
ordering, and the data to be sampled is already on HDFS.

I think one possible solution is to break the Spark work at the parallel 
order by: just as in MR, we compute everything to be sorted, then launch a 
separate Spark job that only does the ordering. I can do a PoC to see how this 
works.
[~xuefuz] what do you think?
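A toy illustration of the proposal, in plain Java rather than Spark (the names are made up, and a counter stands in for RDD recomputation): materialize the upstream result once, then let both the sampling pass and the sort read the materialized copy, the way MR reads the intermediate data from HDFS.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SampleThenSortSketch {
    static int upstreamRuns = 0;

    // Stands in for the expensive upstream query plan feeding the order by.
    static List<Integer> upstream() {
        upstreamRuns++;
        List<Integer> rows = new ArrayList<>();
        Collections.addAll(rows, 3, 1, 2);
        return rows;
    }

    public static void main(String[] args) {
        // Materialize once; sampling and sorting both reuse the result,
        // so the expensive upstream computation runs a single time.
        List<Integer> materialized = upstream();
        List<Integer> sample = materialized.subList(0, 1); // range sampling pass
        List<Integer> sorted = new ArrayList<>(materialized);
        Collections.sort(sorted);
        System.out.println(sample.size() + " " + sorted.get(0) + " " + upstreamRuns);
    }
}
```

Without the materialization step, the sampling pass and the sort pass would each trigger `upstream()`, which is the recomputation cost described above.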

> Query suffers performance degradation after enabling parallel order by for 
> Hive on Spark
> ---
>
> Key: HIVE-13293
> URL: https://issues.apache.org/jira/browse/HIVE-13293
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.0.0
>Reporter: Lifeng Wang
>
> I use TPCx-BB to do some performance tests on the Hive on Spark engine, and 
> found that query 10 suffers performance degradation when parallel order by is 
> enabled. It seems that sampling costs much time before the real query runs.





[jira] [Updated] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-18 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13283:

Attachment: HIVE-13283.01.patch

Fixing some bug and updating some out files.

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13283.01.patch, HIVE-13283.patch
>
>






[jira] [Updated] (HIVE-13240) GroupByOperator: Drop the hash aggregates when closing operator

2016-03-18 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13240:
---
Attachment: (was: HIVE-13240.2.patch)

> GroupByOperator: Drop the hash aggregates when closing operator
> ---
>
> Key: HIVE-13240
> URL: https://issues.apache.org/jira/browse/HIVE-13240
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13240.1.patch
>
>
> GroupByOperator holds onto the Hash aggregates accumulated when the plan is 
> cached.
> Drop the hashAggregates in case of error during forwarding to the next 
> operator.
> Added for PTF, TopN and all GroupBy cases.





[jira] [Commented] (HIVE-4662) first_value can't have more than one order by column

2016-03-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197932#comment-15197932
 ] 

Ashutosh Chauhan commented on HIVE-4662:


Yeah... we can take that up in a follow-on. +1

> first_value can't have more than one order by column
> 
>
> Key: HIVE-4662
> URL: https://issues.apache.org/jira/browse/HIVE-4662
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 0.11.0
>Reporter: Frans Drijver
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-4662.01.patch, HIVE-4662.01.patch, 
> HIVE-4662.01.patch, HIVE-4662.patch
>
>
> In the current implementation of the first_value function, it's not allowed 
> to have more than one (1) order by column, as so:
> {quote}
> select distinct 
> first_value(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by 
> kastr.DETRADT, kastr.DEVPDNR )
> from RTAVP_DRKASTR kastr
> ;
> {quote}
> Error given:
> {quote}
> FAILED: SemanticException Range based Window Frame can have only 1 Sort Key
> {quote}





[jira] [Reopened] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing

2016-03-18 Thread Rajat Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajat Khandelwal reopened HIVE-4570:


Reopening, as the external handle is not getting transferred.

> More information to user on GetOperationStatus in Hive Server2 when query is 
> still executing
> 
>
> Key: HIVE-4570
> URL: https://issues.apache.org/jira/browse/HIVE-4570
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Amareshwari Sriramadasu
>Assignee: Rajat Khandelwal
> Fix For: 2.1.0
>
> Attachments: HIVE-4570.01.patch, HIVE-4570.02.patch, 
> HIVE-4570.03.patch, HIVE-4570.03.patch, HIVE-4570.04.patch, 
> HIVE-4570.04.patch, HIVE-4570.06.patch, HIVE-4570.07.patch
>
>
> Currently in HiveServer2, when the query is still executing, only the status 
> is set as STILL_EXECUTING.
> This issue is to give more information to the user such as progress and 
> running job handles, if possible.





[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats

2016-03-18 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13269:
---
Attachment: HIVE-13269.02.patch

> Simplify comparison expressions using column stats
> --
>
> Key: HIVE-13269
> URL: https://issues.apache.org/jira/browse/HIVE-13269
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, 
> HIVE-13269.patch, HIVE-13269.patch
>
>






[jira] [Updated] (HIVE-7292) Hive on Spark

2016-03-18 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-7292:
-
Issue Type: Improvement  (was: Wish)

> Hive on Spark
> -
>
> Key: HIVE-7292
> URL: https://issues.apache.org/jira/browse/HIVE-7292
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>  Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5
> Attachments: Hive-on-Spark.pdf
>
>
> Spark as an open-source data analytics cluster computing framework has gained 
> significant momentum recently. Many Hive users already have Spark installed 
> as their computing backbone. To take advantage of Hive, they still need to 
> have either MapReduce or Tez on their cluster. This initiative will provide 
> users a new alternative, so that they can consolidate their backend.
> Secondly, providing such an alternative further increases Hive's adoption, as 
> it exposes Spark users to a viable, feature-rich, de facto standard SQL tool 
> on Hadoop.
> Finally, allowing Hive to run on Spark also has performance benefits. Hive 
> queries, especially those involving multiple reducer stages, will run faster, 
> thus improving the user experience as Tez does.
> This is an umbrella JIRA which will cover many coming subtasks. A design doc 
> will be attached here shortly, and will be on the wiki as well. Feedback from 
> the community is greatly appreciated!





[jira] [Updated] (HIVE-13261) Can not compute column stats for partition when schema evolves

2016-03-18 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13261:
---
Attachment: (was: HIVE-13261.01.patch)

> Can not compute column stats for partition when schema evolves
> --
>
> Key: HIVE-13261
> URL: https://issues.apache.org/jira/browse/HIVE-13261
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13261.01.patch
>
>
> To repro
> {code}
> CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS 
> TEXTFILE;
> insert into table partitioned1 partition(part=1) values(1, 'original'),(2, 
> 'original'), (3, 'original'),(4, 'original');
> -- Table-Non-Cascade ADD COLUMNS ...
> alter table partitioned1 add columns(c int, d string);
> insert into table partitioned1 partition(part=2) values(1, 'new', 10, 
> 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, 
> 'forty');
> insert into table partitioned1 partition(part=1) values(5, 'new', 100, 
> 'hundred'),(6, 'new', 200, 'two hundred');
> analyze table partitioned1 compute statistics for columns;
> {code}
> Error msg:
> {code}
> 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - 
> NoSuchObjectException(message:Column c for which stats gathering is requested 
> doesn't exist.)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> {code}





[jira] [Commented] (HIVE-13310) Vectorized Projection Comparison Number Column to Scalar broken for !noNulls and selectedInUse

2016-03-18 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201198#comment-15201198
 ] 

Gopal V commented on HIVE-13310:


Is this related to HIVE-13220?

> Vectorized Projection Comparison Number Column to Scalar broken for !noNulls 
> and selectedInUse
> --
>
> Key: HIVE-13310
> URL: https://issues.apache.org/jira/browse/HIVE-13310
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13310.01.patch
>
>
> LongColEqualLongScalar.java
> LongColGreaterEqualLongScalar.java
> LongColGreaterLongScalar.java
> LongColLessEqualLongScalar.java
> LongColLessLongScalar.java
> LongColNotEqualLongScalar.java
> LongScalarEqualLongColumn.java
> LongScalarGreaterEqualLongColumn.java
> LongScalarGreaterLongColumn.java
> LongScalarLessEqualLongColumn.java
> LongScalarLessLongColumn.java
> LongScalarNotEqualLongColumn.java





[jira] [Updated] (HIVE-13303) spill to YARN directories, not tmp, when available

2016-03-18 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13303:

Attachment: HIVE-13303.patch

The patch. I looked around for a while for other potential uses of tmp dirs 
(such as ConfVars.LOCALSCRATCHDIR, as well as the Utils.toTempPath code that 
creates a temporary directory inside some base directory), but couldn't see 
anything obvious.
An approach similar to the one here can be used if we find more.

> spill to YARN directories, not tmp, when available
> --
>
> Key: HIVE-13303
> URL: https://issues.apache.org/jira/browse/HIVE-13303
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13303.patch
>
>
> RowContainer::setupWriter, HybridHashTableContainer::spillPartition, 
> (KeyValueContainer|ObjectContainer)::setupOutput, 
> VectorMapJoinRowBytesContainer::setupOutputFileStreams create files in tmp. 
> Maybe some other code does it too; those are the ones I see on the execution 
> path. When there are multiple YARN output directories and multiple tasks 
> running on a machine, it's better to use the YARN directories. The only 
> question is cleanup.
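A minimal sketch of the direction: YARN exports the container's local directories in the LOCAL_DIRS environment variable (comma-separated), so spill code can prefer those over java.io.tmpdir. The helper name and the hash-based spread policy below are illustrative, not the patch's actual implementation.

```java
public class SpillDirSketch {
    // Prefer the container's YARN local dirs (LOCAL_DIRS, comma-separated)
    // over the fallback tmp dir; spread spill files across dirs by task id.
    static String chooseSpillDir(String localDirs, String fallback, int taskId) {
        if (localDirs == null || localDirs.isEmpty()) {
            return fallback; // not running in a YARN container
        }
        String[] dirs = localDirs.split(",");
        return dirs[Math.floorMod(taskId, dirs.length)];
    }

    public static void main(String[] args) {
        System.out.println(chooseSpillDir(System.getenv("LOCAL_DIRS"),
                System.getProperty("java.io.tmpdir"), 7));
    }
}
```

Cleanup remains the open question noted above: files under LOCAL_DIRS are removed when the container exits, which helps, but anything spilled by long-lived daemons still needs explicit deletion.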





[jira] [Commented] (HIVE-13298) nested join support causes undecipherable errors in SemanticAnalyzer

2016-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200125#comment-15200125
 ] 

Sergey Shelukhin commented on HIVE-13298:
-

[~jcamachorodriguez] fyi also. Do you know about the join semantics (the above 
comment)? Is some change needed for CBO?

> nested join support causes undecipherable errors in SemanticAnalyzer
> 
>
> Key: HIVE-13298
> URL: https://issues.apache.org/jira/browse/HIVE-13298
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13298.patch
>
>






[jira] [Commented] (HIVE-13193) Enable compilation in parallel in a single session

2016-03-18 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202092#comment-15202092
 ] 

Aihua Xu commented on HIVE-13193:
-

+[~ashutoshc] [~thejas] [~sershe] [~szehon] [~ctang.ma] [~ychena] for further 
information.

I'm working on this task right now. First, I'm refactoring the code to add a 
QueryState which separates the query-related info from SessionState; each 
query execution will then interact with QueryState for query info and with 
SessionState for session info. The 'set' command will interact with the 
session. Query history shared across the queries in SessionState should be 
synchronized. QueryState (including queryConf, queryId, queryString, etc.) is 
visible to the query and its subtasks.

I'm wondering if you have any advice on things I should be careful with.
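A toy sketch of the separation described above; the class shape is hypothetical (not the actual Hive QueryState), just to show per-query config overlaying session config without mutating it, which is what makes concurrent compilation in one session safe.

```java
import java.util.HashMap;
import java.util.Map;

public class QueryStateSketch {
    // Session-wide settings, shared by all queries in the session.
    final Map<String, String> sessionConf;
    // Per-query overrides: visible to this query and its subtasks only.
    final Map<String, String> queryConf = new HashMap<>();
    final String queryId;

    QueryStateSketch(Map<String, String> sessionConf, String queryId) {
        this.sessionConf = sessionConf;
        this.queryId = queryId;
    }

    String get(String key) {
        String v = queryConf.get(key);
        return v != null ? v : sessionConf.get(key);
    }

    void set(String key, String value) {
        queryConf.put(key, value); // never leaks into the session
    }

    public static void main(String[] args) {
        Map<String, String> session = new HashMap<>();
        session.put("hive.exec.parallel", "false");
        QueryStateSketch q1 = new QueryStateSketch(session, "q1");
        QueryStateSketch q2 = new QueryStateSketch(session, "q2");
        q1.set("hive.exec.parallel", "true");
        System.out.println(q1.get("hive.exec.parallel")); // true
        System.out.println(q2.get("hive.exec.parallel")); // false
    }
}
```

Two queries compiling in parallel each see their own overrides plus the shared session values, while a session-level 'set' would go to sessionConf instead.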

> Enable compilation in parallel in a single session
> -
>
> Key: HIVE-13193
> URL: https://issues.apache.org/jira/browse/HIVE-13193
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> Follow up on HIVE-4239. Investigate the changes needed to support parallel 
> compilation in the same session.
> Some operation related stuff should be in OperationState rather than in 
> SessionState.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11388) Allow ACID Compactor components to run in multiple metastores

2016-03-18 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11388:
--
Status: Patch Available  (was: Open)

> Allow ACID Compactor components to run in multiple metastores
> -
>
> Key: HIVE-11388
> URL: https://issues.apache.org/jira/browse/HIVE-11388
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-11388.2.patch, HIVE-11388.4.patch, HIVE-11388.patch
>
>
> (this description is no longer accurate; see further comments)
> org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs 
> inside the metastore service to manage compactions of ACID tables.  There 
> should be exactly 1 instance of this thread (even with multiple Thrift 
> services).
> This is documented in 
> https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration
>  but not enforced.
> Should add enforcement, since more than 1 Initiator could cause concurrent 
> attempts to compact the same table/partition - which will not work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
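
The single-Initiator enforcement discussed in HIVE-11388 amounts to mutual exclusion across metastore instances. A minimal sketch of the idea, assuming a lease held in some shared store (in Hive this would be the metastore database or ZooKeeper; a plain in-memory field stands in for it here, and `CompactorLease` is a hypothetical name):

```java
// Only one metastore instance may hold the lease and run the compactor
// Initiator at a time; acquire is reentrant for the current holder.
class CompactorLease {
    private String holder = null;

    synchronized boolean tryAcquire(String instanceId) {
        if (holder == null || holder.equals(instanceId)) {
            holder = instanceId;
            return true;
        }
        return false;  // another instance already runs the Initiator
    }

    synchronized void release(String instanceId) {
        if (instanceId.equals(holder)) {
            holder = null;
        }
    }
}
```

A second metastore that fails `tryAcquire` simply skips starting its Initiator thread, which is exactly the enforcement the description asks for.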


[jira] [Updated] (HIVE-11388) Allow ACID Compactor components to run in multiple metastores

2016-03-18 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11388:
--
Attachment: HIVE-11388.4.patch

> Allow ACID Compactor components to run in multiple metastores
> -
>
> Key: HIVE-11388
> URL: https://issues.apache.org/jira/browse/HIVE-11388
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-11388.2.patch, HIVE-11388.4.patch, HIVE-11388.patch
>
>
> (this description is no longer accurate; see further comments)
> org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs 
> inside the metastore service to manage compactions of ACID tables.  There 
> should be exactly 1 instance of this thread (even with multiple Thrift 
> services).
> This is documented in 
> https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration
>  but not enforced.
> Should add enforcement, since more than 1 Initiator could cause concurrent 
> attempts to compact the same table/partition - which will not work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13313) TABLESAMPLE ROWS feature broken for Vectorization

2016-03-18 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13313:

Summary: TABLESAMPLE ROWS feature broken for Vectorization  (was: Row Limit 
Per Split feature broken for Vectorization)

> TABLESAMPLE ROWS feature broken for Vectorization
> -
>
> Key: HIVE-13313
> URL: https://issues.apache.org/jira/browse/HIVE-13313
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> For vectorization, the ROWS clause is ignored causing many rows to be 
> inserted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13313) TABLESAMPLE ROWS feature broken for Vectorization

2016-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202464#comment-15202464
 ] 

Sergey Shelukhin commented on HIVE-13313:
-

+1 pending tests

> TABLESAMPLE ROWS feature broken for Vectorization
> -
>
> Key: HIVE-13313
> URL: https://issues.apache.org/jira/browse/HIVE-13313
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13313.01.patch
>
>
> For vectorization, the ROWS clause is ignored causing many rows to be 
> returned.
> SELECT * FROM source TABLESAMPLE(10 ROWS);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-13190) Vectorization: Dummy table row-limits get multiplied 100x accidentally

2016-03-18 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-13190.
-
Resolution: Duplicate

https://issues.apache.org/jira/browse/HIVE-13313

> Vectorization: Dummy table row-limits get multiplied 100x accidentally
> --
>
> Key: HIVE-13190
> URL: https://issues.apache.org/jira/browse/HIVE-13190
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Matt McCline
>
> 100x more rows are produced from dummy tables.
> {code}
> hive> select count(1) from (select * from (Select 1 a) x order by x.a) y;
> 100
> Time taken: 0.913 seconds, Fetched: 1 row(s)
> hive> 
> {code}
> simpler example.
> {code}
> hive> create temporary table dual as select 1;
> Table default.dual stats: [numFiles=1, numRows=100, totalSize=200, 
> rawDataSize=100]
> OK
> Time taken: 1.482 seconds
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13241) LLAP: Incremental Caching marks some small chunks as "incomplete CB"

2016-03-18 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13241:

Attachment: HIVE-13241.01.patch

> LLAP: Incremental Caching marks some small chunks as "incomplete CB"
> 
>
> Key: HIVE-13241
> URL: https://issues.apache.org/jira/browse/HIVE-13241
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13241.01.patch, HIVE-13241.patch
>
>
> Run #3 of a query with 1 node still has cache misses.
> {code}
> LLAP IO Summary
> --
>   VERTICES ROWGROUPS  META_HIT  META_MISS  DATA_HIT  DATA_MISS  ALLOCATION
>  USED  TOTAL_IO
> --
>  Map 111  1116  01.65GB93.61MB  0B
>0B32.72s
> --
> {code}
> {code}
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(695)) - Locking 
> 0x1c44401d(1) due to reuse
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(701)) - Adding an 
> already-uncompressed buffer 0x1c44401d(2)
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(695)) - Locking 
> 0x4e51b032(1) due to reuse
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(701)) - Adding an 
> already-uncompressed buffer 0x4e51b032(2)
> 2016-03-08T21:05:39,418 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:addOneCompressionBuffer(1161)) - Found CB at 1373931, 
> chunk length 86587, total 86590, compressed
> 2016-03-08T21:05:39,418 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:addIncompleteCompressionBuffer(1241)) - Replacing 
> data range [1373931, 1408408), size: 34474(!) type: direct (and 0 previous 
> chunks) with incomplete CB start: 1373931 end: 1408408 in the buffers
> 2016-03-08T21:05:39,418 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:createRgColumnStreamData(441)) - Getting data for 
> column 7 RG 14 stream DATA at 1460521, 319811 index position 0: compressed 
> [1626961, 1780332)
> {code}
> {code}
> 2016-03-08T21:05:38,925 INFO  
> [IO-Elevator-Thread-7[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.OrcEncodedDataReader (OrcEncodedDataReader.java:readFileData(878)) - 
> Disk ranges after disk read (file 5372745, base offset 3): [{start: 18986 
> end: 20660 cache buffer: 0x660faf7c(1)}, {start: 20660 end: 35775 cache 
> buffer: 0x1dcb1d97(1)}, {start: 318852 end: 422353 cache buffer: 
> 0x6c7f9a05(1)}, {start: 1148616 end: 1262468 cache buffer: 0x196e1d41(1)}, 
> {start: 1262468 end: 1376342 cache buffer: 0x201255f(1)}, {data range 
> [1376342, 1410766), size: 34424 type: direct}, {start: 1631359 end: 1714694 
> cache buffer: 0x47e3a72d(1)}, {start: 1714694 end: 1785770 cache buffer: 
> 0x57dca266(1)}, {start: 4975035 end: 5095215 cache buffer: 0x3e3139c9(1)}, 
> {start: 5095215 end: 5197863 cache buffer: 0x3511c88d(1)}, {start: 7448387 
> end: 7572268 cache buffer: 0x6f11dbcd(1)}, {start: 7572268 end: 7696182 cache 
> buffer: 0x5d6c9bdb(1)}, {data range [7696182, 7710537), size: 14355 type: 
> direct}, {start: 8235756 end: 8345367 cache buffer: 0x6a241ece(1)}, {start: 
> 8345367 end: 8455009 cache buffer: 0x51caf6a7(1)}, {data range [8455009, 
> 8497906), size: 42897 type: direct}, {start: 9035815 end: 9159708 cache 
> buffer: 0x306480e0(1)}, {start: 9159708 end: 9283629 cache buffer: 
> 0x9ef7774(1)}, {data range [9283629, 9297965), size: 14336 type: direct}, 
> {start: 9989884 end: 10113731 cache buffer: 0x43f7cae9(1)}, {start: 10113731 
> end: 10237589 cache buffer: 0x458e63fe(1)}, {data range [10237589, 10252034), 
> size: 14445 type: direct}, {start: 11897896 end: 12021787 cache buffer: 
> 0x51f9982f(1)}, {start: 12021787 end: 12145656 cache buffer: 0x23df01b3(1)}, 
> {data range [12145656, 12160046), size: 14390 type: direct}, {start: 12851928 
> end: 12975795 cache 

[jira] [Updated] (HIVE-13298) nested join support causes undecipherable errors in SemanticAnalyzer

2016-03-18 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13298:

Attachment: HIVE-13298.01.patch

Replacing the switch with a TODO

> nested join support causes undecipherable errors in SemanticAnalyzer
> 
>
> Key: HIVE-13298
> URL: https://issues.apache.org/jira/browse/HIVE-13298
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13298.01.patch, HIVE-13298.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12976) MetaStoreDirectSql doesn't batch IN lists in all cases

2016-03-18 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202459#comment-15202459
 ] 

Sushanth Sowmyan commented on HIVE-12976:
-

1.3 is not a released version, so for branch-1, it is open for backports.

If your question was for branch-1.2 targeting 1.2.2, then yes, this is 
approved for backport as well.

> MetaStoreDirectSql doesn't batch IN lists in all cases
> --
>
> Key: HIVE-12976
> URL: https://issues.apache.org/jira/browse/HIVE-12976
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Fix For: 2.1.0
>
> Attachments: HIVE-12976.01.patch, HIVE-12976.02.patch, 
> HIVE-12976.patch
>
>
> That means that some RDBMS products with arbitrary limits cannot run these 
> queries. I hope HBase metastore comes soon and delivers us from Oracle! For 
> now, though, we have to fix this.
> {code}
> hive> select * from lineitem where l_orderkey = 121201;
> Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The incoming 
> request has too many parameters. The server supports a maximum of 2100 
> parameters. Reduce the number of parameters and resend the request.
> at 
> com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:215)
>  ~[sqljdbc41.jar:?]
> at 
> com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1635)
>  ~[sqljdbc41.jar:?]
> at 
> com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatement(SQLServerPreparedStatement.java:426)
>  ~[sqljdbc41.jar:?]
> {code}
> {code}
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  ~[datanucleus-api-jdo-4.2.1.jar:?]
> at 
> org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:388) 
> ~[datanucleus-api-jdo-4.2.1.jar:?]
> at 
> org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:264) 
> ~[datanucleus-api-jdo-4.2.1.jar:?]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:1681)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.partsFoundForPartitions(MetaStoreDirectSql.java:1266)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1196)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:6742)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:6738)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2525)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:6738)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
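
The fix discussed in HIVE-12976 boils down to splitting a long IN list into batches, each below the backend's parameter limit (SQL Server's is 2100), and issuing one query per batch. A sketch of that batching, with illustrative names (`InListBatcher` is not the actual Hive class):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InListBatcher {
    // Split ids into chunks of at most maxParams, one chunk per SQL query.
    static List<List<Long>> batch(List<Long> ids, int maxParams) {
        List<List<Long>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += maxParams) {
            batches.add(ids.subList(i, Math.min(i + maxParams, ids.size())));
        }
        return batches;
    }

    // Build the parameterized IN clause for one batch.
    static String toSql(String column, List<Long> batch) {
        StringBuilder sb = new StringBuilder(column).append(" IN (");
        for (int i = 0; i < batch.size(); i++) {
            sb.append(i == 0 ? "?" : ",?");
        }
        return sb.append(")").toString();
    }
}
```

The caller runs one query per batch and merges the results, so no single statement ever carries more bind parameters than the RDBMS allows.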


[jira] [Commented] (HIVE-13269) Simplify comparison expressions using column stats

2016-03-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202535#comment-15202535
 ] 

Hive QA commented on HIVE-13269:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793945/HIVE-13269.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9836 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7308/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7308/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7308/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793945 - PreCommit-HIVE-TRUNK-Build

> Simplify comparison expressions using column stats
> --
>
> Key: HIVE-13269
> URL: https://issues.apache.org/jira/browse/HIVE-13269
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, 
> HIVE-13269.patch, HIVE-13269.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13183) More logs in operation logs

2016-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202448#comment-15202448
 ] 

Sergey Shelukhin commented on HIVE-13183:
-

This broke tez_join_hash test (and auto_sortmerge_join_8 too, I suspect). Can 
you please fix or revert?

> More logs in operation logs
> ---
>
> Key: HIVE-13183
> URL: https://issues.apache.org/jira/browse/HIVE-13183
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 2.1.0
>
> Attachments: HIVE-13183.02.patch, HIVE-13183.03.patch, 
> HIVE-13183.04.patch, HIVE-13183.05.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13298) nested join support causes undecipherable errors in SemanticAnalyzer

2016-03-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202414#comment-15202414
 ] 

Hive QA commented on HIVE-13298:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793906/HIVE-13298.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 9821 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7307/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7307/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7307/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793906 - PreCommit-HIVE-TRUNK-Build

> nested join support causes undecipherable errors in SemanticAnalyzer
> 
>
> Key: HIVE-13298
> URL: https://issues.apache.org/jira/browse/HIVE-13298
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13298.01.patch, HIVE-13298.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13313) TABLESAMPLE ROWS feature broken for Vectorization

2016-03-18 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13313:

Description: 
For vectorization, the ROWS clause is ignored causing many rows to be returned.

SELECT * FROM source TABLESAMPLE(10 ROWS);


  was:
For vectorization, the ROWS clause is ignored causing many rows to be inserted.




> TABLESAMPLE ROWS feature broken for Vectorization
> -
>
> Key: HIVE-13313
> URL: https://issues.apache.org/jira/browse/HIVE-13313
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> For vectorization, the ROWS clause is ignored causing many rows to be 
> returned.
> SELECT * FROM source TABLESAMPLE(10 ROWS);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11424) Rule to transform OR clauses into IN clauses in CBO

2016-03-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197813#comment-15197813
 ] 

Ashutosh Chauhan commented on HIVE-11424:
-

It seems partition pruning and the partition condition remover don't handle IN 
clauses very well, which is getting exposed by this change.

> Rule to transform OR clauses into IN clauses in CBO
> ---
>
> Key: HIVE-11424
> URL: https://issues.apache.org/jira/browse/HIVE-11424
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11424.01.patch, HIVE-11424.01.patch, 
> HIVE-11424.03.patch, HIVE-11424.03.patch, HIVE-11424.04.patch, 
> HIVE-11424.2.patch, HIVE-11424.patch
>
>
> We create a rule that will transform OR clauses into IN clauses (when 
> possible).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11424) Rule to transform OR clauses into IN clauses in CBO

2016-03-18 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197407#comment-15197407
 ] 

Gopal V commented on HIVE-11424:


Thanks [~jcamachorodriguez] for the patch. This patch allows us to fold 
transitive inferences and fix issues.

{code}
sold_date between '2014-01-01'  and '2014-02-01' and 
store_sk IN (1,2,3) and 
(sold_date >= '2014-01-01' and store_sk=1) or (sold_date >= '2014-01-01' and 
store_sk=2)
{code}

gets neatly folded by the partition condition remover to 

{code}
true and 
store_sk IN (1,2,3) and 
(true and store_sk=1) or (true and store_sk=2)
{code}

This is currently left in place as 

{code}
store_sk IN (1,2,3) and (store_sk=1 or store_sk=2)
{code}

We need to fold that into 

{code}
store_sk IN (1,2)
{code}

to get our stats right; otherwise map-join conversions can get turned on because 
the filter is artificially applied twice to the statistics.

This can be done via the DNF expansion of IN & then refolding it back, but it 
is much slower to do that than a set intersection (or union for OR) for 
unordered IN() expressions.

> Rule to transform OR clauses into IN clauses in CBO
> ---
>
> Key: HIVE-11424
> URL: https://issues.apache.org/jira/browse/HIVE-11424
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11424.01.patch, HIVE-11424.01.patch, 
> HIVE-11424.03.patch, HIVE-11424.03.patch, HIVE-11424.04.patch, 
> HIVE-11424.2.patch, HIVE-11424.patch
>
>
> We create a rule that will transform OR clauses into IN clauses (when 
> possible).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
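
The set-based folding Gopal suggests above (intersection for AND, union for OR over unordered IN lists) can be sketched directly; this is an illustration of the idea, not the Calcite rule itself:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class InFold {
    // a IN (S1) AND a IN (S2)  ==>  a IN (S1 ∩ S2)
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = new HashSet<>(a);
        out.retainAll(b);  // intersection: value must satisfy both IN lists
        return out;
    }

    // a IN (S1) OR a IN (S2)  ==>  a IN (S1 ∪ S2)
    static Set<Integer> or(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = new HashSet<>(a);
        out.addAll(b);     // union: value may satisfy either IN list
        return out;
    }
}
```

For the example in the comment, folding `(store_sk=1 or store_sk=2)` into `IN (1,2)` and intersecting with `IN (1,2,3)` yields `IN (1,2)`, without the cost of a DNF expansion and refold.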


[jira] [Commented] (HIVE-13298) nested join support causes undecipherable errors in SemanticAnalyzer

2016-03-18 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201381#comment-15201381
 ] 

Jesus Camacho Rodriguez commented on HIVE-13298:


[~sershe], thanks for pinging me. I am not sure I properly understood the problem 
you are facing; is it the order of the columns for full outer join, or for every 
kind of join? I got confused by the inner join semantics comment...

I think SQL standard specifies that column order for * should be the order of 
the columns in the table. For the query given, we create a cartesian product of 
tables {{a,b}}, and then we filter them. Thus, I would assume it is correct to 
set the order of columns to: _columns of a + columns of b_ in that case. Is that 
what you meant by "switching the join sides"?

> nested join support causes undecipherable errors in SemanticAnalyzer
> 
>
> Key: HIVE-13298
> URL: https://issues.apache.org/jira/browse/HIVE-13298
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13298.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13298) nested join support causes undecipherable errors in SemanticAnalyzer

2016-03-18 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13298:

Status: Patch Available  (was: Open)

> nested join support causes undecipherable errors in SemanticAnalyzer
> 
>
> Key: HIVE-13298
> URL: https://issues.apache.org/jira/browse/HIVE-13298
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13298.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13300) Hive on spark throws exception for multi-insert

2016-03-18 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-13300:
-
Assignee: (was: Xuefu Zhang)

> Hive on spark throws exception for multi-insert
> ---
>
> Key: HIVE-13300
> URL: https://issues.apache.org/jira/browse/HIVE-13300
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.0.0
>Reporter: Szehon Ho
>
> For certain multi-insert queries, Hive on Spark throws a deserialization 
> error.
> {noformat}
> create table status_updates(userid int,status string,ds string);
> create table profiles(userid int,school string,gender int);
> drop table school_summary; create table school_summary(school string,cnt int) 
> partitioned by (ds string);
> drop table gender_summary; create table gender_summary(gender int,cnt int) 
> partitioned by (ds string);
> insert into status_updates values (1, "status_1", "2016-03-16");
> insert into profiles values (1, "school_1", 0);
> set hive.auto.convert.join=false;
> set hive.execution.engine=spark;
> FROM (SELECT a.status, b.school, b.gender
> FROM status_updates a JOIN profiles b
> ON (a.userid = b.userid and
> a.ds='2009-03-20' )
> ) subq1
> INSERT OVERWRITE TABLE gender_summary
> PARTITION(ds='2009-03-20')
> SELECT subq1.gender, COUNT(1) GROUP BY subq1.gender
> INSERT OVERWRITE TABLE school_summary
> PARTITION(ds='2009-03-20')
> SELECT subq1.school, COUNT(1) GROUP BY subq1.school
> {noformat}
> Error:
> {noformat}
> 16/03/17 13:29:00 [task-result-getter-3]: WARN scheduler.TaskSetManager: Lost 
> task 0.0 in stage 2.0 (TID 3, localhost): java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable 
> to deserialize reduce input key from x1x128x0x0 with properties 
> {serialization.sort.order.null=a, columns=reducesinkkey0, 
> serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe,
>  serialization.sort.order=+, columns.types=int}
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:279)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error: Unable to deserialize reduce input key from x1x128x0x0 with properties 
> {serialization.sort.order.null=a, columns=reducesinkkey0, 
> serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe,
>  serialization.sort.order=+, columns.types=int}
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:251)
>   ... 12 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:241)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:249)
>   ... 12 more
> Caused by: java.io.EOFException
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserializeInt(BinarySortableSerDe.java:597)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:288)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:237)
>   ... 13 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13261) Can not compute column stats for partition when schema evolves

2016-03-18 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13261:
---
Attachment: HIVE-13261.01.patch

> Can not compute column stats for partition when schema evolves
> --
>
> Key: HIVE-13261
> URL: https://issues.apache.org/jira/browse/HIVE-13261
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13261.01.patch
>
>
> To repro
> {code}
> CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS 
> TEXTFILE;
> insert into table partitioned1 partition(part=1) values(1, 'original'),(2, 
> 'original'), (3, 'original'),(4, 'original');
> -- Table-Non-Cascade ADD COLUMNS ...
> alter table partitioned1 add columns(c int, d string);
> insert into table partitioned1 partition(part=2) values(1, 'new', 10, 
> 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, 
> 'forty');
> insert into table partitioned1 partition(part=1) values(5, 'new', 100, 
> 'hundred'),(6, 'new', 200, 'two hundred');
> analyze table partitioned1 compute statistics for columns;
> {code}
> Error msg:
> {code}
> 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - 
> NoSuchObjectException(message:Column c for which stats gathering is requested 
> doesn't exist.)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> {code}
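A minimal sketch of the kind of guard that avoids this failure (not the actual metastore fix; the class and method names here are invented for illustration): when the table schema has evolved with non-cascading ADD COLUMNS, stats can only be written per-partition for columns that actually exist in that partition's schema, so the requested column list is filtered first.

```java
import java.util.List;
import java.util.stream.Collectors;

public class StatsColumnFilter {
    // Keep only the requested stats columns that exist in this
    // partition's (possibly older) schema, so the metastore never
    // tries to write stats for a column the partition doesn't have.
    static List<String> statsColumns(List<String> requested, List<String> partitionCols) {
        return requested.stream()
                .filter(partitionCols::contains)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Table schema after ADD COLUMNS is (a, b, c, d), but partition
        // part=1 was created before the evolution and only has (a, b).
        System.out.println(statsColumns(
                List.of("a", "b", "c", "d"),
                List.of("a", "b")));
        // [a, b]
    }
}
```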



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13314) Hive on spark mapjoin errors if spark.master is not set

2016-03-18 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-13314:
-
Description: 
There are some errors that happen if spark.master is not set.

This is despite the code defaulting to yarn-cluster if spark.master is not set 
by user or on the config files: 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java#L51]

The funny thing is that while it works the first time thanks to this default, 
subsequent tries fail because the hiveConf is refreshed without that default 
being re-applied.

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java#L180]

The exception is as follows:
{noformat}
Job aborted due to stage failure: Task 40 in stage 1.0 failed 4 times, most 
recent failure: Lost task 40.3 in stage 1.0 (TID 22, d2409.halxg.cloudera.com): 
java.lang.RuntimeException: Error processing row: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:154)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:48)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
at 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
at 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
at 
org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at 
org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:117)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:197)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:223)
at 
org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1051)
at 
org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
at 
org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
at 
org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:490)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
... 16 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.isDedicatedCluster(SparkUtilities.java:108)
at 
org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:124)
at 
org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:114)
... 24 more

Driver stacktrace:
{noformat}
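A minimal sketch of the idempotent-defaulting idea (not Hive's actual code; the class and method names are made up for illustration): if the yarn-cluster default is re-applied every time the configuration is built, a refresh of hiveConf cannot silently drop it, which is the failure mode described above.

```java
import java.util.HashMap;
import java.util.Map;

public class SparkMasterDefault {
    static final String SPARK_MASTER = "spark.master";
    static final String DEFAULT_MASTER = "yarn-cluster";

    // Re-apply the default on every (re)load of the config, so a
    // refreshed conf without spark.master still gets yarn-cluster,
    // while a user-supplied value is left untouched.
    static Map<String, String> withDefaultMaster(Map<String, String> conf) {
        Map<String, String> out = new HashMap<>(conf);
        out.putIfAbsent(SPARK_MASTER, DEFAULT_MASTER);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> refreshed = new HashMap<>(); // default was lost on refresh
        System.out.println(withDefaultMaster(refreshed).get(SPARK_MASTER));
        // yarn-cluster
    }
}
```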

  was:
There are some errors that happen if spark.master is not set.

This is despite the code defaulting to yarn-cluster if spark.master is not set 
by user or on the config files: 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java#L51]

The funny thing is that while it works the first time due to this default, 
subsequent tries will fail as the hiveConf is refreshed without that default 
being set.

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java#L180]

Exception is follows:
{noformat}
Job aborted due to stage failure: Task 40 in stage 1.0 failed 4 times, most 
recent failure: Lost task 40.3 in stage 1.0 (TID 22, d2409.halxg.cloudera.com): 
java.lang.RuntimeException: Error processing row: 
org.apache.hadoop.hiv

[jira] [Updated] (HIVE-13313) TABLESAMPLE ROWS feature broken for Vectorization

2016-03-18 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13313:

Attachment: HIVE-13313.01.patch

> TABLESAMPLE ROWS feature broken for Vectorization
> -
>
> Key: HIVE-13313
> URL: https://issues.apache.org/jira/browse/HIVE-13313
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13313.01.patch
>
>
> For vectorization, the ROWS clause is ignored, causing many more rows than 
> requested to be returned.
> SELECT * FROM source TABLESAMPLE(10 ROWS);
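A minimal sketch of what honoring the limit requires in vectorized execution (not Hive's VectorizedRowBatch API; the class and method names are invented for illustration): because rows arrive in batches, the row budget must be enforced per batch, truncating the last batch and emitting nothing afterwards.

```java
import java.util.ArrayList;
import java.util.List;

public class VectorRowLimit {
    // For each incoming batch size, emit at most the remaining sample
    // budget; stop once the budget is exhausted. Returns the number of
    // rows emitted per batch.
    static List<Integer> emittedPerBatch(List<Integer> batchSizes, int rowLimit) {
        List<Integer> out = new ArrayList<>();
        int remaining = rowLimit;
        for (int size : batchSizes) {
            int emit = Math.min(size, remaining);
            out.add(emit);
            remaining -= emit;
            if (remaining == 0) {
                break; // budget spent: later batches emit nothing
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // TABLESAMPLE(10 ROWS) over 1024-row batches: only the first
        // batch is (truncated and) emitted.
        System.out.println(emittedPerBatch(List.of(1024, 1024), 10));
        // [10]
    }
}
```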





[jira] [Reopened] (HIVE-13183) More logs in operation logs

2016-03-18 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu reopened HIVE-13183:


Reverted the commit.

Thanks [~sershe] for noting the broken test. I thought the failure was unrelated.

> More logs in operation logs
> ---
>
> Key: HIVE-13183
> URL: https://issues.apache.org/jira/browse/HIVE-13183
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 2.1.0
>
> Attachments: HIVE-13183.02.patch, HIVE-13183.03.patch, 
> HIVE-13183.04.patch, HIVE-13183.05.patch
>
>






[jira] [Updated] (HIVE-13115) MetaStore Direct SQL getPartitions call fail when the columns schemas for a partition are null

2016-03-18 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-13115:
---
Status: Patch Available  (was: Open)

Attaching patch with unit test.
The patch makes the DirectSql code consistent with the ORM layer.
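A minimal sketch of the consistency idea (not the actual MetaStoreDirectSql code; the class and method names are invented for illustration): the ORM layer tolerates a partition whose column descriptor is absent, so the direct SQL path should treat a null CD_ID as "no column schema" rather than throwing the "Unexpected null for one of the IDs" MetaException, while still rejecting a missing SD_ID, which is genuinely required for a non-view.

```java
public class PartitionRowCheck {
    // ORM-style handling of a direct-SQL row: a null column-descriptor
    // id is legitimate (the partition simply has no column schema), but
    // a null storage-descriptor id is a real inconsistency.
    static Long resolveCdId(Long sdId, Long cdId) {
        if (sdId == null) {
            throw new IllegalStateException(
                "Unexpected null SD_ID for a non-view partition");
        }
        return cdId; // may be null: caller treats it as "no column schema"
    }

    public static void main(String[] args) {
        // SD 6437 exists, but its CD_ID is null: no exception, null result.
        System.out.println(resolveCdId(6437L, null));
        // null
    }
}
```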

> MetaStore Direct SQL getPartitions call fail when the columns schemas for a 
> partition are null
> --
>
> Key: HIVE-13115
> URL: https://issues.apache.org/jira/browse/HIVE-13115
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Ratandeep Ratti
>Assignee: Ratandeep Ratti
> Attachments: HIVE-13115.patch, HIVE-13115.reproduce.issue.patch
>
>
> We are seeing the following exception in our MetaStore logs
> {noformat}
> 2016-02-11 00:00:19,002 DEBUG metastore.MetaStoreDirectSql 
> (MetaStoreDirectSql.java:timingTrace(602)) - Direct SQL query in 5.842372ms + 
> 1.066728ms, the query is [select "PARTITIONS"."PART_ID" from "PARTITIONS"  
> inner join "TBLS" on "PART
> ITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ?   inner join 
> "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"  and "DBS"."NAME" = ?  order by 
> "PART_NAME" asc]
> 2016-02-11 00:00:19,021 ERROR metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(2243)) - Direct SQL failed, falling 
> back to ORM
> MetaException(message:Unexpected null for one of the IDs, SD 6437, column 
> null, serde 6437 for a non- view)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:360)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitions(MetaStoreDirectSql.java:224)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:1563)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:1559)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2208)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1570)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1553)
> at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
> at com.sun.proxy.$Proxy5.getPartitions(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:2526)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:8747)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:8731)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:617)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:613)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1591)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:613)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This direct SQL call fails for every {{getPartitions}} call and then falls 
> back to ORM.
> The query which fails is
> {code}
> select 
>   PARTITIONS.PART_ID, SDS.SD_ID, SDS.CD_ID,
>   SERDES.SERDE_ID, PARTITIONS.CREATE_TIME,
>   PARTITIONS.LAST_ACCESS_TIME, SDS.INPUT_FORMAT, SDS.IS_COMPRESSED,
>   SDS.IS_STOREDASSUBDIRECTORIES, SDS.LOCATION, SDS.NUM_BUCKETS,
>   SDS.OUTPUT_FORMAT, SERDES.NAME, SERDES.SLIB 
> from PARTITIONS
>   left outer join SDS on PARTITIONS.SD_ID = SDS.SD_ID 
>   left outer join SERDES on SDS.SERDE_ID = SERDES.SERDE_ID 
>   where PART_ID in (  ?  ) order by PART_NAME asc

[jira] [Commented] (HIVE-13313) TABLESAMPLE ROWS feature broken for Vectorization

2016-03-18 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202426#comment-15202426
 ] 

Matt McCline commented on HIVE-13313:
-

[~sershe] Can you give this a quick +1? (It solves HIVE-13190, too.)  Thanks!

> TABLESAMPLE ROWS feature broken for Vectorization
> -
>
> Key: HIVE-13313
> URL: https://issues.apache.org/jira/browse/HIVE-13313
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13313.01.patch
>
>
> For vectorization, the ROWS clause is ignored, causing many more rows than 
> requested to be returned.
> SELECT * FROM source TABLESAMPLE(10 ROWS);





[jira] [Updated] (HIVE-13267) Vectorization: Add SelectLikeStringColScalar for non-filter operations

2016-03-18 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13267:
---
Attachment: (was: HIVE-13267.1.patch)

> Vectorization: Add SelectLikeStringColScalar for non-filter operations
> --
>
> Key: HIVE-13267
> URL: https://issues.apache.org/jira/browse/HIVE-13267
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13267.1.patch
>
>
> FilterStringColLikeStringScalar only applies to the values within filter 
> clauses.
> Borrow the Checker impls and extend them to value generation - for cases like
> select col is null or col like '%(null)%' 
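A minimal sketch of the filter-versus-projection distinction (not Hive's vectorized expression API; the class and method names are invented for illustration): a filter expression like FilterStringColLikeStringScalar drops non-matching rows, whereas a Select-style expression must instead produce a boolean output value for every row, so the result can feed an OR such as {{col is null or col like '%(null)%'}}.

```java
public class LikeProjection {
    // Value-producing LIKE: emits one boolean per input row instead of
    // filtering rows out. The LIKE pattern is assumed to have already
    // been translated to a Java regex (e.g. '%(null)%' -> ".*\\(null\\).*").
    static boolean[] projectLike(String[] col, String regexFromLikePattern) {
        boolean[] out = new boolean[col.length];
        for (int i = 0; i < col.length; i++) {
            out[i] = col[i] != null && col[i].matches(regexFromLikePattern);
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] r = projectLike(
                new String[]{"x(null)y", "abc", null},
                ".*\\(null\\).*");
        System.out.println(java.util.Arrays.toString(r));
        // [true, false, false]
    }
}
```

A null input produces false here for simplicity; in SQL semantics the result would be NULL, which is why the example pairs it with an explicit {{col is null}} branch.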





[jira] [Updated] (HIVE-13115) MetaStore Direct SQL getPartitions call fail when the columns schemas for a partition are null

2016-03-18 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-13115:
---
Attachment: HIVE-13115.patch

> MetaStore Direct SQL getPartitions call fail when the columns schemas for a 
> partition are null
> --
>
> Key: HIVE-13115
> URL: https://issues.apache.org/jira/browse/HIVE-13115
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Ratandeep Ratti
>Assignee: Ratandeep Ratti
> Attachments: HIVE-13115.patch, HIVE-13115.reproduce.issue.patch
>
>
> We are seeing the following exception in our MetaStore logs
> {noformat}
> 2016-02-11 00:00:19,002 DEBUG metastore.MetaStoreDirectSql 
> (MetaStoreDirectSql.java:timingTrace(602)) - Direct SQL query in 5.842372ms + 
> 1.066728ms, the query is [select "PARTITIONS"."PART_ID" from "PARTITIONS"  
> inner join "TBLS" on "PART
> ITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ?   inner join 
> "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"  and "DBS"."NAME" = ?  order by 
> "PART_NAME" asc]
> 2016-02-11 00:00:19,021 ERROR metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(2243)) - Direct SQL failed, falling 
> back to ORM
> MetaException(message:Unexpected null for one of the IDs, SD 6437, column 
> null, serde 6437 for a non- view)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:360)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitions(MetaStoreDirectSql.java:224)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:1563)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:1559)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2208)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1570)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1553)
> at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
> at com.sun.proxy.$Proxy5.getPartitions(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:2526)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:8747)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:8731)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:617)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:613)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1591)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:613)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This direct SQL call fails for every {{getPartitions}} call and then falls 
> back to ORM.
> The query which fails is
> {code}
> select 
>   PARTITIONS.PART_ID, SDS.SD_ID, SDS.CD_ID,
>   SERDES.SERDE_ID, PARTITIONS.CREATE_TIME,
>   PARTITIONS.LAST_ACCESS_TIME, SDS.INPUT_FORMAT, SDS.IS_COMPRESSED,
>   SDS.IS_STOREDASSUBDIRECTORIES, SDS.LOCATION, SDS.NUM_BUCKETS,
>   SDS.OUTPUT_FORMAT, SERDES.NAME, SERDES.SLIB 
> from PARTITIONS
>   left outer join SDS on PARTITIONS.SD_ID = SDS.SD_ID 
>   left outer join SERDES on SDS.SERDE_ID = SERDES.SERDE_ID 
>   where PART_ID in (  ?  ) order by PART_NAME asc;
> {code}
> By looking at the source {{MetaStoreDirectSql.java}}, the third column in the 
> query ( SDS.C

[jira] [Updated] (HIVE-13115) MetaStore Direct SQL getPartitions call fail when the columns schemas for a partition are null

2016-03-18 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-13115:
---
Labels: DirectSql MetaStore ORM  (was: )

> MetaStore Direct SQL getPartitions call fail when the columns schemas for a 
> partition are null
> --
>
> Key: HIVE-13115
> URL: https://issues.apache.org/jira/browse/HIVE-13115
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Ratandeep Ratti
>Assignee: Ratandeep Ratti
>  Labels: DirectSql, MetaStore, ORM
> Attachments: HIVE-13115.patch, HIVE-13115.reproduce.issue.patch
>
>
> We are seeing the following exception in our MetaStore logs
> {noformat}
> 2016-02-11 00:00:19,002 DEBUG metastore.MetaStoreDirectSql 
> (MetaStoreDirectSql.java:timingTrace(602)) - Direct SQL query in 5.842372ms + 
> 1.066728ms, the query is [select "PARTITIONS"."PART_ID" from "PARTITIONS"  
> inner join "TBLS" on "PART
> ITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ?   inner join 
> "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"  and "DBS"."NAME" = ?  order by 
> "PART_NAME" asc]
> 2016-02-11 00:00:19,021 ERROR metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(2243)) - Direct SQL failed, falling 
> back to ORM
> MetaException(message:Unexpected null for one of the IDs, SD 6437, column 
> null, serde 6437 for a non- view)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:360)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitions(MetaStoreDirectSql.java:224)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:1563)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:1559)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2208)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1570)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1553)
> at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
> at com.sun.proxy.$Proxy5.getPartitions(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:2526)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:8747)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:8731)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:617)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:613)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1591)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:613)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This direct SQL call fails for every {{getPartitions}} call and then falls 
> back to ORM.
> The query which fails is
> {code}
> select 
>   PARTITIONS.PART_ID, SDS.SD_ID, SDS.CD_ID,
>   SERDES.SERDE_ID, PARTITIONS.CREATE_TIME,
>   PARTITIONS.LAST_ACCESS_TIME, SDS.INPUT_FORMAT, SDS.IS_COMPRESSED,
>   SDS.IS_STOREDASSUBDIRECTORIES, SDS.LOCATION, SDS.NUM_BUCKETS,
>   SDS.OUTPUT_FORMAT, SERDES.NAME, SERDES.SLIB 
> from PARTITIONS
>   left outer join SDS on PARTITIONS.SD_ID = SDS.SD_ID 
>   left outer join SERDES on SDS.SERDE_ID = SERDES.SERDE_ID 
>   where PART_ID in (  ?  ) order by PART_NAME asc;
> {code}
> By looking at the source {{MetaSt