[jira] [Commented] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable

2016-08-15 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421700#comment-15421700
 ] 

Thomas Friedrich commented on HIVE-14533:
-

I checked the test failures and they are not related to the patch. The 
Tez-related tests fail in other pre-commit builds as well, and I ran 
TestJdbcWithMiniHS2.testAddJarConstructorUnCaching successfully locally.

> improve performance of enforceMaxLength in 
> HiveCharWritable/HiveVarcharWritable
> ---
>
> Key: HIVE-14533
> URL: https://issues.apache.org/jira/browse/HIVE-14533
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
>  Labels: performance
> Attachments: HIVE-14533.patch
>
>
> The enforceMaxLength method in HiveVarcharWritable calls 
> set(getHiveVarchar(), maxLength); and in HiveCharWritable set(getHiveChar(), 
> maxLength); no matter how long the string is. The calls to getHiveVarchar() 
> and getHiveChar() decode the string every time the method is called 
> (Text.toString() calls Text.decode). This can be very expensive and is 
> unnecessary if the string is shorter than maxLength for HiveVarcharWritable 
> or different than maxLength for HiveCharWritable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable

2016-08-12 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-14533:

Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable

2016-08-12 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-14533:

Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable

2016-08-12 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-14533:

Labels: performance  (was: )
Status: Patch Available  (was: Open)



[jira] [Commented] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable

2016-08-12 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419657#comment-15419657
 ] 

Thomas Friedrich commented on HIVE-14533:
-

The patch adds a check to enforceMaxLength so that the maxLength is only 
enforced when the string is actually longer than maxLength. The check can be 
done without decoding the string, which avoids the unnecessary decode of every 
value.

HiveVarcharWritable: if (value.getLength()>maxLength && 
getCharacterLength()>maxLength)
- value.getLength() is the number of bytes in the string
- maxLength is the maximum number of characters
For single-byte characters, the number of bytes equals the number of 
characters; for multi-byte characters, the number of characters is less than 
the number of bytes. So if the byte count is not greater than maxLength, the 
string has at most maxLength characters and no truncation is needed. Only when 
the byte count exceeds maxLength do we need to compare the character count with 
maxLength. We could compare getCharacterLength()>maxLength unconditionally, but 
getCharacterLength calls getTextUtfLength, which is more expensive than simply 
comparing the byte length with maxLength.

HiveCharWritable: if (getCharacterLength()!=maxLength)
For char values we can only compare the number of characters with maxLength; if 
they differ, set must be called to enforce the correct length. This ensures the 
value is padded when the string is too short and truncated when it is too long. 
Comparing the byte count (value.getLength()) with maxLength instead could fail 
to enforce the maxLength when multi-byte characters are involved.
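The byte-length shortcut above can be illustrated with plain Java strings. This is a standalone sketch using UTF-8 encoding, not the actual HiveVarcharWritable code; needsTruncation is a hypothetical helper that mirrors the patched condition:

```java
import java.nio.charset.StandardCharsets;

public class MaxLengthCheck {
    // The UTF-8 byte count is always >= the code-point count, so a small byte
    // count proves the string is already short enough: no decode is needed.
    static boolean needsTruncation(String s, int maxLength) {
        int byteLen = s.getBytes(StandardCharsets.UTF_8).length; // cheap check
        if (byteLen <= maxLength) {
            return false;                // fast path: cannot exceed maxLength
        }
        int charLen = s.codePointCount(0, s.length());           // expensive
        return charLen > maxLength;
    }

    public static void main(String[] args) {
        System.out.println(needsTruncation("abcde", 5));   // false (fast path)
        System.out.println(needsTruncation("abcdef", 5));  // true
        // 5 two-byte characters: 10 bytes but only 5 chars, so no truncation.
        System.out.println(needsTruncation("ééééé", 5));   // false
    }
}
```

The fast path fires for all-ASCII data, which is the common case the patch optimizes.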





[jira] [Updated] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable

2016-08-12 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-14533:

Attachment: HIVE-14533.patch



[jira] [Commented] (HIVE-13422) Analyse command not working for column having datatype as decimal(38,0)

2016-07-25 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392426#comment-15392426
 ] 

Thomas Friedrich commented on HIVE-13422:
-

The failed tests are not related to the fix.

> Analyse command not working for column having datatype as decimal(38,0)
> ---
>
> Key: HIVE-13422
> URL: https://issues.apache.org/jira/browse/HIVE-13422
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Statistics
>Affects Versions: 1.1.0
>Reporter: ashim sinha
>Assignee: Thomas Friedrich
> Attachments: HIVE-13422.patch
>
>
> For the repro
> {code}
> drop table sample_test;
> CREATE TABLE IF NOT EXISTS sample_test( key decimal(38,0),b int ) ROW FORMAT 
> DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
> load data local inpath '/home/hive/analyse.txt' into table sample_test;
> ANALYZE TABLE sample_test COMPUTE STATISTICS FOR COLUMNS;
> {code}
> Sample data
> {code}
> 2023456789456749825082498304 0
> 5032080754887849825069508304 0
> 4012080754887849825068718304 0
> 2012080754887849825066778304 0
> 4012080754887849625065678304 0
> {code}





[jira] [Updated] (HIVE-13422) Analyse command not working for column having datatype as decimal(38,0)

2016-07-22 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-13422:

Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-13422) Analyse command not working for column having datatype as decimal(38,0)

2016-07-22 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-13422:

Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-13422) Analyse command not working for column having datatype as decimal(38,0)

2016-07-22 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-13422:

Status: Patch Available  (was: Open)



[jira] [Commented] (HIVE-13422) Analyse command not working for column having datatype as decimal(38,0)

2016-07-22 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390323#comment-15390323
 ] 

Thomas Friedrich commented on HIVE-13422:
-

The problem is that GenericUDAFDecimalStatsEvaluator uses the default 
HiveDecimalObjectInspector, which is initialized with the default precision and 
scale of (38,18). While ANALYZE works for some decimal columns, it fails for 
decimal columns where the difference between precision and scale is larger than 
20 (38-18), for example decimal(21,0).
I attached a patch that initializes the HiveDecimalObjectInspector with the 
actual precision and scale of the decimal column, and updated the 
compute_stats_decimal.q test case to cover this scenario.
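The failure mode can be sketched with java.math.BigDecimal as a stand-in for HiveDecimal; the enforce helper below is hypothetical (roughly what HiveDecimal's precision/scale enforcement does, returning null when the value does not fit), not Hive's actual API:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalPrecisionDemo {
    // Returns null when the value cannot be represented within the given
    // precision/scale, mimicking HiveDecimal's enforcement behavior.
    static BigDecimal enforce(BigDecimal d, int precision, int scale) {
        BigDecimal adjusted = d.setScale(scale, RoundingMode.HALF_UP);
        return adjusted.precision() > precision ? null : adjusted;
    }

    public static void main(String[] args) {
        // 28-digit integer from the repro data.
        BigDecimal v = new BigDecimal("2023456789456749825082498304");
        // Default inspector decimal(38,18) leaves only 38-18 = 20 integer
        // digits, so a 28-digit value is rejected.
        System.out.println(enforce(v, 38, 18)); // null
        // The column's actual type decimal(38,0) holds all 28 digits.
        System.out.println(enforce(v, 38, 0));
    }
}
```

This is why the evaluator must be initialized with the column's actual precision and scale rather than the defaults.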



[jira] [Updated] (HIVE-13422) Analyse command not working for column having datatype as decimal(38,0)

2016-07-22 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-13422:

Attachment: HIVE-13422.patch



[jira] [Commented] (HIVE-13422) Analyse command not working for column having datatype as decimal(38,0)

2016-07-21 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388097#comment-15388097
 ] 

Thomas Friedrich commented on HIVE-13422:
-

I ran into the same issue and have a patch ready. Will upload shortly.



[jira] [Assigned] (HIVE-13422) Analyse command not working for column having datatype as decimal(38,0)

2016-07-21 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich reassigned HIVE-13422:
---

Assignee: Thomas Friedrich



[jira] [Updated] (HIVE-14210) SSLFactory truststore reloader threads leaking in HiveServer2

2016-07-11 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-14210:

Attachment: HIVE-14210.1.patch

> SSLFactory truststore reloader threads leaking in HiveServer2
> -
>
> Key: HIVE-14210
> URL: https://issues.apache.org/jira/browse/HIVE-14210
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 1.2.1, 2.0.0, 2.1.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
> Attachments: HIVE-14210.1.patch, HIVE-14210.patch
>
>
> We found an issue in a customer environment where the HS2 crashed after a few 
> days and the Java core dump contained several thousands of truststore 
> reloader threads:
> "Truststore reloader thread" #126 daemon prio=5 os_prio=0 
> tid=0x7f680d2e3000 nid=0x98fd waiting on 
> condition [0x7f67e482c000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run
> (ReloadingX509TrustManager.java:225)
> at java.lang.Thread.run(Thread.java:745)
> We found the issue to be caused by a bug in Hadoop where the 
> TimelineClientImpl is not destroying the SSLFactory if SSL is enabled in 
> Hadoop and the timeline server is running. I opened YARN-5309 which has more 
> details on the problem, and a patch was submitted a few days back.
> In addition to the changes in Hadoop, there are a couple of Hive changes 
> required:
> - ExecDriver needs to call jobclient.close() to trigger the clean-up of the 
> resources after the submitted job is done/failed
> - Hive needs to pick up a newer release of Hadoop to pick up MAPREDUCE-6618 
> and MAPREDUCE-6621 that fixed issues with calling jobclient.close(). Both 
> fixes are included in Hadoop 2.6.4. 
> However, since we also need to pick up YARN-5309, we need to wait for a new 
> release of Hadoop.
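The ExecDriver change described above amounts to always closing the JobClient once the submitted job finishes. A standalone sketch of the pattern, where MiniJobClient is a hypothetical stand-in for Hadoop's JobClient (whose close() is what ultimately tears down the SSLFactory's truststore reloader thread):

```java
public class JobClientCleanup {
    // Hypothetical stand-in for org.apache.hadoop.mapred.JobClient; the real
    // class implements Closeable, and close() releases the underlying
    // cluster/SSL resources.
    static class MiniJobClient implements AutoCloseable {
        boolean closed = false;
        void submitJob() { /* submit and wait for completion */ }
        @Override public void close() { closed = true; }
    }

    static MiniJobClient runJob() {
        MiniJobClient jc = new MiniJobClient();
        try {
            jc.submitJob();
        } finally {
            jc.close();  // the fix: release resources whether the job
                         // succeeded or failed
        }
        return jc;
    }

    public static void main(String[] args) {
        System.out.println(runJob().closed); // true
    }
}
```

Without the close() in a finally block, each submitted job leaks one reloader thread, which matches the thread buildup seen in the core dump.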





[jira] [Comment Edited] (HIVE-14210) SSLFactory truststore reloader threads leaking in HiveServer2

2016-07-11 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371944#comment-15371944
 ] 

Thomas Friedrich edited comment on HIVE-14210 at 7/12/16 12:12 AM:
---

Provided patch for ExecDriver.java to call jobclient.close()


was (Author: tfriedr):
Patch for ExecDriver.java



[jira] [Assigned] (HIVE-14210) SSLFactory truststore reloader threads leaking in HiveServer2

2016-07-11 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich reassigned HIVE-14210:
---

Assignee: Thomas Friedrich



[jira] [Updated] (HIVE-14210) SSLFactory truststore reloader threads leaking in HiveServer2

2016-07-11 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-14210:

Attachment: HIVE-14210.patch

Patch for ExecDriver.java



[jira] [Commented] (HIVE-12729) In Hive 1.2 - current_date() comparison results in Error Unsupported conversion from type: interval_day_time

2016-01-22 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113472#comment-15113472
 ] 

Thomas Friedrich commented on HIVE-12729:
-

You could use the Hive datediff function:
SELECT DISTINCT customerid FROM Customer_date WHERE 
datediff(Customer_date.my_date_mmd, CURRENT_DATE()) >= 7;

Note that this subtracts the current date from the stored date, which yields a 
negative number unless the dates are in the future, so the check >= 7 will 
never return any rows. You probably want to swap the arguments:
datediff(CURRENT_DATE(), Customer_date.my_date_mmd).

> In Hive 1.2 - current_date() comparison results in Error Unsupported 
> conversion from type: interval_day_time 
> -
>
> Key: HIVE-12729
> URL: https://issues.apache.org/jira/browse/HIVE-12729
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Rahul
>
> I am using current_date() in my query where clause along with table column of 
> type date for comparison using artihmatic operator "-" (minus) and "<" / ">" 
> operators - for example:
> SELECT DISTINCT customerid FROM Customer_date WHERE 
>((Customer_date.my_date_mmd - CURRENT_DATE()) >= 7) 
> It results in error as:
> 
> ERROR : Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1449057948397_0330_1_00, diagnostics=[Task failed, 
> taskId=task_1449057948397_0330_1_00_00, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Failure while running task:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row 
> {"customerid":1,"my_date_mmd":"1982-01-01","my_date_ddmmyy":"1982-01-01","my_date_ddmm":"1982-01-01","my_date_md":"1982-01-01","my_date_mdhh":"1982-01-01
>  00:00:00","my_date_mdhh24":"1982-01-01 00:00:00"}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"customerid":1,"my_date_mmd":"1982-01-01","my_date_ddmmyy":"1982-01-01","my_date_ddmm":"1982-01-01","my_date_md":"1982-01-01","my_date_mdhh":"1982-01-01
>  00:00:00","my_date_mdhh24":"1982-01-01 00:00:00"}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:310)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row 
> {"customerid":1,"my_date_mmd":"1982-01-01","my_date_ddmmyy":"1982-01-01","my_date_ddmm":"1982-01-01","my_date_md":"1982-01-01","my_date_mdhh":"1982-01-01
>  00:00:00","my_date_mdhh24":"1982-01-01 00:00:00"}
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:545)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
>   ... 17 more
> Caused by: java.lang.RuntimeException: Hive 2 Internal error: unsupported 
> conversion from type: 

[jira] [Commented] (HIVE-11312) ORC format: where clause with CHAR data type not returning any rows

2015-12-02 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037347#comment-15037347
 ] 

Thomas Friedrich commented on HIVE-11312:
-

Thanks, [~prasanth_j] for getting this fix in!

> ORC format: where clause with CHAR data type not returning any rows
> ---
>
> Key: HIVE-11312
> URL: https://issues.apache.org/jira/browse/HIVE-11312
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0, 1.3.0, 1.2.1, 2.0.0
>Reporter: Thomas Friedrich
>Assignee: Prasanth Jayachandran
>Priority: Blocker
>  Labels: orc
> Fix For: 1.3.0, 2.0.0, 2.1.0
>
> Attachments: HIVE-11312-branch-1.patch, HIVE-11312.1.patch, 
> HIVE-11312.2.patch, HIVE-11312.3.patch, HIVE-11312.4.patch, HIVE-11312.5.patch
>
>
> Test case:
> Setup: 
> create table orc_test( col1 string, col2 char(10)) stored as orc 
> tblproperties ("orc.compress"="NONE");
> insert into orc_test values ('val1', '1');
> Query:
> select * from orc_test where col2='1'; 
> Query returns no row.
> Problem is introduced with HIVE-10286, class RecordReaderImpl.java, method 
> evaluatePredicateRange.
> Old code:
> - Object baseObj = predicate.getLiteral(PredicateLeaf.FileFormat.ORC);
> - Object minValue = getConvertedStatsObj(min, baseObj);
> - Object maxValue = getConvertedStatsObj(max, baseObj);
> - Object predObj = getBaseObjectForComparison(baseObj, minValue);
> New code:
> + Object baseObj = predicate.getLiteral();
> + Object minValue = getBaseObjectForComparison(predicate.getType(), min);
> + Object maxValue = getBaseObjectForComparison(predicate.getType(), max);
> + Object predObj = getBaseObjectForComparison(predicate.getType(), baseObj);
> The values for min and max are of type String which contain as many 
> characters as the CHAR column indicated. For example if the type is CHAR(10), 
> and the row has value 1, the value of String min is "1 ";
> Before Hive 1.2, the method getConvertedStatsObj would call 
> StringUtils.stripEnd(statsObj.toString(), null); which would remove the 
> trailing spaces from min and max. Later in the compareToRange method, it was 
> able to compare "1" with "1".
> In Hive 1.2, with the use of the getBaseObjectForComparison method, it simply 
> returns obj.String if the data type is String, which means minValue and 
> maxValue are still "1 ".
> As a result, the compareToRange method will return a wrong value 
> ("1".compareTo("1 ") returns -9 instead of 0).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-23 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15022811#comment-15022811
 ] 

Thomas Friedrich commented on HIVE-10613:
-

Failed tests not related to change.

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-10613.patch
>
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.
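The fix amounts to propagating the source field's comment during conversion. A minimal sketch with simplified stand-in classes (illustration only, not the real FieldSchema/HCatFieldSchema API):

```java
public class SchemaConvert {
    // Simplified stand-ins for Hive's FieldSchema and HCatalog's
    // HCatFieldSchema; the real classes carry more state.
    static class FieldSchema {
        final String name, type, comment;
        FieldSchema(String name, String type, String comment) {
            this.name = name; this.type = type; this.comment = comment;
        }
    }
    static class HCatFieldSchema {
        final String name, typeString, comment;
        HCatFieldSchema(String name, String typeString, String comment) {
            this.name = name; this.typeString = typeString; this.comment = comment;
        }
    }

    static HCatFieldSchema getHCatFieldSchema(FieldSchema fs) {
        // The reported bug: the converter effectively passed null for the
        // comment. The fix is to carry fs.comment through unchanged.
        return new HCatFieldSchema(fs.name, fs.type, fs.comment);
    }

    public static void main(String[] args) {
        FieldSchema fs = new FieldSchema("id", "int", "customer id");
        System.out.println(getHCatFieldSchema(fs).comment);
    }
}
```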





[jira] [Commented] (HIVE-12489) Analyze for partition fails if partition value has special characters

2015-11-23 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15022825#comment-15022825
 ] 

Thomas Friedrich commented on HIVE-12489:
-

[~ashutoshc], can you help to commit the change? Thank you.

> Analyze for partition fails if partition value has special characters
> -
>
> Key: HIVE-12489
> URL: https://issues.apache.org/jira/browse/HIVE-12489
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-12489.patch
>
>
> When analyzing a partition that has special characters in the value, the 
> analyze command fails with an exception. 
> Example:
> hive> create table testtable (a int) partitioned by (b string);
> hive> insert into table testtable  partition (b="p\"1") values (1);
> hive> ANALYZE TABLE testtable  PARTITION(b="p\"1") COMPUTE STATISTICS for 
> columns a;





[jira] [Commented] (HIVE-12489) Analyze for partition fails if partition value has special characters

2015-11-23 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15022809#comment-15022809
 ] 

Thomas Friedrich commented on HIVE-12489:
-

Failed tests not related to change.

> Analyze for partition fails if partition value has special characters
> -
>
> Key: HIVE-12489
> URL: https://issues.apache.org/jira/browse/HIVE-12489
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-12489.patch
>
>
> When analyzing a partition that has special characters in the value, the 
> analyze command fails with an exception. 
> Example:
> hive> create table testtable (a int) partitioned by (b string);
> hive> insert into table testtable  partition (b="p\"1") values (1);
> hive> ANALYZE TABLE testtable  PARTITION(b="p\"1") COMPUTE STATISTICS for 
> columns a;





[jira] [Updated] (HIVE-12489) Analyze for partition fails if partition value has special characters

2015-11-20 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-12489:

Attachment: HIVE-12489.patch

> Analyze for partition fails if partition value has special characters
> -
>
> Key: HIVE-12489
> URL: https://issues.apache.org/jira/browse/HIVE-12489
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-12489.patch
>
>
> When analyzing a partition that has special characters in the value, the 
> analyze command fails with an exception. 
> Example:
> hive> create table testtable (a int) partitioned by (b string);
> hive> insert into table testtable  partition (b="p\"1") values (1);
> hive> ANALYZE TABLE testtable  PARTITION(b="p\"1") COMPUTE STATISTICS for 
> columns a;





[jira] [Updated] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-20 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10613:

Attachment: (was: HIVE-10613.patch)

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.





[jira] [Updated] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-20 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10613:

Attachment: HIVE-10613.patch

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-10613.patch
>
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.





[jira] [Updated] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-18 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10613:

Attachment: HIVE-10613.patch

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-10613.patch
>
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.





[jira] [Commented] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-18 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012543#comment-15012543
 ] 

Thomas Friedrich commented on HIVE-10613:
-

Updated patch with test cases.

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-10613.patch
>
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.





[jira] [Updated] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-18 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10613:

Attachment: (was: HIVE-10613.patch)

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.





[jira] [Updated] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-03 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10613:

Attachment: (was: HIVE-10613.1.patch)

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.





[jira] [Updated] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-03 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10613:

Attachment: HIVE-10613.patch

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-10613.patch
>
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.





[jira] [Updated] (HIVE-11287) Hive Metastore does not tolerate leading spaces in JDBC url

2015-09-23 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-11287:

Assignee: Chen Xin Yu  (was: Thomas Friedrich)

> Hive Metastore does not tolerate leading spaces in JDBC url
> ---
>
> Key: HIVE-11287
> URL: https://issues.apache.org/jira/browse/HIVE-11287
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, Metastore
>Affects Versions: 1.2.0, 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Chen Xin Yu
>Priority: Minor
>  Labels: newbie
>
> The hive metastore is configured with
> {code}
> <property>
>   <name>javax.jdo.option.ConnectionURL</name>
>   <value>
>     jdbc:mysql://hostname/hive
>   </value>
> </property>
> {code}
> The initialization fails with an error 
> {code}
> java.sql.SQLException: No suitable driver found for
> jdbc:mysql://hostname/hive
> at java.sql.DriverManager.getConnection(DriverManager.java:689)
> at java.sql.DriverManager.getConnection(DriverManager.java:208)
> at 
> com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
> at com.jolbox.bonecp.BoneCP.obtainInternalConnection(BoneCP.java:269)
> at 
> com.jolbox.bonecp.ConnectionHandle.<init>(ConnectionHandle.java:242)
> at 
> com.jolbox.bonecp.PoolWatchThread.fillConnections(PoolWatchThread.java:115)
> at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:85)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
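The failure comes down to whitespace surviving in the configured property value: DriverManager matches drivers against the raw string, so a URL with a leading newline or spaces matches nothing. A hedged sketch of the defensive fix, trimming when the property is read (not Hive's actual metastore code; `cleanConnectionUrl` is a hypothetical helper name):

```java
public class TrimJdbcUrl {
    // Trim the configured value before handing it to DriverManager;
    // leading/trailing whitespace from the configuration file otherwise
    // makes driver lookup fail with "No suitable driver found".
    static String cleanConnectionUrl(String configured) {
        return configured == null ? null : configured.trim();
    }

    public static void main(String[] args) {
        String raw = "\n   jdbc:mysql://hostname/hive\n  ";
        // Brackets make any remaining whitespace visible.
        System.out.println("[" + cleanConnectionUrl(raw) + "]");
    }
}
```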





[jira] [Updated] (HIVE-11106) HiveServer2 JDBC (greater than v0.13.1) cannot connect to non-default database

2015-09-23 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-11106:

Assignee: Chen Xin Yu

> HiveServer2 JDBC (greater than v0.13.1) cannot connect to non-default database
> --
>
> Key: HIVE-11106
> URL: https://issues.apache.org/jira/browse/HIVE-11106
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 0.14.0
>Reporter: Tom Coleman
>Assignee: Chen Xin Yu
>
> Using HiveServer2 0.14.0 or greater, I cannot connect to a non-default database.
> For example, when connecting to HiveServer2 via the following URL, the 
> session uses the 'default' database instead of the intended database.
> jdbc://localhost:1/customDb
> This exact issue was fixed in 0.13.1 of HiveServer from 
> https://issues.apache.org/jira/browse/HIVE-5904 but for some reason this fix 
> was not ported to v0.14.0 or greater. From looking at the source, it looks as 
> if this fix was overridden by another change to the HiveConnection class; was 
> this intentional, or was the defect reintroduced by another fix?
> This means that we need to use 0.13.1 in order to connect to a non-default 
> database via JDBC and we cannot upgrade Hive versions. We don't want placing 
> a JDBC interceptor to inject "use customDb" each time a connection is 
> borrowed from the pool on production code. One should be able to connect 
> straight to the non-default database via the JDBC URL.
> Now it perhaps could be a simple oversight on my behalf in which the syntax 
> to connect to a non-default database has changed from 0.14.0 onwards, but I'd 
> be grateful if this could be confirmed.





[jira] [Assigned] (HIVE-11287) Hive Metastore does not tolerate leading spaces in JDBC url

2015-09-23 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich reassigned HIVE-11287:
---

Assignee: Thomas Friedrich

> Hive Metastore does not tolerate leading spaces in JDBC url
> ---
>
> Key: HIVE-11287
> URL: https://issues.apache.org/jira/browse/HIVE-11287
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, Metastore
>Affects Versions: 1.2.0, 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Thomas Friedrich
>Priority: Minor
>  Labels: newbie
>
> The hive metastore is configured with
> {code}
> <property>
>   <name>javax.jdo.option.ConnectionURL</name>
>   <value>
>     jdbc:mysql://hostname/hive
>   </value>
> </property>
> {code}
> The initialization fails with an error 
> {code}
> java.sql.SQLException: No suitable driver found for
> jdbc:mysql://hostname/hive
> at java.sql.DriverManager.getConnection(DriverManager.java:689)
> at java.sql.DriverManager.getConnection(DriverManager.java:208)
> at 
> com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
> at com.jolbox.bonecp.BoneCP.obtainInternalConnection(BoneCP.java:269)
> at 
> com.jolbox.bonecp.ConnectionHandle.<init>(ConnectionHandle.java:242)
> at 
> com.jolbox.bonecp.PoolWatchThread.fillConnections(PoolWatchThread.java:115)
> at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:85)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Updated] (HIVE-8346) MapRedLocalTask Error Handling

2015-09-23 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-8346:
---
Assignee: Chen Xin Yu

> MapRedLocalTask Error Handling
> --
>
> Key: HIVE-8346
> URL: https://issues.apache.org/jira/browse/HIVE-8346
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.14.0
>Reporter: Szehon Ho
>Assignee: Chen Xin Yu
>Priority: Minor
>  Labels: newbie
>
> If there are any exceptions trying to fork a local task, the exception 
> message is logged but not the stack trace.  There can be a lot of issues 
> forking a process, so we should log the stack trace for better debuggability.
> Code in MapRedLocalTask.executeInChildJVM(DriverContext ctx):
> {code}
> } catch (Exception e) {
>   e.printStackTrace();
>   LOG.error("Exception: " + e.getMessage());
>   return (1);
> }
> {code}
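The point of the report is that `LOG.error("Exception: " + e.getMessage())` keeps only the message while the frames are lost (printStackTrace goes to stderr, not the log). Passing the throwable itself to the logger preserves the trace. A small runnable sketch using java.util.logging as a stand-in for Hive's commons-logging LOG:

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LocalTaskLogging {
    private static final Logger LOG = Logger.getLogger(LocalTaskLogging.class.getName());

    public static void main(String[] args) {
        try {
            throw new RuntimeException("failed to fork local task");
        } catch (Exception e) {
            // Message-only logging (the reported problem): no frames survive.
            LOG.severe("Exception: " + e.getMessage());
            // Passing the throwable logs the full stack trace as well.
            LOG.log(Level.SEVERE, "Exception while running local task", e);

            // Render the trace the second call preserves, to show what
            // the message-only variant was dropping.
            StringWriter sw = new StringWriter();
            e.printStackTrace(new PrintWriter(sw));
            System.out.println(sw.toString().split("\n")[0]);
        }
    }
}
```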





[jira] [Updated] (HIVE-11481) hive incorrectly set extended ACLs for unnamed group for new databases/tables with inheritPerms enabled

2015-09-10 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-11481:

Assignee: Carita Ou

> hive incorrectly set extended ACLs for unnamed group for new databases/tables 
> with inheritPerms enabled
> ---
>
> Key: HIVE-11481
> URL: https://issues.apache.org/jira/browse/HIVE-11481
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>Reporter: Carita Ou
>Assignee: Carita Ou
>Priority: Minor
>
> $ hadoop fs -chmod 700 /user/hive/warehouse
> $ hadoop fs -setfacl -m user:user1:rwx /user/hive/warehouse
> $ hadoop fs -setfacl -m default:user::rwx /user/hive/warehouse
> $ hadoop fs -ls /user/hive
> Found 1 items
> drwxrwx---+  - hive hadoop  0 2015-08-05 10:29 /user/hive/warehouse
> $ hadoop fs -getfacl /user/hive/warehouse
> # file: /user/hive/warehouse
> # owner: hive
> # group: hadoop
> user::rwx
> user:user1:rwx
> group::---
> mask::rwx
> other::---
> default:user::rwx
> default:group::---
> default:other::---
> In hive cli> create database testing;
> $ hadoop fs -ls /user/hive/warehouse
> Found 1 items
> drwxrwx---+  - hive hadoop  0 2015-08-05 10:44 
> /user/hive/warehouse/testing.db
> $hadoop fs -getfacl /user/hive/warehouse/testing.db
> # file: /user/hive/warehouse/testing.db
> # owner: hive
> # group: hadoop
> user::rwx
> user:user1:rwx
> group::rwx
> mask::rwx
> other::---
> default:user::rwx
> default:group::---
> default:other::---
> Since the warehouse directory has default group permission set to ---, the 
> group permissions for testing.db should also be ---
> The warehouse directory permissions show drwxrwx---+ which corresponds to 
> user:mask:other. The subdirectory group ACL is set by calling 
> FsPermission.getGroupAction() from Hadoop, which retrieves the file status 
> permission rwx instead of the actual ACL permission, which is ---. 





[jira] [Commented] (HIVE-11312) ORC format: where clause with CHAR data type not returning any rows

2015-08-03 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652203#comment-14652203
 ] 

Thomas Friedrich commented on HIVE-11312:
-

Thanks for looking at this, [~prasanth_j]. Feel free to assign the JIRA to 
yourself.

> ORC format: where clause with CHAR data type not returning any rows
> ---
>
> Key: HIVE-11312
> URL: https://issues.apache.org/jira/browse/HIVE-11312
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>  Labels: orc
> Attachments: HIVE-11312.1.patch, HIVE-11312.2.patch
>
>
> Test case:
> Setup: 
> create table orc_test( col1 string, col2 char(10)) stored as orc 
> tblproperties ("orc.compress"="NONE");
> insert into orc_test values ('val1', '1');
> Query:
> select * from orc_test where col2='1'; 
> Query returns no row.
> Problem is introduced with HIVE-10286, class RecordReaderImpl.java, method 
> evaluatePredicateRange.
> Old code:
> - Object baseObj = predicate.getLiteral(PredicateLeaf.FileFormat.ORC);
> - Object minValue = getConvertedStatsObj(min, baseObj);
> - Object maxValue = getConvertedStatsObj(max, baseObj);
> - Object predObj = getBaseObjectForComparison(baseObj, minValue);
> New code:
> + Object baseObj = predicate.getLiteral();
> + Object minValue = getBaseObjectForComparison(predicate.getType(), min);
> + Object maxValue = getBaseObjectForComparison(predicate.getType(), max);
> + Object predObj = getBaseObjectForComparison(predicate.getType(), baseObj);
> The values for min and max are of type String which contain as many 
> characters as the CHAR column indicated. For example if the type is CHAR(10), 
> and the row has value 1, the value of String min is "1 ";
> Before Hive 1.2, the method getConvertedStatsObj would call 
> StringUtils.stripEnd(statsObj.toString(), null); which would remove the 
> trailing spaces from min and max. Later in the compareToRange method, it was 
> able to compare "1" with "1".
> In Hive 1.2, with the use of the getBaseObjectForComparison method, it simply 
> returns obj.String if the data type is String, which means minValue and 
> maxValue are still "1 ".
> As a result, the compareToRange method will return a wrong value 
> ("1".compareTo("1 ") returns -9 instead of 0).





[jira] [Commented] (HIVE-11113) ANALYZE TABLE .. COMPUTE STATISTICS FOR COLUMNS does not work.

2015-07-21 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634646#comment-14634646
 ] 

Thomas Friedrich commented on HIVE-11113:
-

[~libing], [~pxiong], the error message "Column [ds] was not found in schema!" 
is a different problem than the one originally reported in this JIRA; it is 
specific to partitioned Parquet tables and not limited to analyze. I opened 
HIVE-11326 for the Parquet problem and added steps to reproduce it over there.

> ANALYZE TABLE .. COMPUTE STATISTICS FOR COLUMNS does not work. 
> ---
>
> Key: HIVE-11113
> URL: https://issues.apache.org/jira/browse/HIVE-11113
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1, 1.2.1
> Environment: 
>Reporter: Shiroy Pigarez
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> I was trying to perform some column statistics using hive as per the 
> documentation 
> https://cwiki.apache.org/confluence/display/Hive/Column+Statistics+in+Hive 
> and was encountering the following errors:
> Seems like a bug. Can you look into this? Thanks in advance.
> -- HIVE table
> {noformat}
> hive> create table people_part(
> name string,
> address string) PARTITIONED BY (dob string, nationality varchar(2))
> row format delimited fields terminated by '\t';
> {noformat}
> --Analyze table with partition dob and nationality with FOR COLUMNS
> {noformat}
> hive> ANALYZE TABLE people_part PARTITION(dob='2015-10-2',nationality) 
> COMPUTE STATISTICS FOR COLUMNS;
> NoViableAltException(-1@[])
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:11627)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:40215)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.columnName(HiveParser.java:33351)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.columnNameList(HiveParser.java:33219)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.analyzeStatement(HiveParser.java:17764)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2369)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1398)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1036)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:404)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:275)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:227)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:430)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:803)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:697)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:636)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> FAILED: ParseException line 1:95 cannot recognize input near 'EOF' 'EOF' 
> 'EOF' in column name
> {noformat}
> --Analyze table with partition dob and nationality values specified with FOR 
> COLUMNS
> {noformat}
> hive> ANALYZE TABLE people_part PARTITION(dob='2015-10-2',nationality='IE') 
> COMPUTE STATISTICS FOR COLUMNS;
> NoViableAltException(-1@[])
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:11627)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:40215)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.columnName(HiveParser.java:33351)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.columnNameList(HiveParser.java:33219)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.analyzeStatement(HiveParser.java:17764)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2369)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1398)

[jira] [Commented] (HIVE-11312) ORC format: where clause with CHAR data type not returning any rows

2015-07-19 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632982#comment-14632982
 ] 

Thomas Friedrich commented on HIVE-11312:
-

[~prasanth_j] can you take a look at my proposed patch since it's related to 
HIVE-10286. Thank you.

 ORC format: where clause with CHAR data type not returning any rows
 ---

 Key: HIVE-11312
 URL: https://issues.apache.org/jira/browse/HIVE-11312
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.2.0, 1.2.1
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
  Labels: orc
 Attachments: HIVE-11312.1.patch


 Test case:
 Setup: 
 create table orc_test (col1 string, col2 char(10)) stored as orc 
 tblproperties ("orc.compress"="NONE");
 insert into orc_test values ('val1', '1');
 Query:
 select * from orc_test where col2='1'; 
 The query returns no rows.
 The problem was introduced with HIVE-10286 in class RecordReaderImpl.java, 
 method evaluatePredicateRange.
 Old code:
 - Object baseObj = predicate.getLiteral(PredicateLeaf.FileFormat.ORC);
 - Object minValue = getConvertedStatsObj(min, baseObj);
 - Object maxValue = getConvertedStatsObj(max, baseObj);
 - Object predObj = getBaseObjectForComparison(baseObj, minValue);
 New code:
 + Object baseObj = predicate.getLiteral();
 + Object minValue = getBaseObjectForComparison(predicate.getType(), min);
 + Object maxValue = getBaseObjectForComparison(predicate.getType(), max);
 + Object predObj = getBaseObjectForComparison(predicate.getType(), baseObj);
 The values for min and max are of type String and contain as many 
 characters as the CHAR column's length. For example, if the type is CHAR(10) 
 and the row has the value '1', the String min is "1         " (padded with 
 trailing spaces to 10 characters).
 Before Hive 1.2, the method getConvertedStatsObj would call 
 StringUtils.stripEnd(statsObj.toString(), null), which removed the 
 trailing spaces from min and max. Later, in the compareToRange method, it was 
 able to compare "1" with "1".
 In Hive 1.2, the getBaseObjectForComparison method simply returns 
 obj.toString() if the data type is string, which means minValue and maxValue 
 are still "1         ".
 As a result, the compareToRange method returns a wrong value 
 ("1".compareTo("1         ") returns -9 instead of 0).
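
To make the comparison failure concrete, here is a small standalone Java 
sketch (not Hive code; the helper names are ours) contrasting raw comparison 
against a space-padded CHAR stats value with the pre-1.2 strip-then-compare 
behavior:

```java
// Standalone sketch (not Hive code): why padded CHAR min/max stats break
// predicate evaluation once trailing spaces are no longer stripped.
public class CharPaddingDemo {

    // Hive 1.2 behavior: compare against the raw, space-padded stats value.
    static int compareRaw(String predicate, String statsValue) {
        return predicate.compareTo(statsValue);
    }

    // Pre-1.2 behavior: strip trailing spaces from the stats value first,
    // as StringUtils.stripEnd(statsObj.toString(), null) did.
    static int compareStripped(String predicate, String statsValue) {
        return predicate.compareTo(statsValue.replaceAll("\\s+$", ""));
    }

    public static void main(String[] args) {
        String min = "1         "; // CHAR(10) stats value, padded to 10 chars
        System.out.println(compareRaw("1", min));      // negative: looks out of range
        System.out.println(compareStripped("1", min)); // 0: row group is kept
    }
}
```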



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10616) TypeInfoUtils doesn't handle DECIMAL with just precision specified

2015-06-30 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608893#comment-14608893
 ] 

Thomas Friedrich commented on HIVE-10616:
-

The TypeInfoUtils class is documented as part of the public APIs 
https://hive.apache.org/javadocs/r1.2.1/api/index.html. I don't see where the 
class is marked as internal. 
The ParseUtils class, method getDecimalTypeTypeInfo handles the case correctly:
  int precision = HiveDecimal.USER_DEFAULT_PRECISION;
  int scale = HiveDecimal.USER_DEFAULT_SCALE;

  if (node.getChildCount() >= 1) {
String precStr = node.getChild(0).getText();
precision = Integer.valueOf(precStr);
  }

  if (node.getChildCount() == 2) {
String scaleStr = node.getChild(1).getText();
scale = Integer.valueOf(scaleStr);
  }

Why can't TypeInfoUtils have the same behavior instead of returning the 
wrong values?
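
As an illustration, a self-contained sketch (the class and method names are 
hypothetical, not Hive's API) of parsing decimal type strings with the same 
defaults ParseUtils uses, showing the one-argument case is cheap to support:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: parse "decimal", "decimal(p)", and "decimal(p,s)"
// with the ParseUtils defaults (precision 10, scale 0).
public class DecimalTypeParser {
    static final int DEFAULT_PRECISION = 10; // HiveDecimal.USER_DEFAULT_PRECISION
    static final int DEFAULT_SCALE = 0;      // HiveDecimal.USER_DEFAULT_SCALE

    private static final Pattern DECIMAL =
        Pattern.compile("decimal(?:\\((\\d+)(?:\\s*,\\s*(\\d+))?\\))?");

    // Returns {precision, scale} for a decimal type string.
    static int[] parse(String typeName) {
        Matcher m = DECIMAL.matcher(typeName.trim().toLowerCase());
        if (!m.matches()) {
            throw new IllegalArgumentException("not a decimal type: " + typeName);
        }
        int precision = m.group(1) != null
            ? Integer.parseInt(m.group(1)) : DEFAULT_PRECISION;
        int scale = m.group(2) != null
            ? Integer.parseInt(m.group(2)) : DEFAULT_SCALE;
        return new int[] { precision, scale };
    }
}
```

With this approach, "decimal(20)" yields precision 20 and scale 0 instead of 
falling back to decimal(10,0).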

 TypeInfoUtils doesn't handle DECIMAL with just precision specified
 --

 Key: HIVE-10616
 URL: https://issues.apache.org/jira/browse/HIVE-10616
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 1.0.0
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
Priority: Minor
 Attachments: HIVE-10616.1.patch


 The parseType method in TypeInfoUtils doesn't handle decimal types with just 
 precision specified although that's a valid type definition. 
 As a result, TypeInfoUtils.getTypeInfoFromTypeString will always return 
 decimal(10,0) for any decimal(precision) string. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10616) TypeInfoUtils doesn't handle DECIMAL with just precision specified

2015-06-30 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14607568#comment-14607568
 ] 

Thomas Friedrich commented on HIVE-10616:
-

The CLI has no issue, that's why I opened the defect against the 
Serializers/Deserializers component.
We are using the TypeInfoUtils class (package 
org.apache.hadoop.hive.serde2.typeinfo) in one of our own serdes to parse the 
datatype and call the static method getTypeInfoFromTypeString which should 
treat the decimal data type according to the spec.
However, if you pass in a string for the decimal type with just one argument, 
the precision is always 10 and the scale is 0 because the parseType method 
doesn't handle the one argument decimal type.

You can run a simple test like this:
TypeInfo t = TypeInfoUtils.getTypeInfoFromTypeString("decimal(20)");  
DecimalTypeInfo dti = (DecimalTypeInfo)t;
System.out.println(dti.getPrecision());  // prints 10
System.out.println(dti.getScale());   // prints 0




 TypeInfoUtils doesn't handle DECIMAL with just precision specified
 --

 Key: HIVE-10616
 URL: https://issues.apache.org/jira/browse/HIVE-10616
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 1.0.0
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
Priority: Minor
 Attachments: HIVE-10616.1.patch


 The parseType method in TypeInfoUtils doesn't handle decimal types with just 
 precision specified although that's a valid type definition. 
 As a result, TypeInfoUtils.getTypeInfoFromTypeString will always return 
 decimal(10,0) for any decimal(precision) string. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-3217) Implement HiveDatabaseMetaData.getFunctions() to retrieve registered UDFs.

2015-06-03 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich resolved HIVE-3217.

Resolution: Duplicate

The getFunctions method in HiveDatabaseMetaData was implemented for HS2 with 
HIVE-2935.

 Implement HiveDatabaseMetaData.getFunctions() to retrieve registered UDFs. 
 ---

 Key: HIVE-3217
 URL: https://issues.apache.org/jira/browse/HIVE-3217
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Richard Ding
 Attachments: HIVE-3217.patch


 Hive JDBC support currently throws UnsupportedException when getFunctions() 
 is called. The Hive CLI provides a SHOW FUNCTIONS command to return the names 
 of all registered UDFs. By getting a SQL Statement from the connection, 
 getFunctions can execute "SHOW FUNCTIONS" to retrieve all the registered 
 functions (including those registered through CREATE TEMPORARY FUNCTION).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-4402) Support UPDATE statement

2015-06-03 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich resolved HIVE-4402.

Resolution: Duplicate

Insert, update, and delete in Hive were implemented in Hive 0.14.0 with 
HIVE-5317.

 Support UPDATE statement 
 -

 Key: HIVE-4402
 URL: https://issues.apache.org/jira/browse/HIVE-4402
 Project: Hive
  Issue Type: New Feature
Reporter: Bing Li

 It would be good if Hive could support the UPDATE statement like a common 
 database, e.g. updating rows in the database (to edit rows and save them 
 back).
  
 cmd:
update DB2ADMIN.EMP set SALARY=? where EMPNO=? and DEPTNO=? and 
 SALARY=?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10616) TypeInfoUtils doesn't handle DECIMAL with just precision specified

2015-05-05 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10616:

Attachment: HIVE-10616.1.patch

 TypeInfoUtils doesn't handle DECIMAL with just precision specified
 --

 Key: HIVE-10616
 URL: https://issues.apache.org/jira/browse/HIVE-10616
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 1.0.0
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
Priority: Minor
 Attachments: HIVE-10616.1.patch


 The parseType method in TypeInfoUtils doesn't handle decimal types with just 
 precision specified although that's a valid type definition. 
 As a result, TypeInfoUtils.getTypeInfoFromTypeString will always return 
 decimal(10,0) for any decimal(precision) string. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-05-05 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10613:

Attachment: HIVE-10613.1.patch

 HCatSchemaUtils getHCatFieldSchema should include field comment
 ---

 Key: HIVE-10613
 URL: https://issues.apache.org/jira/browse/HIVE-10613
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 1.0.0
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
Priority: Minor
 Attachments: HIVE-10613.1.patch


 HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
 HCatFieldSchema. Instead of initializing the comment property from the 
 FieldSchema object, the comment in the HCatFieldSchema is always set to null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10567) partial scan for rcfile table doesn't work for dynamic partition

2015-04-30 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich reassigned HIVE-10567:
---

Assignee: Thomas Friedrich  (was: Chaoyu Tang)

 partial scan for rcfile table doesn't work for dynamic partition
 

 Key: HIVE-10567
 URL: https://issues.apache.org/jira/browse/HIVE-10567
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.14.0, 1.0.0
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
Priority: Minor
  Labels: rcfile
 Attachments: HIVE-10567.1.patch


 HIVE-3958 added support for partial scan for RCFile. This works fine for 
 static partitions (for example: analyze table analyze_srcpart_partial_scan 
 PARTITION(ds='2008-04-08',hr=11) compute statistics partialscan).
 For a dynamic partition, the analyze fails with an IOException 
 java.io.IOException: No input paths specified in job:
 hive> ANALYZE TABLE testtable PARTITION(col_varchar) COMPUTE STATISTICS 
 PARTIALSCAN;
 java.io.IOException: No input paths specified in job
 at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputPaths(HiveInputFormat.java:318)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:459)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10567) partial scan for rcfile table doesn't work for dynamic partition

2015-04-30 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522619#comment-14522619
 ] 

Thomas Friedrich commented on HIVE-10567:
-

Chaoyu, I attached a proposed patch. The problem is in method 
getInputPathsForPartialScan in class GenMapRedUtils.java. 
I added the case for DYNAMIC_PARTITION, but wasn't sure about the 
aggregationKey. In the current patch, the aggregationKey is just the table 
name, and the PartialScanMapper joins it with the task ID, which is 
different for each partition (one task per partition):
org.apache.hadoop.hive.ql.stats.fs.FSStatsPublisher: Writing stats in it : 
{default.testtable/00/={numRows=2, rawDataSize=16}}
org.apache.hadoop.hive.ql.stats.fs.FSStatsPublisher: Writing stats in it : 
{default.testtable/01/={numRows=1, rawDataSize=8}}
The output seems ok to me. 
Do you know whether the aggregationKey should be set to a different value, like 
in the STATIC_PARTITION case?

I would like to add a unit test for this case as well, that's why I didn't 
submit the patch yet.

 partial scan for rcfile table doesn't work for dynamic partition
 

 Key: HIVE-10567
 URL: https://issues.apache.org/jira/browse/HIVE-10567
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.14.0, 1.0.0
Reporter: Thomas Friedrich
Assignee: Chaoyu Tang
Priority: Minor
  Labels: rcfile
 Attachments: HIVE-10567.1.patch


 HIVE-3958 added support for partial scan for RCFile. This works fine for 
 static partitions (for example: analyze table analyze_srcpart_partial_scan 
 PARTITION(ds='2008-04-08',hr=11) compute statistics partialscan).
 For a dynamic partition, the analyze fails with an IOException 
 java.io.IOException: No input paths specified in job:
 hive> ANALYZE TABLE testtable PARTITION(col_varchar) COMPUTE STATISTICS 
 PARTIALSCAN;
 java.io.IOException: No input paths specified in job
 at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputPaths(HiveInputFormat.java:318)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:459)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10567) partial scan for rcfile table doesn't work for dynamic partition

2015-04-30 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10567:

Attachment: HIVE-10567.1.patch

 partial scan for rcfile table doesn't work for dynamic partition
 

 Key: HIVE-10567
 URL: https://issues.apache.org/jira/browse/HIVE-10567
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.14.0, 1.0.0
Reporter: Thomas Friedrich
Assignee: Chaoyu Tang
Priority: Minor
  Labels: rcfile
 Attachments: HIVE-10567.1.patch


 HIVE-3958 added support for partial scan for RCFile. This works fine for 
 static partitions (for example: analyze table analyze_srcpart_partial_scan 
 PARTITION(ds='2008-04-08',hr=11) compute statistics partialscan).
 For a dynamic partition, the analyze fails with an IOException 
 java.io.IOException: No input paths specified in job:
 hive> ANALYZE TABLE testtable PARTITION(col_varchar) COMPUTE STATISTICS 
 PARTIALSCAN;
 java.io.IOException: No input paths specified in job
 at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputPaths(HiveInputFormat.java:318)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:459)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)